Missing data refers to the absence of data entries in a dataset where values are expected but not recorded. They’re the blank cells in your data sheet. Missing values for specific variables or participants can occur for many reasons, including incomplete data entry, equipment failures, or lost files. When data are missing, it’s a problem. However, the issues go beyond merely reducing the sample size. In some cases, they can skew your results.

# conceptual

## Data Aggregation: Strengths & Weaknesses of Aggregated Data

## What is Data Aggregation?

Data aggregation is the process of collecting data and summarizing it in a concise form. This method transforms atomic data rows, sourced from diverse origins, into totals or other summary statistics. Aggregated data, typically housed in data warehouses, enhances analytical capabilities and significantly speeds up queries on large datasets.
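
As a minimal sketch of the idea, the snippet below collapses a handful of atomic rows into one summary row per group. The `sales` records, the `region` and `amount` fields, and the `aggregate_by_region` helper are all hypothetical names invented for illustration.

```python
from collections import defaultdict

# Hypothetical atomic rows: one record per individual sale.
sales = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 60.0},
    {"region": "South", "amount": 40.0},
    {"region": "North", "amount": 30.0},
]


def aggregate_by_region(rows):
    """Collapse atomic rows into one summary (count, total, mean) per region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["amount"]
        counts[row["region"]] += 1
    return {
        region: {
            "count": counts[region],
            "total": totals[region],
            "mean": totals[region] / counts[region],
        }
        for region in totals
    }


summary = aggregate_by_region(sales)
for region, stats in summary.items():
    print(region, stats)
```

Five detail rows become two summary rows. The summary is faster to query, but the individual sale amounts can no longer be recovered from it.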

## Prospect Theory Overview & Examples

## What is Prospect Theory?

Prospect theory states that individuals place greater weight on losses than on gains when making decisions. It is a descriptive model of how individuals make decisions involving risk and uncertainty, proposed by Daniel Kahneman and Amos Tversky in 1979. Prospect theory describes how people evaluate and choose between different options.

## Regression to the Mean: Definition & Examples

## What is Regression to the Mean?

Regression to the mean is the statistical tendency for an extreme sample or observed value to be followed by a more average one. It is also known as reverting to the mean, highlighting the propensity for a later observation to move closer to the mean after an extreme value. The concept applies only to random variation in a process or system and does not pertain to interventions or events that affect the outcome.
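
A small simulation can make this tendency concrete. The sketch below assumes each score is a stable ability plus independent random noise, both drawn from standard normal distributions. Those modeling choices, the sample size, and all variable names are illustrative assumptions rather than part of the original post.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

N = 10_000
# Assumption: each score = stable ability + independent random noise.
ability = [random.gauss(0, 1) for _ in range(N)]
first = [a + random.gauss(0, 1) for a in ability]
second = [a + random.gauss(0, 1) for a in ability]

# Select the people with the most extreme (top 10%) first scores.
cutoff = sorted(first)[int(0.9 * N)]
top = [i for i in range(N) if first[i] >= cutoff]

overall_mean = statistics.mean(first)
top_first_mean = statistics.mean(first[i] for i in top)
top_second_mean = statistics.mean(second[i] for i in top)

print(f"overall mean of first scores: {overall_mean:.2f}")
print(f"top group, first score mean:  {top_first_mean:.2f}")
print(f"top group, second score mean: {top_second_mean:.2f}")
```

No intervention occurs between the two measurements. Even so, the top group's second average falls back toward the overall mean, because part of their extreme first score was random noise that does not repeat.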

## Self Selection Bias Overview & Examples

## What is Self Selection Bias?

Self selection bias can occur when individuals choose to participate in a study, survey, or experiment. The bias exists when volunteers have different characteristics than those who do not participate. It is a form of sampling bias stemming from using a nonprobability sampling method, such as volunteer or convenience sampling.

## Attrition Bias: Definition & Examples

## What is Attrition Bias?

Attrition bias in research occurs when study participants who drop out have characteristics that differ significantly from those who remain. This selective dropout can lead to skewed results and misinterpretations if the researchers don’t adequately address it. This threat is higher for longitudinal studies and those with relatively high attrition rates.

## Conjunction Fallacy: Definition & Example

## What is the Conjunction Fallacy?

The conjunction fallacy is a cognitive bias that occurs when someone mistakenly believes that two events occurring together are more likely than either of the two events alone. In other words, it’s the mistaken belief that a precisely detailed, multifaceted outcome is more likely to occur than a more generalized version of that outcome.
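
The probability rule that this belief violates can be written compactly. The event names A and B are notation introduced here for illustration:

$$
P(A \cap B) \le \min\{P(A),\ P(B)\}
$$

Because every outcome in which both A and B occur is also an outcome in which A occurs, adding detail to a scenario can only keep its probability the same or lower it.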

## Residual Sum of Squares (RSS) Explained

The residual sum of squares (RSS) measures the total squared difference between your observed data and the model’s predictions. It is the portion of variability your regression model does not explain, also known as the model’s error. Use RSS to evaluate how well your model fits the data.
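
As a rough illustration, RSS is the sum of the squared residuals, where each residual is an observed value minus its predicted value. The data points and the fitted line in this sketch are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def residual_sum_of_squares(observed, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))


# Hypothetical data and a hypothetical fitted line: y_hat = 1 + 2x.
x = [1, 2, 3, 4]
y = [3.5, 4.5, 7.5, 8.5]
y_hat = [1 + 2 * xi for xi in x]

rss = residual_sum_of_squares(y, y_hat)
print(rss)
```

Each residual here is either 0.5 or -0.5. Squaring prevents positive and negative misses from canceling out, and a smaller RSS indicates predictions that sit closer to the observations.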

## Covariance vs Correlation: Understanding the Differences

Covariance and correlation both evaluate the linear relationship between two continuous variables. While this description makes them sound similar, there are stark differences in how to interpret them.

Although these statistics are closely related, they are distinct concepts. How are they different?

In this post, learn about the differences between covariance and correlation and what you can learn from each.
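
One difference in interpretation is easy to demonstrate: covariance depends on the units of measurement, while correlation does not. The following sketch uses made-up height and weight values and hand-written helper functions (`sample_covariance` and `pearson_correlation` are illustrative names) to show this.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance using the n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(sample_covariance(xs, xs))
    sd_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (sd_x * sd_y)


# Hypothetical data: the same heights expressed in meters and centimeters.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52.0, 58.0, 69.0, 75.0, 88.0]
heights_cm = [h * 100 for h in heights_m]

cov_m = sample_covariance(heights_m, weights_kg)
cov_cm = sample_covariance(heights_cm, weights_kg)
corr_m = pearson_correlation(heights_m, weights_kg)
corr_cm = pearson_correlation(heights_cm, weights_kg)

print(f"covariance  (m): {cov_m:.3f}   (cm): {cov_cm:.3f}")
print(f"correlation (m): {corr_m:.4f}  (cm): {corr_cm:.4f}")
```

Switching from meters to centimeters multiplies the covariance by 100 but leaves the correlation unchanged, which is why correlation is easier to compare across datasets.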

## Risk Calculations: Relative vs Absolute & Risk Reduction

What’s the risk? People discuss risk frequently, but it’s not always clearly understood. It is your exposure to danger or adverse outcomes. Statistically, we define risk as the probability of a negative outcome occurring, and there are several ways to calculate it.
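
To illustrate how the common risk measures relate to each other, here is a short sketch using hypothetical counts from an imagined two-group study. The group sizes, event counts, and variable names are assumptions made up for this example.

```python
def risk(events, total):
    """Absolute risk: the proportion of a group that has the negative outcome."""
    return events / total


# Hypothetical study: 20 of 100 controls and 10 of 100 treated have the outcome.
control_risk = risk(20, 100)
treated_risk = risk(10, 100)

relative_risk = treated_risk / control_risk
absolute_risk_reduction = control_risk - treated_risk
relative_risk_reduction = absolute_risk_reduction / control_risk

print(f"control risk:            {control_risk:.2f}")
print(f"treated risk:            {treated_risk:.2f}")
print(f"relative risk:           {relative_risk:.2f}")
print(f"absolute risk reduction: {absolute_risk_reduction:.2f}")
print(f"relative risk reduction: {relative_risk_reduction:.2f}")
```

In this made-up case, the treatment halves the risk in relative terms while lowering it by 10 percentage points in absolute terms. The two framings describe the same data but can leave very different impressions.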

## Omitted Variable Bias: Definition, Avoiding & Example

## What is Omitted Variable Bias?

Omitted variable bias (OVB) occurs when a regression model excludes a relevant variable. The absence of these critical variables can skew the estimated relationships between variables in the model, potentially leading to erroneous interpretations. This bias can exaggerate, mask, or entirely flip the direction of the estimated relationship between an independent and dependent variable.
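
A simulation can show the bias in action. In the sketch below, the data-generating process is entirely assumed: the true effect of `x` on `y` is 1.0, and an omitted variable `z` both affects `y` and is correlated with `x`. Fitting `y` on `x` alone then exaggerates the estimated effect.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

n = 20_000
# Assumed data-generating process (illustrative only):
#   z is the variable we will omit; x is correlated with z;
#   y depends on both, with a true coefficient of 1.0 on x.
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [1.0 * xi + 2.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]


def simple_slope(xs, ys):
    """Least-squares slope from regressing ys on xs alone."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx


biased_slope = simple_slope(x, y)
print(f"true coefficient on x: 1.0, estimate with z omitted: {biased_slope:.2f}")
```

Under these assumptions, the slope lands near 2 rather than 1 because `x` absorbs part of the effect of the omitted `z`.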

## Sample Mean vs Population Mean: Symbol & Formulas

In statistics, the symbols and formulas for basic concepts such as the mean provide a foundational understanding of data analysis. Understanding the mean involves more than just knowing how to calculate an average; it’s about recognizing the nuances that differentiate a population mean from a sample mean. This distinction is crucial in statistical analysis, as the approach and symbol used for each vary (mu vs. x bar).
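
For reference, the two formulas are shown side by side below in conventional notation. The symbols N (population size) and n (sample size) are assumed here, as the excerpt names only mu and x bar:

$$
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
$$

The arithmetic is identical. The difference is that μ is a fixed parameter computed over every member of the population, while x̄ is an estimate computed from a sample and varies from one sample to the next.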

## Type 2 Error Overview & Example

## What is a Type 2 Error?

A type 2 error (AKA Type II error) occurs when you fail to reject a false null hypothesis in a hypothesis test. In other words, a statistically non-significant test result indicates that a population effect does not exist when it actually does. A type 2 error is a false negative because the effect exists in the population, but the test doesn’t detect it in the sample.

## Type 1 Error Overview & Example

## What is a Type 1 Error?

A type 1 error (AKA Type I error) occurs when you reject a true null hypothesis in a hypothesis test. In other words, a statistically significant test result indicates that a population effect exists when it does not. A type 1 error is a false positive because the test detects an effect in the sample that doesn’t exist in the population.
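
One way to see false positives occur is to run many tests on data where the null hypothesis is true by construction. The sketch below assumes a two-sided one-sample z-test with a known standard deviation of 1 and a significance level of 0.05. The test choice, sample size, and number of repetitions are all illustrative assumptions.

```python
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the sketch is repeatable

ALPHA = 0.05        # assumed significance level
N_TESTS = 4000      # number of simulated studies
SAMPLE_SIZE = 30    # observations per study

# Two-sided critical value for a z-test (about 1.96 when ALPHA is 0.05).
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)

false_positives = 0
for _ in range(N_TESTS):
    # The null hypothesis is true: the population mean really is 0.
    sample = [random.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
    z = mean(sample) / (1 / SAMPLE_SIZE ** 0.5)  # known sigma = 1
    if abs(z) > critical_z:
        false_positives += 1  # rejected a true null: a type 1 error

type1_rate = false_positives / N_TESTS
print(f"share of true nulls rejected: {type1_rate:.3f}")
```

In this setup, the share of rejected true nulls should land close to the chosen significance level, because the significance level is the type 1 error rate you accept when the null is true.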

## Correlation vs Causation: Understanding the Differences

Correlation vs causation in statistics is a critical distinction. And you’ve undoubtedly heard that correlation doesn’t imply causation. Why is that the case, what are the differences between them, and why do they matter? Those are the topics of this post!

## Observational Study vs Experiment with Examples

## Comparing Observational Studies vs Experiments

Observational studies and experiments are two standard research methods for understanding the world. Both research designs collect data and use statistical analysis to understand relationships between variables. Beyond that commonality, they are vastly different and have dissimilar sets of pros and cons.

## Goodness of Fit: Definition & Tests

## What is Goodness of Fit?

Goodness of fit evaluates how well observed data align with the expected values from a statistical model.
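
One common way to quantify that alignment is the chi-square statistic, which sums the squared gaps between observed and expected counts, each scaled by the expected count. The excerpt does not name a specific test, so treat this as one illustrative option. The die-roll counts are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts from 60 rolls of a die; a fair die expects 10 per face.
observed_counts = [8, 12, 9, 11, 10, 10]
expected_counts = [10] * 6

chi_square = chi_square_statistic(observed_counts, expected_counts)
print(chi_square)
```

A value near zero means the observed counts sit close to what the model expects. Larger values indicate a poorer fit and would be compared against a chi-square distribution to obtain a p-value.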

## Expected Value: Definition, Formula & Finding

## What is the Expected Value?

The expected value in statistics is the long-run average outcome of a random variable based on its possible outcomes and their respective probabilities. Essentially, if an experiment (like a game of chance) were repeated, the expected value tells us the average result we’d see in the long run. Statisticians denote it as E(X), where E is “expected value,” and X is the random variable.
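
For a discrete random variable, E(X) is each outcome multiplied by its probability, summed over all outcomes. The sketch below applies that to a fair six-sided die. The `expected_value` helper and the dictionary layout are illustrative choices, and `Fraction` is used so the arithmetic stays exact.

```python
from fractions import Fraction


def expected_value(distribution):
    """E(X) for a discrete distribution given as {outcome: probability}."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(outcome * p for outcome, p in distribution.items())


# A fair six-sided die: each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

ev = expected_value(fair_die)
print(float(ev))  # 3.5
```

Note that 3.5 is not a face you can actually roll. The expected value describes the long-run average across many rolls, not the most likely single outcome.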

## What is a Parsimonious Model? Benefits and Selecting

## What is a Parsimonious Model?

A parsimonious model in statistics is one that uses relatively few independent variables to obtain a good fit to the data.

## Placebo Effect Overview: Definition & Examples

## What is the Placebo Effect?

The placebo effect occurs when a fake medical treatment produces real medical benefits psychosomatically. In short, believing in the treatment and the power of the mind can help someone feel better. The placebo effect can be so powerful that it mimics genuine medicine. Consequently, scientists need to control for it when conducting clinical trials.