Hypothesis Testing

Type 2 Error Overview & Example

What is a Type 2 Error?

A type 2 error (AKA Type II error) occurs when you fail to reject a false null hypothesis in a hypothesis test. In other words, a statistically non-significant test result indicates that a population effect does not exist when it actually does. A type 2 error is a false negative because the effect exists in the population, but the test doesn’t detect it in the sample.

Type 1 Error Overview & Example

By Jim Frost Leave a Comment

What is a Type 1 Error?

A type 1 error (AKA Type I error) occurs when you reject a true null hypothesis in a hypothesis test. In other words, a statistically significant test result indicates that a population effect exists when it does not. A type 1 error is a false positive because the test detects an effect in the sample that doesn’t exist in the population.
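To see a false positive rate in action, here is a minimal simulation sketch. It assumes a two-sided 1-sample Z test with a known standard deviation of 1 and a significance level of 0.05. The function name, sample size, and number of simulations are illustrative choices, not values from the post. Because the null hypothesis is true in every simulated study, each rejection is a type 1 error.

```python
import random
from statistics import NormalDist, mean


def false_positive_rate(n_sims=2000, n=30, alpha=0.05, seed=1):
    """Estimate how often a two-sided 1-sample Z test rejects a true null."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        # The null is true by construction: the population mean really is 0.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / n ** 0.5)  # assumes known sigma = 1
        if abs(z) > z_crit:
            rejections += 1  # a significant result here is a false positive
    return rejections / n_sims


rate = false_positive_rate()
print(f"Share of false positives: {rate:.3f}")
```

Under these assumptions, the share of false positives should land close to the significance level you chose.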

One Sample T Test: Definition, Using & Example

By Jim Frost Leave a Comment

What is a One Sample T Test?

Use a one sample t test to evaluate a population mean using a single sample. Usually, you conduct this hypothesis test to determine whether a population mean differs from a hypothesized value you specify. The hypothesized value can be theoretically important in the study area, a reference value, or a target.
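As a rough illustration, the sketch below computes the t-value for a one sample t test using the standard formula: the sample mean minus the hypothesized value, divided by the standard error of the mean. The data and the hypothesized value of 2 are made up for the example, and the helper name is hypothetical. Converting the t-value to a p-value requires the t-distribution, which this sketch leaves out.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, hypothesized_mean):
    """Return the t-value and degrees of freedom for a one sample t test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # sample SD uses n - 1
    t_value = (mean(sample) - hypothesized_mean) / standard_error
    return t_value, n - 1


measurements = [1, 2, 3, 4, 5]  # hypothetical data
t_stat, df = one_sample_t(measurements, hypothesized_mean=2)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```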

T Test Overview: How to Use & Examples

By Jim Frost 12 Comments

What is a T Test?

A t test is a statistical hypothesis test that assesses sample means to draw conclusions about population means. Frequently, analysts use a t test to determine whether the population means for two groups are different. For example, it can determine whether the difference between the treatment and control group means is statistically significant.

Wilcoxon Signed Rank Test Explained

By Jim Frost Leave a Comment

What is the Wilcoxon Signed Rank Test?

The Wilcoxon signed rank test is a nonparametric hypothesis test that can do the following:

Evaluate the median difference between two paired samples.
Compare a 1-sample median to a reference value.

What is P Hacking: Methods & Best Practices

By Jim Frost 2 Comments

P-Hacking Definition

P hacking is a set of statistical decisions and methodology choices during research that artificially produces statistically significant results. These decisions increase the probability of false positives, where the study indicates an effect exists when it actually does not. P-hacking is also known as data dredging, data fishing, and data snooping.

Kruskal Wallis Test Explained

By Jim Frost Leave a Comment

What is the Kruskal Wallis Test?

The Kruskal Wallis test is a nonparametric hypothesis test that compares three or more independent groups. Statisticians also refer to it as one-way ANOVA on ranks. This analysis extends the Mann Whitney U nonparametric test that can compare only two groups.

What is the Bonferroni Correction and How to Use It

By Jim Frost 8 Comments

What is the Bonferroni Correction?

The Bonferroni correction adjusts your significance level to control the overall probability of a Type I error (false positive) for multiple hypothesis tests.
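Here is a small sketch of the adjustment. The standard Bonferroni rule divides the overall significance level by the number of tests and compares each p-value to that stricter threshold. The p-values below are hypothetical, and the function name is just for illustration.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return family_alpha / n_tests


p_values = [0.003, 0.020, 0.049, 0.300]  # hypothetical results from 4 tests
adjusted_alpha = bonferroni_alpha(0.05, len(p_values))

# Each test is significant only if its p-value clears the stricter threshold.
significant = [p <= adjusted_alpha for p in p_values]
print(adjusted_alpha, significant)
```

Notice that two of the hypothetical p-values fall below 0.05 but not below the adjusted level, so they no longer count as significant.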

Mann Whitney U Test Explained

By Jim Frost 8 Comments

What is the Mann Whitney U Test?

The Mann Whitney U test is a nonparametric hypothesis test that compares two independent groups. Statisticians also refer to it as the Wilcoxon rank sum test. The Kruskal Wallis test extends this analysis so that it can compare more than two groups.

Fishers Exact Test: Using & Interpreting

By Jim Frost 9 Comments

Fisher’s exact test determines whether a statistically significant association exists between two categorical variables.

For example, does a relationship exist between gender (Male/Female) and voting Yes or No on a referendum?

Z Test: Uses, Formula & Examples

By Jim Frost Leave a Comment

What is a Z Test?

Use a Z test when you need to compare group means. Use the 1-sample analysis to determine whether a population mean is different from a hypothesized value. Or use the 2-sample version to determine whether two population means differ.

Statistical Significance: Definition & Meaning

By Jim Frost 5 Comments

What is Statistical Significance?

The Greek symbol alpha (α) represents the significance level.

Statistical significance is the goal for most researchers analyzing data. But what does statistically significant mean? Why and when is it important to consider? How do P values fit in with statistical significance? I’ll answer all these questions in this blog post!

Evaluate statistical significance when using a sample to estimate an effect in a population. It helps you determine whether your findings result from chance or from an actual effect of a variable of interest.

Statistical Inference: Definition, Methods & Example

By Jim Frost 1 Comment

What is Statistical Inference?

Statistical inference is the process of using a sample to infer the properties of a population. Statistical procedures use sample data to estimate the characteristics of the whole population from which the sample was drawn.

Scientists typically want to learn about a population. When studying a phenomenon, such as the effects of a new medication or public opinion, understanding the results at a population level is much more valuable than understanding only the comparatively few participants in a study.

Unfortunately, populations are usually too large to measure fully. Consequently, researchers must use a manageable subset of that population to learn about it.

By using procedures that can make statistical inferences, you can estimate the properties and processes of a population. More specifically, sample statistics can estimate population parameters. Learn more about the differences between sample statistics and population parameters.

For example, imagine that you are studying a new medication. As a scientist, you’d like to understand the medicine’s effect in the entire population rather than just a small sample. After all, knowing the effect on a handful of people isn’t very helpful for the larger society!

Consequently, you are interested in making a statistical inference about the medicine’s effect in the population.

Read on to see how to do that! I’ll show you the general process for making a statistical inference and then cover an example using real data.

How to Make Statistical Inferences

In its simplest form, the process of making a statistical inference requires you to do the following:

Draw a sample that adequately represents the population.
Measure your variables of interest.
Use appropriate statistical methodology to generalize your sample results to the population while accounting for sampling error.
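The three steps above can be sketched in a few lines of Python. Everything here is hypothetical: the simulated population, the sample size of 200, and the choice of a 95% confidence interval based on the normal approximation as the "appropriate statistical methodology." In a real study you never see the whole population, so the last line is possible only because this is a simulation.

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(42)

# A hypothetical population that we normally could not measure in full.
population = [rng.gauss(50, 10) for _ in range(100_000)]

# Steps 1 and 2: draw a simple random sample and measure the variable.
sample = rng.sample(population, 200)

# Step 3: generalize to the population while accounting for sampling error.
estimate = mean(sample)
standard_error = stdev(sample) / len(sample) ** 0.5
z_crit = NormalDist().inv_cdf(0.975)  # 95% confidence, normal approximation
ci = (estimate - z_crit * standard_error, estimate + z_crit * standard_error)

# Only knowable in a simulation: how far the sample estimate is from the truth.
sampling_error = estimate - mean(population)
print(f"Estimate: {estimate:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```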

Of course, that’s the simple version. In real-world experiments, you might need to form treatment and control groups, administer treatments, and reduce other sources of variation. In more complex cases, you might need to create a model of a process. There are many details in the process of making a statistical inference! Learn how to incorporate statistical inference into scientific studies.

Statistical inference requires using specialized sampling methods that tend to produce representative samples. If the sample does not look like the larger population you’re studying, you can’t trust any inferences from the sample. Consequently, using an appropriate method to obtain your sample is crucial. The best sampling methods tend to produce samples that look like the target population. Learn more about Sampling Methods and Representative Samples.

After obtaining a representative sample, you’ll need to use a procedure that can make statistical inferences. While you might have a sample that looks similar to the population, it will never be identical to it. Statisticians refer to the differences between a sample and the population as sampling error. Any effect or relationship you see in your sample might actually be sampling error rather than a true finding. Inferential statistics incorporate sampling error into the results. Learn more about Sampling Error.

Common Inferential Methods

The following are four standard procedures that can make statistical inferences.

Hypothesis Testing: Uses representative samples to assess two mutually exclusive hypotheses about a population. Statistically significant results suggest that the sample effect or relationship exists in the population after accounting for sampling error.
Confidence Intervals: A range of values likely containing the population value. This procedure evaluates the sampling error and adds a margin around the estimate, giving an idea of how wrong it might be.
Margin of Error: Comparable to a confidence interval but usually for survey results.
Regression Modeling: An estimate of the process that generates the outcomes in the population.

Example Statistical Inference

Let’s look at a real flu vaccine study for an example of making a statistical inference. The scientists for this study want to evaluate whether a flu vaccine effectively reduces flu cases in the general population. However, the general population is much too large to include in their study, so they must use a representative sample to make a statistical inference about the vaccine’s effectiveness.

The Monto et al. study* evaluates the 2007-2008 flu season and follows its participants from January to April. Participants are 18-49 years old. They selected ~1100 participants and randomly assigned them to the vaccine and placebo groups. After tracking them for the flu season, they record the number of flu infections in each group, as shown below.

Treatment	Flu count	Group size	Percent infections
Placebo	35	325	10.8%
Vaccine	28	813	3.4%
Effect			7.4%

Monto Study Findings

From the table above, 10.8% of the unvaccinated got the flu, while only 3.4% of the vaccinated caught it. The apparent effect of the vaccine is 10.8% – 3.4% = 7.4 percentage points. While that seems to show a vaccine effect, it might be a fluke due to sampling error. We’re assessing only about 1,100 people out of a population of millions. We need to use a hypothesis test and confidence interval (CI) to make a proper statistical inference.

While the details go beyond this introductory post, here are two statistical inferences we can make using a 2-sample proportions test and CI.

The p-value of the test is < 0.0005. The evidence strongly favors the hypothesis that the vaccine effectively reduces flu infections in the population after accounting for sampling error.
Additionally, the confidence interval for the effect size is 3.7% to 10.9%. Our study found a sample effect of 7.4%, but it is unlikely to equal the population effect exactly due to sampling error. The CI identifies a range that is likely to include the population effect.
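If you’d like to check these two inferences yourself, here is a minimal sketch using the counts from the table. The post doesn’t say which variant of the 2-sample proportions test produced its results, so this sketch assumes the common normal approximation: a pooled standard error for the test and an unpooled standard error for a 95% CI. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_inference(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation test and CI for a difference in proportions."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Test: under the null the proportions are equal, so pool the groups.
    pooled = (x1 + x2) / (n1 + n2)
    se_test = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_test
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

    # CI: use each group's own proportion (unpooled standard error).
    se_ci = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff, p_value, (diff - z_crit * se_ci, diff + z_crit * se_ci)


# Placebo: 35 infections out of 325. Vaccine: 28 infections out of 813.
diff, p_value, (low, high) = two_proportion_inference(35, 325, 28, 813)
print(f"Effect: {diff:.1%}, p-value: {p_value:.7f}, CI: {low:.1%} to {high:.1%}")
```

Under these assumptions, the p-value falls well below 0.0005 and the interval comes out at roughly 3.7% to 10.9%, which agrees with the results above. Other methods, such as an exact test, could give slightly different numbers.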

For more information about this and other flu vaccine studies, read my post about Flu Vaccine Effectiveness.

In conclusion, by using a representative sample and the proper methodology, we made a statistical inference about vaccine effectiveness in an entire population.

Reference

Monto AS, Ohmit SE, Petrie JG, Johnson E, Truscon R, Teich E, Rotthoff J, Boulton M, Victor JC. Comparative efficacy of inactivated and live attenuated influenza vaccines. N Engl J Med. 2009;361(13):1260-7.

How to Find the P value: Process and Calculations

By Jim Frost 4 Comments

P values are everywhere in statistics. They’re in all types of hypothesis tests. But how do you calculate a p-value? Unsurprisingly, the precise calculations depend on the test. However, there is a general process that applies to finding a p value.

In this post, you’ll learn how to find the p value. I’ll start by showing you the general process for all hypothesis tests. Then I’ll move on to a step-by-step example showing the calculations for a p value. This post includes a calculator so you can apply what you learn.

What is Power in Statistics?

By Jim Frost 1 Comment

Power in statistics is the probability that a hypothesis test can detect an effect in a sample when it exists in the population. It is the sensitivity of a hypothesis test. When an effect exists in the population, how likely is the test to detect it in your sample?
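For a simple case, you can calculate power directly. The sketch below assumes a two-sided 1-sample Z test with a known standard deviation. The effect size, standard deviation, and sample size are hypothetical values chosen for illustration. Power is the probability that the test statistic lands in either rejection region when the true mean differs from the null value by the stated effect.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided 1-sample Z test for a given true effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma  # true effect in standard-error units
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


power = z_test_power(effect=0.5, sigma=1.0, n=32)  # hypothetical study
type_2_error_rate = 1 - power
print(f"Power: {power:.2f}, type 2 error rate: {type_2_error_rate:.2f}")
```

When the effect is zero, this function returns the significance level, because every rejection is then a false positive.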

Chi-Square Goodness of Fit Test: Uses & Examples

By Jim Frost 6 Comments

What is the Chi Square Goodness of Fit Test?

The chi-square goodness of fit test evaluates whether proportions of categorical or discrete outcomes in a sample follow a population distribution with hypothesized proportions. In other words, when you draw a random sample, do the observed proportions follow the values that theory suggests?

Sampling Error: Definition, Sources & Minimizing

By Jim Frost 7 Comments

What is Sampling Error?

Sampling error is the difference between a sample statistic and the population parameter it estimates. It is a crucial consideration in inferential statistics where you use a sample to estimate the properties of an entire population.

Inter-Rater Reliability: Definition, Examples & Assessing

By Jim Frost Leave a Comment

What is Inter-Rater Reliability?

Inter-rater reliability measures the agreement between subjective ratings by multiple raters, inspectors, judges, or appraisers. It answers the question, is the rating system consistent? High inter-rater reliability indicates that multiple raters’ ratings for the same item are consistent. Conversely, low reliability means they are inconsistent.

Margin of Error: Formula and Interpreting

By Jim Frost 2 Comments

What is the Margin of Error?

The margin of error (MOE) for a survey tells you how near you can expect the survey results to be to the correct population value. For example, a survey indicates that 72% of respondents favor Brand A over Brand B with a 3% margin of error. In this case, the actual population percentage that prefers Brand A likely falls within the range of 72% ± 3%, or 69 – 75%.
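Here is a short sketch of where a number like that can come from. It uses the usual normal-approximation formula for a proportion at 95% confidence. The excerpt doesn’t give the survey’s sample size, so the 861 respondents below are a hypothetical figure chosen because it produces a margin of about 3% for a 72% result.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(proportion, n, confidence=0.95):
    """Normal-approximation margin of error for a survey proportion."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_crit * sqrt(proportion * (1 - proportion) / n)


moe = margin_of_error(0.72, 861)  # 861 respondents is an assumed sample size
low, high = 0.72 - 0.03, 0.72 + 0.03  # the 72% ± 3% range from the example
print(f"MOE: {moe:.3f}, range: {low:.0%} to {high:.0%}")
```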

Null Hypothesis: Definition, Rejecting & Examples

By Jim Frost 6 Comments

What is a Null Hypothesis?

The null hypothesis in statistics states that there is no difference between groups or no relationship between variables. It is one of two mutually exclusive hypotheses about a population in a hypothesis test.