
Statistics By Jim

Making statistics intuitive

Blog

Negative Binomial Distribution: Uses, Calculator & Formula

By Jim Frost

What is a Negative Binomial Distribution?

The negative binomial distribution describes the number of trials required to generate an event a particular number of times. When you provide an event probability and the number of successes (r), this distribution calculates the likelihood of observing the rth success on the nth attempt. Statisticians also refer to this discrete probability distribution as the Pascal distribution.
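
The probability mass function has a simple closed form: the rth success lands on trial n when the first n − 1 trials contain exactly r − 1 successes and trial n is a success. Here's a short Python sketch (the function name is mine, not from the post):

```python
from math import comb

def neg_binomial_pmf(n, r, p):
    """Probability that the rth success occurs exactly on the nth trial.

    The nth trial must be a success, and the preceding n - 1 trials
    must contain exactly r - 1 successes.
    """
    if n < r:
        return 0.0
    return comb(n - 1, r - 1) * p**r * (1 - p)**(n - r)

# Chance that the 3rd success arrives exactly on the 5th trial when p = 0.5:
print(neg_binomial_pmf(5, 3, 0.5))  # 0.1875
```

Summing the PMF over all n ≥ r gives 1, which is a handy sanity check.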

Filed Under: Probability Tagged With: conceptual, distributions, graphs

Benford’s Law Explained with Examples

By Jim Frost

What is Benford’s Law?

Benford’s law describes the relative frequency distribution of the leading digits of numbers in datasets. Smaller leading digits occur more frequently than larger ones: approximately 30% of numbers start with a 1, while less than 5% start with a 9. According to this law, leading 1s appear about 6.5 times as often as leading 9s! Benford’s law is also known as the First Digit Law.
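
The law has a simple closed form, P(d) = log10(1 + 1/d), which makes those percentages easy to verify in a few lines of Python:

```python
from math import log10

def benford_prob(d):
    """Probability that a number's leading digit is d under Benford's law."""
    return log10(1 + 1/d)

# Probabilities for every possible leading digit, 1 through 9.
probs = {d: benford_prob(d) for d in range(1, 10)}
print(round(probs[1], 3))            # 0.301 -> about 30% start with a 1
print(round(probs[9], 3))            # 0.046 -> under 5% start with a 9
print(round(probs[1] / probs[9], 1)) # leading 1s vs leading 9s ratio
```

The nine probabilities sum to 1, since every number has some leading digit.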

Filed Under: Probability Tagged With: distributions, Excel, graphs

Quota Sampling: Definition & Examples

By Jim Frost

What is Quota Sampling?

Quota sampling is a non-random selection of subjects from population subgroups that the researchers define. Researchers use quota sampling when random sampling isn’t feasible, and they want more control over who they select compared to other non-probability methods, such as convenience sampling.
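
As a sketch (the data and function name are illustrative, not from the post), quota sampling can be mimicked by filling a fixed quota per subgroup on a first-come, first-served basis, which is exactly what makes it non-random:

```python
def quota_sample(records, quotas, key):
    """Fill a fixed quota per subgroup, taking subjects in arrival order."""
    counts = {group: 0 for group in quotas}
    sample = []
    for rec in records:
        group = key(rec)
        if group in counts and counts[group] < quotas[group]:
            sample.append(rec)
            counts[group] += 1
    return sample

# Hypothetical subject pool with an uneven sex ratio (20 F, 10 M).
people = [{"name": f"p{i}", "sex": "F" if i % 3 else "M"} for i in range(30)]

# Quota of 5 per subgroup yields a balanced, but non-random, sample.
sample = quota_sample(people, {"F": 5, "M": 5}, key=lambda r: r["sex"])
```

Note that the first eligible subjects always get in, so the sample can over-represent whoever is easiest to reach.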

Filed Under: Basics Tagged With: conceptual, experimental design, sampling methods

Qualitative vs Quantitative Data Differences

By Jim Frost

The distinction between qualitative and quantitative data is fundamental to the types of information you can gather and analyze statistically. These two types of variables seem diametrically opposed, but effective research projects use them together.

In this post, I’ll explain the difference between qualitative and quantitative data and show effective ways to graph and analyze them for your research.

Filed Under: Basics Tagged With: conceptual, data types

Hazard Ratio: Interpretation & Definition

By Jim Frost

What are Hazard Ratios?

A hazard ratio (HR) is the probability of an event in a treatment group relative to the control group probability over a unit of time. This ratio is an effect size measure for time-to-event data. Use hazard ratios to estimate the treatment effect in clinical trials when you want to assess time-to-event.

For example, HRs can determine whether a medical treatment reduces the duration of symptoms or prolongs survival in cancer patients.
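
Hazard ratios are usually estimated with survival models such as Cox regression. As a simplified, hypothetical illustration that assumes constant event rates, you can compare events per unit of person-time in each group:

```python
def hazard_rate(events, person_years):
    """Events per person-year, assuming a constant hazard over follow-up."""
    return events / person_years

# Hypothetical trial: events and total follow-up time in each group.
treatment_rate = hazard_rate(10, 500)   # 10 events over 500 person-years
control_rate = hazard_rate(25, 480)     # 25 events over 480 person-years

hr = treatment_rate / control_rate
print(round(hr, 2))  # HR < 1 suggests the treatment reduces the hazard
```

An HR of about 0.38 here would mean the treated group experiences events at roughly 38% of the control group's rate per unit of time.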

Filed Under: Probability Tagged With: conceptual

Nominal Data: Definition & Examples

By Jim Frost

What is Nominal Data?

Nominal data are divided into mutually exclusive categories that do not have a natural order, nor do they provide any quantitative information. The definition of nominal in statistics is “in name only.” This definition indicates how these data consist of category names—all you can do is name the group to which each observation belongs. Nominal and categorical data are synonyms, and I’ll use them interchangeably.

For example, literary genre is a nominal variable that can have the following categories: science fiction, drama, and comedy.

Filed Under: Basics Tagged With: conceptual, data types

Linear Regression Equation Explained

By Jim Frost

A linear regression equation describes the relationship between the independent variables (IVs) and the dependent variable (DV). It can also predict new values of the DV for the IV values you specify.
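
For a single IV, the least squares slope and intercept have closed forms. Here's a sketch in pure Python with hypothetical data (real analyses would use a statistics package):

```python
def fit_line(xs, ys):
    """Ordinary least squares for one IV: returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data roughly following y = 2x + 1.
b0, b1 = fit_line([1, 2, 3, 4], [3.1, 4.9, 7.2, 8.8])

# The fitted equation predicts new DV values for IV values you specify.
predict = lambda x: b0 + b1 * x
```

The fitted equation here is DV = b0 + b1 × IV, which is exactly the form the post discusses.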

Filed Under: Regression Tagged With: analysis example, interpreting results

Relative Risk: Definition, Formula & Interpretation

By Jim Frost

What is Relative Risk?

Relative risk is the probability of an adverse outcome in an exposed group divided by its probability in an unexposed group. This statistic indicates whether exposure corresponds to an increase, decrease, or no change in the probability of the adverse outcome. Use relative risk to measure the strength of the association between exposure and the outcome. Analysts also refer to this statistic as the risk ratio.
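
The calculation itself is a ratio of two proportions. A sketch with hypothetical counts:

```python
def relative_risk(exposed_events, exposed_n, unexposed_events, unexposed_n):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_n
    risk_unexposed = unexposed_events / unexposed_n
    return risk_exposed / risk_unexposed

# Hypothetical: 30 of 200 exposed vs 10 of 200 unexposed develop the outcome.
rr = relative_risk(30, 200, 10, 200)
print(round(rr, 1))  # ≈ 3.0 -> the outcome is 3 times as likely when exposed
```

RR = 1 means no association; RR > 1 means exposure raises the risk; RR < 1 means it lowers it.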

Filed Under: Probability Tagged With: analysis example, interpreting results

Factor Analysis Guide with an Example

By Jim Frost

What is Factor Analysis?

Factor analysis uses the correlation structure amongst observed variables to model a smaller number of unobserved, latent variables known as factors. Researchers use this statistical method when subject-area knowledge suggests that latent factors cause observable variables to covary. Use factor analysis to identify the hidden variables.

Filed Under: Basics Tagged With: analysis example, conceptual, interpreting results, multivariate

Ordinal Data: Definition, Examples & Analysis

By Jim Frost

What is Ordinal Data?

Ordinal data have at least three categories that have a natural rank order. The categories are ranked, but the differences between ranks may not be equal. These data indicate the order of values but not the degree of difference between them. For example, first, second, and third places in a race are ordinal data. You can clearly understand the order of finishes. However, the time difference between first and second place might not be the same as between second and third place.

Filed Under: Basics Tagged With: data types

What is K Means Clustering? With an Example

By Jim Frost

What is K Means Clustering?

The K means clustering algorithm divides a set of n observations into k clusters. Use K means clustering when you don’t have existing group labels and want to assign similar data points to the number of groups you specify (K).
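
As an illustrative sketch (not a production implementation), the standard K means procedure, Lloyd's algorithm, alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points:

```python
import random

def k_means(points, k, iters=100, seed=0):
    """Basic Lloyd's algorithm for k-means clustering (illustrative sketch)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # start from k random points
    for _ in range(iters):
        # Assignment step: each point joins its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:           # converged
            break
        centers = new_centers
    return centers, clusters

# Two well-separated hypothetical blobs; k = 2 should recover them.
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, clusters = k_means(pts, 2)
```

Real K means runs are usually repeated from several random starts, since the algorithm can settle into a poor local optimum.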

Filed Under: Basics Tagged With: analysis example, interpreting results

Probability Density Function: Definition & Uses

By Jim Frost

What is a Probability Density Function (PDF)?

A probability density function (PDF) describes a probability distribution for a continuous random variable. Use a PDF to find the chance that the variable’s value falls within a range you specify. More specifically, a PDF is a function whose integral over an interval gives the probability of a value occurring in that interval. For example, what are the chances that the next IQ score you measure will fall between 120 and 140?
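
For the IQ example, assuming IQ follows a Normal(100, 15) distribution, integrating the PDF over 120 to 140 amounts to differencing the CDF at the two endpoints, which Python's math.erf makes easy:

```python
from math import erf, sqrt

def normal_cdf(x, mu=100, sigma=15):
    """Cumulative probability of a Normal(mu, sigma) at x, via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# P(120 < IQ < 140): the PDF's integral over the interval equals the
# difference of the CDF at the endpoints.
p = normal_cdf(140) - normal_cdf(120)
print(round(p, 3))  # ≈ 0.087, i.e. roughly a 9% chance
```

The specific mean of 100 and standard deviation of 15 are the conventional IQ scale parameters, assumed here for illustration.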

Filed Under: Probability Tagged With: conceptual, distributions, graphs

Experimental Design: Definition and Types

By Jim Frost

What is Experimental Design?

An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

Filed Under: Basics Tagged With: experimental design

Cronbach’s Alpha: Definition, Calculations & Example

By Jim Frost

What is Cronbach’s Alpha?

Cronbach’s alpha coefficient measures the internal consistency, or reliability, of a set of survey items. Use this statistic to help determine whether a collection of items consistently measures the same characteristic. Cronbach’s alpha quantifies the level of agreement on a standardized 0 to 1 scale. Higher values indicate higher agreement between items.
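
The coefficient is straightforward to compute from the item variances and the variance of the respondents' total scores: alpha = k/(k − 1) × (1 − Σ item variances / total-score variance). A sketch with made-up survey data:

```python
def cronbach_alpha(items):
    """items: one list of scores per survey item, respondents in the same order."""
    k = len(items)

    def var(xs):  # population variance; sample variance works if used throughout
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Each respondent's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical 3-item survey answered by 4 respondents (1-5 scale).
item1 = [4, 5, 3, 5]
item2 = [4, 4, 3, 5]
item3 = [5, 5, 3, 4]
alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 3))  # these items mostly agree, so alpha is fairly high
```

When items move together, the total-score variance dominates the summed item variances, pushing alpha toward 1.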

Filed Under: Basics Tagged With: analysis example, conceptual, interpreting results

Cohen’s d: Definition, Using & Examples

By Jim Frost

What is Cohen’s d?

Cohen’s d is a standardized effect size for measuring the difference between two group means. Frequently, you’ll use it when you’re comparing a treatment to a control group. It can be a suitable effect size to include with t-test and ANOVA results. The field of psychology frequently uses Cohen’s d.
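
Cohen’s d is the difference between the two group means divided by the pooled standard deviation. A minimal sketch with hypothetical group data:

```python
from math import sqrt

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)  # sample variances
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical treatment vs control scores.
treatment = [23, 25, 28, 30, 27]
control = [20, 22, 24, 21, 23]
d = cohens_d(treatment, control)
print(round(d, 2))  # here the means differ by about two pooled SDs
```

Rough conventional benchmarks treat d around 0.2 as small, 0.5 as medium, and 0.8 as large.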

Filed Under: Basics Tagged With: conceptual

Statistical Inference: Definition, Methods & Example

By Jim Frost

What is Statistical Inference?

Statistical inference is the process of using a sample to infer the properties of a population. Statistical procedures use sample data to estimate the characteristics of the whole population from which the sample was drawn.

Scientists typically want to learn about a population. When studying a phenomenon, such as the effects of a new medication or public opinion, understanding the results at a population level is much more valuable than understanding only the comparatively few participants in a study.

Unfortunately, populations are usually too large to measure fully. Consequently, researchers must use a manageable subset of that population to learn about it.

By using procedures that can make statistical inferences, you can estimate the properties and processes of a population. More specifically, sample statistics can estimate population parameters. Learn more about the differences between sample statistics and population parameters.

For example, imagine that you are studying a new medication. As a scientist, you’d like to understand the medicine’s effect in the entire population rather than just a small sample. After all, knowing the effect on a handful of people isn’t very helpful for the larger society!

Consequently, you are interested in making a statistical inference about the medicine’s effect in the population.

Read on to see how to do that! I’ll show you the general process for making a statistical inference and then cover an example using real data.

Related post: Descriptive vs. Inferential Statistics

How to Make Statistical Inferences

In its simplest form, the process of making a statistical inference requires you to do the following:

  1. Draw a sample that adequately represents the population.
  2. Measure your variables of interest.
  3. Use appropriate statistical methodology to generalize your sample results to the population while accounting for sampling error.

Of course, that’s the simple version. In real-world experiments, you might need to form treatment and control groups, administer treatments, and reduce other sources of variation. In more complex cases, you might need to create a model of a process. There are many details in the process of making a statistical inference! Learn how to incorporate statistical inference into scientific studies.

Statistical inference requires specialized sampling methods that tend to produce representative samples. If the sample does not resemble the larger population you’re studying, you can’t trust any inferences drawn from it, so using an appropriate sampling method is crucial. Learn more about Sampling Methods and Representative Samples.

After obtaining a representative sample, you’ll need to use a procedure that can make statistical inferences. While you might have a sample that looks similar to the population, it will never be identical to it. Statisticians refer to the differences between a sample and the population as sampling error. Any effect or relationship you see in your sample might actually be sampling error rather than a true finding. Inferential statistics incorporate sampling error into the results. Learn more about Sampling Error.

Common Inferential Methods

The following are four standard procedures that can make statistical inferences.

  • Hypothesis Testing: Uses representative samples to assess two mutually exclusive hypotheses about a population. Statistically significant results suggest that the sample effect or relationship exists in the population after accounting for sampling error.
  • Confidence Intervals: A range of values likely containing the population value. This procedure evaluates the sampling error and adds a margin around the estimate, giving an idea of how wrong it might be.
  • Margin of Error: Comparable to a confidence interval but usually for survey results.
  • Regression Modeling: An estimate of the process that generates the outcomes in the population.

Example Statistical Inference

Let’s look at a real flu vaccine study for an example of making a statistical inference. The scientists for this study want to evaluate whether a flu vaccine effectively reduces flu cases in the general population. However, the general population is much too large to include in their study, so they must use a representative sample to make a statistical inference about the vaccine’s effectiveness.

The Monto et al. study* covers the 2007-2008 flu season, following participants aged 18-49 from January to April. The researchers selected ~1,100 participants and randomly assigned them to the vaccine and placebo groups. After tracking them through the flu season, they recorded the number of flu infections in each group, as shown below.

Treatment   Flu count   Group size   Percent infections
Placebo        35          325           10.8%
Vaccine        28          813            3.4%
Effect                                    7.4%

Monto Study Findings

From the table above, 10.8% of the unvaccinated got the flu, while only 3.4% of the vaccinated caught it. The apparent effect of the vaccine is 10.8% – 3.4% = 7.4%. While that seems to show a vaccine effect, it might be a fluke due to sampling error. We’re assessing only 1,100 people out of a population of millions. We need to use a hypothesis test and confidence interval (CI) to make a proper statistical inference.

While the details go beyond this introductory post, here are two statistical inferences we can make using a 2-sample proportions test and CI.

  1. The p-value of the test is < 0.0005. The evidence strongly favors the hypothesis that the vaccine effectively reduces flu infections in the population after accounting for sampling error.
  2. Additionally, the confidence interval for the effect size is 3.7% to 10.9%. Our study found a sample effect of 7.4%, but it is unlikely to equal the population effect exactly due to sampling error. The CI identifies a range that is likely to include the population effect.
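
Those two inferences can be sketched in Python using the normal approximation for a 2-sample proportions test and CI (the published analysis may have used different software or corrections):

```python
from math import sqrt, erfc

# Flu counts from the Monto et al. table above.
x1, n1 = 35, 325   # placebo: infections, group size
x2, n2 = 28, 813   # vaccine: infections, group size

p1, p2 = x1 / n1, x2 / n2
effect = p1 - p2                      # sample effect, about 7.4 points

# Two-sample proportions z-test (pooled standard error under H0: p1 = p2).
p_pool = (x1 + x2) / (n1 + n2)
se_pool = sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))
z = effect / se_pool
p_value = erfc(abs(z) / sqrt(2))      # two-sided p value

# 95% confidence interval for the difference (unpooled standard error).
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci = (effect - 1.96 * se, effect + 1.96 * se)

print(round(effect, 3), round(z, 2), [round(v, 3) for v in ci])
```

The z statistic comes out near 4.9, the p value well under 0.0005, and the CI close to 3.7% to 10.9%, matching the inferences above.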

For more information about this and other flu vaccine studies, read my post about Flu Vaccine Effectiveness.

In conclusion, by using a representative sample and the proper methodology, we made a statistical inference about vaccine effectiveness in an entire population.

Reference

Monto AS, Ohmit SE, Petrie JG, Johnson E, Truscon R, Teich E, Rotthoff J, Boulton M, Victor JC. Comparative efficacy of inactivated and live attenuated influenza vaccines. N Engl J Med. 2009;361(13):1260-7.

Filed Under: Hypothesis Testing Tagged With: analysis example, conceptual

T Distribution: Definition & Uses

By Jim Frost

What is the T Distribution?

The t distribution is a continuous probability distribution that is symmetric and bell-shaped like the normal distribution but with a shorter peak and thicker tails. It was designed to factor in the greater uncertainty associated with small sample sizes.

The t distribution describes the variability of the distances between sample means and the population mean when the population standard deviation is unknown and the data approximately follow the normal distribution. This distribution has only one parameter, the degrees of freedom, based on (but not equal to) the sample size.
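
To see the "shorter peak, thicker tails" behavior numerically, here is a sketch of the t density built from the textbook formula (Python's standard library has no t distribution, so this implements it directly with math.gamma):

```python
from math import gamma, sqrt, pi, exp

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Standard normal density, for comparison."""
    return exp(-x * x / 2) / sqrt(2 * pi)

# Shorter peak and thicker tails than the standard normal:
print(t_pdf(0, df=5) < normal_pdf(0))   # True  (lower peak at the center)
print(t_pdf(3, df=5) > normal_pdf(3))   # True  (more mass out in the tail)
```

As the degrees of freedom grow, the t density converges toward the standard normal, which is why the distinction matters most for small samples.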

Filed Under: Probability Tagged With: conceptual, distributions, graphs

Representative Sample: Definition, Uses & Methods

By Jim Frost

What is a Representative Sample?

A representative sample is one where the individuals in the sample reflect the properties of an entire population. Use a representative sample when you want to generalize the results from the sample to a population. By studying a representative sample, you can approximate the properties of the population from which it was drawn.

Filed Under: Basics Tagged With: conceptual, experimental design, sampling methods

Difference Between Standard Deviation and Standard Error

By Jim Frost

The difference between a standard deviation and a standard error can seem murky. Let’s clear that up in this post!

Standard deviation (SD) and standard error (SE) both measure variability. High values of either statistic indicate more dispersion. However, that’s where the similarities end. The standard deviation is not the same as the standard error.
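
One relationship worth knowing up front: the standard error of the mean equals the standard deviation divided by the square root of the sample size, so the SD describes the spread of individual observations while the SE describes the precision of the sample mean. A minimal Python sketch with made-up data:

```python
from math import sqrt

def sd(xs):
    """Sample standard deviation: spread of the individual observations."""
    m = sum(xs) / len(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def se(xs):
    """Standard error of the mean: SD shrunk by the square root of n."""
    return sd(xs) / sqrt(len(xs))

data = [12, 15, 11, 14, 13, 16, 12, 15]
print(round(sd(data), 3), round(se(data), 3))  # the SE is always smaller
```

Collecting more data shrinks the SE toward zero but leaves the SD describing the same underlying spread.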

Filed Under: Basics Tagged With: conceptual, distributions, graphs

How to Find the P value: Process and Calculations

By Jim Frost

P values are everywhere in statistics. They’re in all types of hypothesis tests. But how do you calculate a p value? Unsurprisingly, the precise calculations depend on the test. However, there is a general process that applies to finding a p value.

In this post, you’ll learn how to find the p value. I’ll start by showing you the general process for all hypothesis tests. Then I’ll move on to a step-by-step example showing the calculations for a p value. This post includes a calculator so you can apply what you learn.

Filed Under: Hypothesis Testing



    Copyright © 2023 · Jim Frost · Privacy Policy