
Statistics By Jim

Making statistics intuitive


Chi-Square Goodness of Fit Test: Uses & Examples

By Jim Frost 2 Comments

The chi-square goodness of fit test evaluates whether proportions of categorical or discrete outcomes in a sample follow a population distribution with hypothesized proportions. In other words, when you draw a random sample, do the observed proportions follow the values that theory suggests?
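As a minimal sketch of how this test works in practice, here is a SciPy version with hypothetical die-roll counts (the data are made up for illustration):

```python
from scipy import stats

# Hypothetical counts from 120 rolls of a die we suspect is unfair.
observed = [25, 18, 21, 16, 22, 18]

# Under the null hypothesis of a fair die, each face has probability 1/6.
expected = [sum(observed) / 6] * 6   # 20 rolls per face

# The goodness-of-fit statistic is the sum of (O - E)^2 / E over categories.
chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)

print(f"chi-square = {chi2:.3f}, p-value = {p_value:.3f}")
```

A large p-value here would indicate the observed proportions are consistent with a fair die.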


Inter-Rater Reliability: Definition, Examples & Assessing

By Jim Frost Leave a Comment

What is Inter-Rater Reliability?

Inter-rater reliability measures the agreement between subjective ratings by multiple raters, inspectors, judges, or appraisers. It answers the question: is the rating system consistent? High inter-rater reliability indicates that multiple raters’ ratings for the same item are consistent. Conversely, low reliability means they are inconsistent.


Linear Regression

By Jim Frost 7 Comments

What is Linear Regression?

Linear regression models the relationships between at least one explanatory variable and an outcome variable. These variables are known as the independent and dependent variables, respectively. When there is one independent variable (IV), the procedure is known as simple linear regression. When there are two or more IVs, statisticians refer to it as multiple regression.


5 Number Summary: Definition, Finding & Using

By Jim Frost 6 Comments

What is the 5 Number Summary?

The 5 number summary is an exploratory data analysis tool that provides insight into the distribution of values for one variable. Collectively, this set of statistics describes where data values occur, their central tendency, variability, and the general shape of their distribution.
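The five statistics are the minimum, first quartile, median, third quartile, and maximum. A quick sketch with NumPy on a hypothetical sample (note that software packages use slightly different quartile conventions, so values can differ a little between tools):

```python
import numpy as np

# Hypothetical variable: daily customer counts.
data = np.array([12, 15, 18, 21, 22, 25, 27, 31, 40])

five_numbers = {
    "minimum": np.min(data),
    "Q1":      np.percentile(data, 25),   # first quartile
    "median":  np.median(data),
    "Q3":      np.percentile(data, 75),   # third quartile
    "maximum": np.max(data),
}
print(five_numbers)
```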


Paired T Test: Definition & When to Use It

By Jim Frost 2 Comments

What is a Paired T Test?

Use a paired t test when each subject has a pair of measurements, such as a before and after score. A paired t test determines whether the mean change for these pairs is significantly different from zero. This test is an inferential statistics procedure because it uses samples to draw conclusions about populations.

A paired t test is also known as a paired samples t test or a dependent samples t test. These names reflect the fact that the two samples are paired, or dependent, because they contain the same subjects. Conversely, an independent samples t test contains different subjects in the two samples.
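A minimal sketch with SciPy, using hypothetical before/after scores for the same subjects:

```python
from scipy import stats

# Hypothetical before/after scores for the same eight subjects.
before = [72, 68, 81, 75, 70, 77, 64, 80]
after  = [75, 70, 84, 74, 76, 80, 69, 83]

# Paired t test: is the mean of the pairwise differences zero?
result = stats.ttest_rel(after, before)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```

Equivalently, a paired t test is a one-sample t test on the differences, which is why the pairing matters: it removes subject-to-subject variability from the comparison.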


Independent Samples T Test: Definition, Using & Interpreting

By Jim Frost 2 Comments

What is an Independent Samples T Test?

Use an independent samples t test when you want to compare the means of precisely two groups—no more and no less! Typically, you perform this test to determine whether two population means are different. This procedure is an inferential statistical hypothesis test, meaning it uses samples to draw conclusions about populations. The independent samples t test is also known as the two sample t test.
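A quick SciPy sketch on hypothetical measurements from two independently sampled groups:

```python
from scipy import stats

# Hypothetical strength measurements for two independently sampled groups.
group_a = [34.1, 36.2, 33.8, 35.0, 34.7, 36.9, 35.4]
group_b = [32.0, 33.1, 31.8, 33.9, 32.5, 31.2, 33.0]

# Two-sample t test. equal_var=False gives Welch's version, which does
# not assume the two groups share the same variance.
result = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```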


Conditional Probability: Definition, Formula & Examples

By Jim Frost 4 Comments

What is Conditional Probability?

A conditional probability is the likelihood of an event occurring given that another event has already happened. Conditional probabilities allow you to evaluate how prior information affects probabilities. For example, what is the probability of A given that B has occurred? When you incorporate existing facts into the calculations, it can change the likelihood of an outcome.
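The defining formula is P(A | B) = P(A and B) / P(B). A small sketch that verifies it by enumerating equally likely die rolls:

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]

def probability(event):
    """P(event) by counting equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# P(roll is 6 | roll is even) = P(6 and even) / P(even)
p_even = probability(lambda o: o % 2 == 0)
p_six_and_even = probability(lambda o: o == 6 and o % 2 == 0)
p_six_given_even = p_six_and_even / p_even

print(p_six_given_even)  # → 1/3
```

Learning the roll is even shrinks the sample space from six outcomes to three, raising the probability of a six from 1/6 to 1/3.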


Scatterplots: Using, Examples, and Interpreting

By Jim Frost 2 Comments

Use scatterplots to show relationships between pairs of continuous variables. These graphs display symbols at the X, Y coordinates of the data points for the paired variables. Scatterplots are also known as scattergrams and scatter charts.


Pie Charts: Using, Examples, and Interpreting

By Jim Frost Leave a Comment

Use pie charts to compare the sizes of categories to the entire dataset. To create a pie chart, you must have a categorical variable that divides your data into groups. These graphs consist of a circle (i.e., the pie) with slices representing subgroups. The size of each slice is proportional to the relative size of each category out of the whole.


Bar Charts: Using, Examples, and Interpreting

By Jim Frost 4 Comments

Use bar charts to compare categories when you have at least one categorical or discrete variable. Each bar represents a summary value for one discrete level, where longer bars indicate higher values. Types of summary values include counts, sums, means, and standard deviations. Bar charts are also known as bar graphs.


Line Charts: Using, Examples, and Interpreting

By Jim Frost 2 Comments

Use line charts to display a series of data points that are connected by lines. Analysts use line charts to emphasize changes in a metric on the vertical Y-axis by another variable on the horizontal X-axis. Often, the X-axis reflects time, but not always. Line charts are also known as line plots.


Dot Plots: Using, Examples, and Interpreting

By Jim Frost Leave a Comment

Use dot plots to display the distribution of your sample data when you have continuous variables. These graphs stack dots along the horizontal X-axis to represent the frequencies of different values. More dots indicate greater frequency. Each dot represents a set number of observations.


Empirical Cumulative Distribution Function (CDF) Plots

By Jim Frost Leave a Comment

Use an empirical cumulative distribution function plot to display the data points in your sample from lowest to highest against their percentiles. These graphs require continuous variables and allow you to derive percentiles and other distribution properties. This function is also known as the empirical CDF or ECDF.
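The ECDF itself is easy to compute: at each value x, it is simply the fraction of observations at or below x. A sketch on hypothetical measurements:

```python
import numpy as np

# Hypothetical continuous measurements.
data = np.array([4.2, 1.1, 3.5, 2.8, 5.0, 3.1, 2.2, 4.7])

# The empirical CDF at x is the fraction of observations <= x, so the
# sorted values plot against the cumulative proportions 1/n, 2/n, ..., n/n.
x = np.sort(data)
ecdf = np.arange(1, len(x) + 1) / len(x)

# Read off an approximate percentile: the smallest x with ECDF >= 0.5.
median_estimate = x[np.searchsorted(ecdf, 0.5)]

print(list(zip(x, ecdf)))
```

Plotting `ecdf` against `x` as a step function reproduces the ECDF chart described above.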


Using Excel to Calculate Correlation

By Jim Frost Leave a Comment

Excel can calculate correlation coefficients and a variety of other statistical analyses. Even if you don’t use Excel regularly, this post is an excellent introduction to calculating and interpreting correlation.

In this post, I provide step-by-step instructions for having Excel calculate Pearson’s correlation coefficient, and I’ll show you how to interpret the results. Additionally, I include links to relevant statistical resources I’ve written that provide intuitive explanations. Together, we’ll analyze and interpret an example dataset!
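If you want to double-check an Excel CORREL result outside Excel, NumPy computes the same Pearson’s r. A quick sketch with made-up columns:

```python
import numpy as np

# Hypothetical columns you might have in an Excel sheet.
height = np.array([63, 65, 66, 68, 70, 72, 74])
weight = np.array([127, 140, 139, 155, 162, 170, 184])

# np.corrcoef returns the correlation matrix; the off-diagonal entry is
# Pearson's r, the same quantity Excel's CORREL function reports.
r = np.corrcoef(height, weight)[0, 1]
print(f"Pearson's r = {r:.4f}")
```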


Autocorrelation and Partial Autocorrelation in Time Series Data

By Jim Frost 4 Comments

Autocorrelation is the correlation between two observations at different points in a time series. For example, values that are separated by an interval might have a strong positive or negative correlation. When these correlations are present, they indicate that past values influence the current value. Analysts use the autocorrelation and partial autocorrelation functions to understand the properties of time series data, fit the appropriate models, and make forecasts.

In this post, I cover both the autocorrelation function and partial autocorrelation function. You’ll learn about the differences between these functions and what they can tell you about your data. In later posts, I’ll show you how to incorporate this information in regression models of time series data and other time-series analyses.

Autocorrelation and Partial Autocorrelation Basics

Autocorrelation is the correlation between two values in a time series. In other words, the time series data correlate with themselves—hence, the name. We talk about these correlations using the term “lags.” Analysts record time-series data by measuring a characteristic at evenly spaced intervals—such as daily, monthly, or yearly. The number of intervals between the two observations is the lag. For example, the lag between the current and previous observation is one. If you go back one more interval, the lag is two, and so on.

In mathematical terms, the observations y_t and y_(t−k) are separated by k time units, where k is the lag. This lag can be days, quarters, or years depending on the nature of the data. When k = 1, you’re assessing adjacent observations. For each lag, there is a correlation.

The autocorrelation function (ACF) assesses the correlation between observations in a time series for a set of lags. The ACF for time series y is given by Corr(y_t, y_(t−k)) for k = 1, 2, ….
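To make the lag idea concrete, here is a sketch that estimates the lag-k autocorrelation by correlating the series with a shifted copy of itself. (The textbook sample ACF, as computed by time series software, centers both segments on the overall mean; this shifted-copy correlation is a close approximation. The monthly series is made up.)

```python
import numpy as np

def autocorrelation(y, k):
    """Approximate lag-k autocorrelation: correlate the series with a
    copy of itself shifted k time units."""
    return np.corrcoef(y[:-k], y[k:])[0, 1]

# Made-up monthly series with a clear upward trend.
y = np.array([10, 12, 13, 15, 18, 19, 22, 24, 25, 28, 30, 33], dtype=float)

for k in (1, 2, 3):
    print(f"lag {k}: r = {autocorrelation(y, k):.3f}")
```

Because the series trends upward, adjacent observations are similar and the short-lag correlations come out strongly positive, exactly the trend signature discussed below.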

Analysts typically use graphs to display this function.

Related posts: Time Series Analysis Introduction and Interpreting Correlations

Autocorrelation Function (ACF)

Use the autocorrelation function (ACF) to identify which lags have significant correlations, understand the patterns and properties of the time series, and then use that information to model the time series data. From the ACF, you can assess the randomness and stationarity of a time series. You can also determine whether trends and seasonal patterns are present.

In an ACF plot, each bar represents the size and direction of the correlation. Bars that extend across the red line are statistically significant.

Randomness/White Noise

For random data, autocorrelations should be near zero for all lags. Analysts also refer to this condition as white noise. Non-random data have at least one significant lag. When the data are not random, it’s a good indication that you need to use a time series analysis or incorporate lags into a regression analysis to model the data appropriately.

Autocorrelation function plot for random data.

This ACF plot indicates that these time series data are random.

Stationarity

Stationarity means that the time series has no trend, a constant variance, a constant autocorrelation pattern, and no seasonal pattern. The autocorrelation function declines to near zero rapidly for a stationary time series. In contrast, the ACF drops slowly for a non-stationary time series.

Autocorrelation function plot of stationary time series data.

In this chart for a stationary time series, notice how the autocorrelations decline to non-significant levels quickly.

Trends

When trends are present in a time series, shorter lags typically have large positive correlations because observations closer in time tend to have similar values. The correlations taper off slowly as the lags increase.

Autocorrelations plot for metal sales that indicates a trend is present.

In this ACF plot for metal sales, the autocorrelations decline slowly. The first five lags are significant.

Seasonality

When seasonal patterns are present, the autocorrelations are larger for lags at multiples of the seasonal frequency than for other lags.

When a time series has both a trend and seasonality, the ACF plot displays a mixture of both effects. That’s the case in the autocorrelation function plot for the carbon dioxide (CO2) dataset from NIST. This dataset contains monthly mean CO2 measurements at the Mauna Loa Observatory. Download the CO2_Data.

Autocorrelation plot of carbon dioxide data.

Notice how you can see the wavy correlations for the seasonal pattern and the slowly diminishing lags of a trend.

Partial Autocorrelation Function (PACF)

The partial autocorrelation function is similar to the ACF except that it displays only the correlation between two observations that the shorter lags between those observations do not explain. For example, the partial autocorrelation for lag 3 is only the correlation that lags 1 and 2 do not explain. In other words, the partial correlation for each lag is the unique correlation between those two observations after partialling out the intervening correlations.
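One way to see what partialling out means: the lag-k partial autocorrelation is, approximately, the coefficient on y_(t−k) when you regress y_t on all of the lags y_(t−1), …, y_(t−k). This is a sketch of that regression view, not the exact algorithm statistical software uses; the white-noise series is generated with a fixed seed for illustration:

```python
import numpy as np

def pacf_via_regression(y, k):
    """Approximate lag-k partial autocorrelation: the coefficient on
    y[t-k] when y[t] is regressed on y[t-1], ..., y[t-k] plus an
    intercept. The shorter lags in the model 'explain away' their share
    of the correlation, leaving only lag k's unique contribution."""
    rows = range(k, len(y))
    X = np.array([[1.0] + [y[t - j] for j in range(1, k + 1)] for t in rows])
    target = np.array([y[t] for t in rows])
    coefs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coefs[-1]   # coefficient on the lag-k term

# Seeded white noise: every partial autocorrelation should be near zero.
rng = np.random.default_rng(seed=42)
noise = rng.standard_normal(500)

for k in (1, 2, 3):
    print(f"lag {k}: pacf ≈ {pacf_via_regression(noise, k):.3f}")
```

This regression framing is also why the PACF is the natural tool for choosing autoregressive model orders: significant partial autocorrelations mark the lags that still add unique explanatory power.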

As you saw, the autocorrelation function helps assess the properties of a time series. In contrast, the partial autocorrelation function (PACF) is more useful during the specification process for an autoregressive model. Analysts use partial autocorrelation plots to specify regression models with time series data and Auto Regressive Integrated Moving Average (ARIMA) models. I’ll focus on that aspect in posts about those methods.

Related post: Using Moving Averages to Smooth Time Series Data

For this post, I’ll show you a quick example of a PACF plot. Typically, you will use the ACF to determine whether an autoregressive model is appropriate. If it is, you then use the PACF to help you choose the model terms.

This partial autocorrelation plot displays data from the southern oscillations dataset from NIST. The southern oscillations refer to changes in the barometric pressure near Tahiti that predict El Niño. Download the southern_oscillations_data.

Partial autocorrelation plot for the southern oscillation data.

On the graph, the partial autocorrelations for lags 1 and 2 are statistically significant. The subsequent lags are nearly significant. Consequently, this PACF suggests fitting either a second- or third-order autoregressive model.

By assessing the autocorrelation and partial autocorrelation patterns in your data, you can understand the nature of your time series and model it!


Using Combinations to Calculate Probabilities

By Jim Frost 6 Comments

Combinations in probability theory and other areas of mathematics refer to a sequence of outcomes where the order does not matter. For example, when you’re ordering a pizza, it doesn’t matter whether you order it with ham, mushrooms, and olives or olives, mushrooms, and ham. You’re getting the same pizza!
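Python’s standard library counts combinations directly, which makes probability calculations one-liners. A sketch with a hypothetical pizza menu:

```python
import math

# Number of ways to choose 3 toppings from a menu of 8, order irrelevant.
# math.comb(n, k) computes n! / (k! * (n - k)!).
n_combinations = math.comb(8, 3)

# Probability that a randomly assembled 3-topping pizza happens to match
# one specific favorite combination.
p_favorite = 1 / n_combinations

print(n_combinations, f"{p_favorite:.4f}")
```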


Using Permutations to Calculate Probabilities

By Jim Frost 7 Comments

Permutations in probability theory and other branches of mathematics refer to sequences of outcomes where the order matters. For example, 9-6-8-4 is a permutation of a four-digit PIN because the order of numbers is crucial. When calculating probabilities, it’s frequently necessary to calculate the number of possible permutations to determine an event’s probability.

In this post, I explain permutations and show how to calculate the number of permutations both with repetition and without repetition. Finally, we’ll work through a step-by-step example problem that uses permutations to calculate a probability.
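Continuing the PIN example, here is a sketch of both counts in Python:

```python
import math

# Permutations WITHOUT repetition: ordered 4-digit codes drawn from the
# 10 digits with no digit reused: 10 * 9 * 8 * 7.
no_repeat = math.perm(10, 4)

# Permutations WITH repetition: each of the 4 positions can hold any of
# the 10 digits, so 10**4 possible PINs.
with_repeat = 10 ** 4

# Probability of guessing one specific PIN (repetition allowed) in one try.
p_guess = 1 / with_repeat

print(no_repeat, with_repeat, p_guess)
```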


Understanding Historians’ Rankings of U.S. Presidents using Regression Models

By Jim Frost 9 Comments

Historians rank the U.S. Presidents from best to worst using all the historical knowledge at their disposal. Frequently, groups such as C-SPAN ask these historians to rank the Presidents and average the results together to help reduce bias. The idea is to produce a set of rankings that incorporates a broad range of historians, a vast array of information, and a historical perspective. These rankings include informed assessments of each President’s effectiveness, leadership, moral authority, administrative skills, economic management, vision, and so on.


Spearman’s Correlation Explained

By Jim Frost 34 Comments

Spearman’s correlation in statistics is a nonparametric alternative to Pearson’s correlation. Use Spearman’s correlation for data that follow curvilinear, monotonic relationships and for ordinal data. Statisticians also refer to Spearman’s rank order correlation coefficient as Spearman’s ρ (rho).

In this post, I’ll cover what all that means so you know when and why you should use Spearman’s correlation instead of the more common Pearson’s correlation.
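A quick sketch of the difference, using SciPy on hypothetical curvilinear but monotonic data:

```python
from scipy import stats

# Hypothetical curvilinear, monotonic data: y rises ever faster with x.
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]

pearson_r = stats.pearsonr(x, y)[0]
spearman_rho = stats.spearmanr(x, y)[0]

# Spearman's rho is 1 because the relationship is perfectly monotonic,
# even though it is not linear, so Pearson's r falls below 1.
print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```

Because Spearman’s correlation works on ranks, any strictly increasing relationship, linear or not, yields rho = 1.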


Multiplication Rule for Calculating Probabilities

By Jim Frost 7 Comments

The multiplication rule in probability allows you to calculate the probability of multiple events occurring together using known probabilities of those events individually. There are two forms of this rule, the specific and general multiplication rules.

In this post, learn about when and how to use both the specific and general multiplication rules. Additionally, I’ll use and explain the standard notation for probabilities throughout, helping you learn how to interpret it. We’ll work through several example problems so you can see them in action. There’s even a bonus problem at the end!
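As a preview, here is a sketch of both forms of the rule on classic card and coin examples:

```python
from fractions import Fraction

# General multiplication rule (dependent events):
#   P(A and B) = P(A) * P(B | A)
# Example: draw two cards without replacement; both are aces.
p_first_ace = Fraction(4, 52)           # P(A): 4 aces among 52 cards
p_second_given_first = Fraction(3, 51)  # P(B | A): 3 aces left among 51
p_both_aces = p_first_ace * p_second_given_first

# Specific multiplication rule (independent events):
#   P(A and B) = P(A) * P(B)
# Example: two fair coin flips both land heads.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

print(p_both_aces, p_two_heads)  # → 1/221 1/4
```

The general rule always applies; the specific rule is the special case where knowing A occurred does not change the probability of B.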



    Copyright © 2022 · Jim Frost · Privacy Policy