• Skip to secondary menu
  • Skip to main content
  • Skip to primary sidebar
  • Home
  • About Me
  • Contact Me

Statistics By Jim

Making statistics intuitive

  • Basics
  • Hypothesis Testing
  • Regression
  • ANOVA
  • Probability
  • Time Series
  • Fun
  • Glossary
  • My Store

Excel

Descriptive Statistics in Excel

By Jim Frost 17 Comments

Descriptive statistics summarize your dataset, painting a picture of its properties. These properties include various central tendency and variability measures, distribution properties, outlier detection, and other information. Unlike inferential statistics, descriptive statistics only describe your dataset’s characteristics and do not attempt to generalize from a sample to a population. [Read more…] about Descriptive Statistics in Excel

Filed Under: Basics Tagged With: analysis example, Excel, interpreting results

How to Perform Regression Analysis using Excel

By Jim Frost 14 Comments

Excel can perform various statistical analyses, including regression analysis. It is a great option because nearly everyone can access Excel. This post is an excellent introduction to performing and interpreting regression analysis, even if Excel isn’t your primary statistical software package.

[Read more…] about How to Perform Regression Analysis using Excel

Filed Under: Regression Tagged With: analysis example, Excel, interpreting results

Using Moving Averages to Smooth Time Series Data

By Jim Frost 9 Comments

Moving averages can smooth time series data, reveal underlying trends, and identify components for use in statistical modeling. Smoothing is the process of removing random variations that appear as coarseness in a plot of raw time series data. It reduces the noise to emphasize the signal that can contain trends and cycles. Analysts also refer to the smoothing process as filtering the data.

Developed in the 1920s, the moving average is the oldest process for smoothing data and continues to be a useful tool today. This method relies on the notion that observations close in time are likely to have similar values. Consequently, the averaging removes random variation, or noise, from the data.

In this post, I look at using moving averages to smooth time series data. This method is the simplest form of smoothing. In future posts, I’ll explore more complex ways of smoothing.

What are Moving Averages?

Moving averages are a series of averages calculated using sequential segments of data points over a series of values. They have a length, which defines the number of data points to include in each average.

One-sided moving averages

One-sided moving averages include the current and previous observations for each average. For example, the formula for a moving average (MA) of X at time t with a length of 7 is the following:

MA_{7} = {\displaystyle \frac {X_{t-6}+X_{t-5}+X_{t-4}+X_{t-3}+X_{t-2}+X_{t-1}+X_{t}}{7}}

In the graph, the circled one-sided moving average uses the seven observations that fall within the red interval. The subsequent moving average shifts the interval to the right by one observation. And, so on.

Illustration of a one-sided moving average.

Centered moving averages

Centered moving averages include both previous and future observations to calculate the average at a given point in time. In other words, centered moving averages use observations that surround it in both directions and, consequently, are also known as two-sided moving averages. The formula for a centered moving average of X at time t with a length of 7 is the following:

MA_{7} = {\displaystyle \frac {X_{t-3}+X_{t-2}+X_{t-1}+X_{t}+X_{t+1}+X_{t+2}+X_{t+3}}{7}}

In the plot below, the circled centered moving average uses the seven observations in the red interval. The next moving average shifts the interval to the right by one.

Illustration of a centered moving average.

Centered intervals work out evenly for an odd number of observations because they allow for an equal amount of observations before and after the moving average. However, when you have an even length, the calculations must adjust for that by using a weighted moving average. For example, the formula for a centered moving average with a length of 8 is as follows:

MA_{8} = {\displaystyle \frac {(0.5*X_{t-4})+X_{t-3}+X_{t-2}+X_{t-1}+X_{t}+X_{t+1}+X_{t+2}+X_{t+3}+(0.5*X_{t+4})}{8}}

For a length of 8, the calculations incorporate the formula for a length of 7 (t-3 through t+3). Then, it extends the segment by one observation in both directions (t-4 and t+4). However, those two observations each have half the weight, which yields the equivalent of 7 + 2*0.5 = 8 data points.

Using Moving Averages to Reveal Trends

Moving averages can remove seasonal patterns to reveal underlying trends. In future posts, I’ll write more about time series components and incorporating them into models for accurate forecasting. For now, we’ll work through an example to visually assess a trend.

When there is a seasonal pattern in your data and you want to remove it, set the length of your moving average to equal the pattern’s length. If there is no seasonal pattern in your data, choose a length that makes sense. Longer lengths will produce smoother lines.

Note that the term “seasonal” pattern doesn’t necessarily indicate a meteorological season. Instead, it refers to a repeating pattern that has a fixed length in your data.

Time Series Example: Daily COVID-19 Deaths in Florida

For our example, I’ll use daily COVID-19 deaths in the State of Florida. The time series plot below displays a recurring pattern in the number of daily deaths.

Time series plot of Florida's daily COVID-19 deaths.

This pattern likely reflects a data artifact. We know the coronavirus does not operate on a seven-day weekly schedule! Instead, it must reflect some human-based scheduling factor that influences when causes of death are determined and recorded. Some of these activities must be less likely to occur on weekends because the lowest day of the week is almost always Sunday, and weekends, in general, tend to be low. Tuesdays are often the highest day of the week. Perhaps that is when the weekend backlog shows up in the data?

Because of this seasonal pattern, the number of recorded deaths for a particular day depends on the day of the week you’re evaluating. Let’s remove this season pattern to reveal the underlying trend component. The original data are from Johns Hopkins University. Download my Excel spreadsheet: Florida Deaths Time Series.

Time series plot with moving average of daily COVID-19 deaths in Florida.

The graph displays one-sided moving averages with a length of 7 days for these data. Notice how the seasonal pattern is gone and the underlying trend is visible. Each moving average point is the daily average of the past seven days. We can look at any date, and the day of the week no longer plays a role. We can see that the trend increases up to April 17, 2020. It plateaus, with a slight decline, until around June 22nd. Since then, there is an upward trend that appears to steepen at the end.

Smoothing time series data helps reveal the underlying trends in your data. That process can aid in the simple visual assessment of the data, as seen in this article. However, it can also help you fit the best time series model to your data. The moving average is a simple but very effective calculation!

Filed Under: Time Series Tagged With: analysis example, conceptual, Excel

Answering the Birthday Problem in Statistics

By Jim Frost 15 Comments

The Birthday Problem in statistics asks, how many people do you need in a group to have a 50% chance that at least two people will share a birthday? Go ahead and think about that for a moment. The answer surprises many people. We’ll get to that shortly.

In this post, I’ll not only answer the birthday paradox, but I’ll also show you how to calculate the probabilities for any size group, run a computer simulation of it, and explain why the answer to the Birthday Problem is so surprising. [Read more…] about Answering the Birthday Problem in Statistics

Filed Under: Fun Tagged With: Excel, graphs, probability

How to Test Variances in Excel

By Jim Frost 3 Comments

Use a variances test to determine whether the variability of two groups differs. In this post, we’ll work through a two-sample variances test that Excel provides. Even if Excel isn’t your primary statistical software, this post provides an excellent introduction to variance tests. Excel refers to this analysis as F-Test Two-Sample for Variances. [Read more…] about How to Test Variances in Excel

Filed Under: Hypothesis Testing Tagged With: analysis example, Excel, interpreting results

How to do Two-Way ANOVA in Excel

By Jim Frost 22 Comments

Use two-way ANOVA to assess differences between the group means that are defined by two categorical factors. In this post, we’ll work through two-way ANOVA using Excel. Even if Excel isn’t your main statistical package, this post is an excellent introduction to two-way ANOVA. Excel refers to this analysis as two factor ANOVA. [Read more…] about How to do Two-Way ANOVA in Excel

Filed Under: ANOVA Tagged With: analysis example, Excel, interpreting results

How to do One-Way ANOVA in Excel

By Jim Frost 18 Comments

Use one-way ANOVA to determine whether the means of at least three groups are different. Excel refers to this test as Single Factor ANOVA. This post is an excellent introduction to performing and interpreting one-way ANOVA even if Excel isn’t your primary statistical software package. [Read more…] about How to do One-Way ANOVA in Excel

Filed Under: ANOVA Tagged With: analysis example, Excel, interpreting results

How to do t-Tests in Excel

By Jim Frost 68 Comments

Excel can perform various statistical analyses, including t-tests. It is an excellent option because nearly everyone can access Excel. This post is a great introduction to performing and interpreting t-tests even if Excel isn’t your primary statistical software package.

In this post, I provide step-by-step instructions for using Excel to perform t-tests. Importantly, I also show you how to select the correct form of t-test, choose the right options, and interpret the results. I also include links to additional resources I’ve written, which present clear explanations of relevant t-test concepts that you won’t find in Excel’s documentation. And, I use an example dataset for us to work through and interpret together! [Read more…] about How to do t-Tests in Excel

Filed Under: Hypothesis Testing Tagged With: analysis example, Excel, interpreting results

Primary Sidebar

Meet Jim

I’ll help you intuitively understand statistics by focusing on concepts and using plain English so you can concentrate on understanding your results.

Read More…

Buy My Introduction to Statistics eBook!

New! Buy My Hypothesis Testing eBook!

Buy My Regression eBook!

Subscribe by Email

Enter your email address to receive notifications of new posts by email.

    I won't send you spam. Unsubscribe at any time.

    Follow Me

    • FacebookFacebook
    • RSS FeedRSS Feed
    • TwitterTwitter
    • Popular
    • Latest
    Popular
    • How To Interpret R-squared in Regression Analysis
    • How to Interpret P-values and Coefficients in Regression Analysis
    • Measures of Central Tendency: Mean, Median, and Mode
    • Normal Distribution in Statistics
    • Multicollinearity in Regression Analysis: Problems, Detection, and Solutions
    • How to Interpret the F-test of Overall Significance in Regression Analysis
    • Understanding Interaction Effects in Statistics
    Latest
    • Descriptive Statistics in Excel
    • Using Contingency Tables to Calculate Probabilities
    • Probability Fundamentals
    • Using Applied Statistics to Expand Human Knowledge
    • Variance Inflation Factors (VIFs)
    • Assessing a COVID-19 Vaccination Experiment and Its Results
    • P-Values, Error Rates, and False Positives

    Recent Comments

    • Jim Frost on The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates
    • RABIA NOUSHEEN on When Do You Need to Standardize the Variables in a Regression Model?
    • Patrick on The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates
    • Prima Silvestre on Descriptive Statistics in Excel
    • Jim Frost on 5 Ways to Find Outliers in Your Data

    Copyright © 2021 · Jim Frost · Privacy Policy