Statistics By Jim

Making statistics intuitive

Using Moving Averages to Smooth Time Series Data

By Jim Frost

Moving averages can smooth time series data, reveal underlying trends, and identify components for use in statistical modeling. Smoothing is the process of removing random variations that appear as coarseness in a plot of raw time series data. It reduces the noise to emphasize the signal, which can contain trends and cycles. Analysts also refer to the smoothing process as filtering the data.

Developed in the 1920s, the moving average is the oldest process for smoothing data and continues to be a useful tool today. This method relies on the notion that observations close in time are likely to have similar values. Consequently, the averaging removes random variation, or noise, from the data.

In this post, I look at using moving averages to smooth time series data. This method is the simplest form of smoothing. In future posts, I’ll explore more complex ways of smoothing.

What are Moving Averages?

A moving average is a series of averages, each calculated from a consecutive segment of data points in a series of values. Its length defines the number of data points to include in each average.

One-sided moving averages

One-sided moving averages include the current and previous observations for each average. For example, the formula for a moving average (MA) of X at time t with a length of 7 is the following:

MA_{7} = \frac{X_{t-6}+X_{t-5}+X_{t-4}+X_{t-3}+X_{t-2}+X_{t-1}+X_{t}}{7}

In the graph, the circled one-sided moving average uses the seven observations that fall within the red interval. The subsequent moving average shifts the interval to the right by one observation, and so on.

Illustration of a one-sided moving average.
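
To make the one-sided calculation concrete, here is a minimal Python sketch. The post itself works in Excel, so the function name, the choice to return None where a full window is not yet available, and the sample numbers are all assumptions for illustration rather than part of the original analysis.

```python
def one_sided_moving_average(values, length=7):
    """Average the current observation and the (length - 1) before it.

    Positions without a full window of history are returned as None.
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    averages = []
    for t in range(len(values)):
        if t < length - 1:
            # Not enough earlier observations for a full window ending at t.
            averages.append(None)
        else:
            window = values[t - length + 1 : t + 1]  # X_{t-length+1} ... X_t
            averages.append(sum(window) / length)
    return averages


# Illustrative numbers only (not the Florida data).
series = [10, 12, 9, 14, 11, 13, 15, 17, 19]
smoothed = one_sided_moving_average(series, length=7)
print(smoothed)
# [None, None, None, None, None, None, 12.0, 13.0, 14.0]
```

The first average appears at the seventh observation because that is the first point with six earlier values behind it. Each later average drops the oldest observation and adds the newest one.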

Centered moving averages

Centered moving averages include both previous and future observations to calculate the average at a given point in time. In other words, a centered moving average uses the observations that surround it in both directions and, consequently, is also known as a two-sided moving average. The formula for a centered moving average of X at time t with a length of 7 is the following:

MA_{7} = \frac{X_{t-3}+X_{t-2}+X_{t-1}+X_{t}+X_{t+1}+X_{t+2}+X_{t+3}}{7}

In the plot below, the circled centered moving average uses the seven observations in the red interval. The next moving average shifts the interval to the right by one.

Illustration of a centered moving average.
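
The centered version for an odd length can be sketched the same way in Python. As before, the function name, the None placeholders at both ends of the series, and the sample numbers are assumptions made for this illustration.

```python
def centered_moving_average(values, length=7):
    """Average the observation at t with (length // 2) neighbors on each side.

    This sketch handles odd lengths only. Positions too close to either
    end of the series to have a full window are returned as None.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("this sketch handles odd lengths only")
    half = length // 2
    averages = []
    for t in range(len(values)):
        if t < half or t + half >= len(values):
            # Missing observations before or after t.
            averages.append(None)
        else:
            window = values[t - half : t + half + 1]  # X_{t-half} ... X_{t+half}
            averages.append(sum(window) / length)
    return averages


# Illustrative numbers only (not the Florida data).
series = [10, 12, 9, 14, 11, 13, 15, 17, 19]
centered = centered_moving_average(series, length=7)
print(centered)
# [None, None, None, 12.0, 13.0, 14.0, None, None, None]
```

Notice that a centered average cannot be calculated for the most recent observations because the future values it needs do not exist yet. A one-sided average does not have that limitation.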

Centered intervals work out evenly for an odd length because they allow for an equal number of observations before and after the center point. However, when you have an even length, the calculations must adjust for that by using a weighted moving average. For example, the formula for a centered moving average with a length of 8 is as follows:

MA_{8} = \frac{(0.5 \times X_{t-4})+X_{t-3}+X_{t-2}+X_{t-1}+X_{t}+X_{t+1}+X_{t+2}+X_{t+3}+(0.5 \times X_{t+4})}{8}

For a length of 8, the calculation starts with the seven observations used for a length of 7 (t-3 through t+3). Then, it extends the segment by one observation in both directions (t-4 and t+4). However, those two observations each have half the weight, which yields the equivalent of 7 + 2*0.5 = 8 data points.
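
The half-weighting for even lengths can be sketched in Python as follows. The function name, the None placeholders, and the sample numbers are assumptions for illustration only.

```python
def centered_moving_average_even(values, length=8):
    """Centered moving average for an even length.

    The two end observations (t - length/2 and t + length/2) each get
    half weight, and the observations between them get full weight.
    """
    if length < 2 or length % 2 != 0:
        raise ValueError("this sketch handles even lengths only")
    half = length // 2
    averages = []
    for t in range(len(values)):
        if t < half or t + half >= len(values):
            # Missing observations before or after t.
            averages.append(None)
        else:
            interior = sum(values[t - half + 1 : t + half])  # full weight
            endpoints = 0.5 * values[t - half] + 0.5 * values[t + half]
            averages.append((interior + endpoints) / length)
    return averages


# Illustrative numbers only (not the Florida data).
series = [4, 8, 6, 10, 12, 9, 7, 11, 13, 12]
weighted = centered_moving_average_even(series, length=8)
print(weighted)
# [None, None, None, None, 8.9375, 9.75, None, None, None, None]
```

Another way to see why the half weights make sense: the result at each point equals the average of two ordinary 8-point averages, one covering t-4 through t+3 and the other covering t-3 through t+4. For the first value above, those two averages are 8.375 and 9.5, and their mean is 8.9375.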

Using Moving Averages to Reveal Trends

Moving averages can remove seasonal patterns to reveal underlying trends. In future posts, I’ll write more about time series components and incorporating them into models for accurate forecasting. For now, we’ll work through an example to visually assess a trend.

When there is a seasonal pattern in your data and you want to remove it, set the length of your moving average to equal the pattern’s length. If there is no seasonal pattern in your data, choose a length that makes sense. Longer lengths will produce smoother lines.

Note that the term “seasonal” pattern doesn’t necessarily indicate a meteorological season. Instead, it refers to a repeating pattern that has a fixed length in your data.
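
To see why matching the length to the pattern removes it, here is a small Python sketch using made-up data: a straight-line trend plus a repeating 7-day pattern whose values sum to zero. The numbers and names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def trailing_average(values, length):
    """One-sided moving averages for every position that has a full window."""
    return [
        sum(values[t - length + 1 : t + 1]) / length
        for t in range(length - 1, len(values))
    ]


period = 7
# Made-up repeating pattern; its values sum to zero over one full cycle.
weekly_pattern = [-3, 2, 4, 1, 0, -1, -3]
# Synthetic series: a trend that rises by 2 per day, plus the weekly pattern.
raw = [20 + 2 * t + weekly_pattern[t % period] for t in range(21)]

smoothed = trailing_average(raw, period)
steps = [later - earlier for earlier, later in zip(smoothed, smoothed[1:])]

print(smoothed[:3])  # [26.0, 28.0, 30.0]
print(set(steps))    # {2.0}
```

Every 7-observation window contains each part of the pattern exactly once, so the pattern cancels out of every average and only the trend remains, rising by 2 per day. Also note that the first smoothed value, 26, equals the trend value from three days earlier (20 + 2*3). A one-sided moving average lags behind the trend by about half its length.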

Time Series Example: Daily COVID-19 Deaths in Florida

For our example, I’ll use daily COVID-19 deaths in the State of Florida. The time series plot below displays a recurring pattern in the number of daily deaths.

Time series plot of Florida's daily COVID-19 deaths.

This pattern likely reflects a data artifact. We know the coronavirus does not operate on a seven-day weekly schedule! Instead, it must reflect some human-based scheduling factor that influences when causes of death are determined and recorded. Some of these activities must be less likely to occur on weekends because the lowest day of the week is almost always Sunday, and weekends, in general, tend to be low. Tuesdays are often the highest day of the week. Perhaps that is when the weekend backlog shows up in the data?

Because of this seasonal pattern, the number of recorded deaths for a particular day depends on the day of the week you’re evaluating. Let’s remove this seasonal pattern to reveal the underlying trend component. The original data are from Johns Hopkins University. Download my Excel spreadsheet: Florida Deaths Time Series.

Time series plot with moving average of daily COVID-19 deaths in Florida.

The graph displays one-sided moving averages with a length of 7 days for these data. Notice how the seasonal pattern is gone and the underlying trend is visible. Each moving average point is the daily average of the past seven days. We can look at any date, and the day of the week no longer plays a role. The trend increases until April 17, 2020. It then plateaus, with a slight decline, until around June 22, 2020. After that, there is an upward trend that appears to steepen at the end.

Smoothing time series data helps reveal the underlying trends in your data. That process can aid in the simple visual assessment of the data, as seen in this article. However, it can also help you fit the best time series model to your data. The moving average is a simple but very effective calculation!

Filed Under: Time Series Tagged With: analysis example, conceptual, Excel

    Copyright © 2021 · Jim Frost · Privacy Policy