Statistics By Jim

Making statistics intuitive

Time Series Analysis

Rolling Average

By Jim Frost

What is a Rolling Average?

A rolling average, also called a moving average, is a technique that smooths out short-term fluctuations in data and highlights longer-term trends. It works by taking the average of a fixed number of consecutive observations (called the window) and then shifting that window forward one data point at a time.

For example, a 7-day rolling average of daily temperatures calculates the average temperature for Days 1 through 7, then Days 2 through 8, then Days 3 through 9, and so on. Each average becomes one point in the smoothed series.

Analysts commonly use rolling averages in time series data, such as stock prices, weather trends, or COVID-19 case counts. They reduce noise from day-to-day variation and help reveal the underlying pattern.

A good rule of thumb is to select a window size that reflects a natural cycle in the data. For instance, a 7-day window often works well for daily data with weekly patterns, such as retail sales or hospital admissions. Matching the window to the cycle helps the average filter out repeating ups and downs while preserving meaningful trends.
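
As a rough illustration of why matching the window to the cycle matters, the sketch below builds a made-up daily series (a flat level of 100 plus a repeating weekly pattern) and smooths it with two window sizes. The data values and the helper name `rolling_average` are assumptions chosen for illustration.

```python
def rolling_average(values, window):
    """Trailing rolling average: one mean per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical data: flat level of 100 plus a weekly pattern that sums to zero.
weekly_pattern = [3, 1, 0, -1, -2, -3, 2]
series = [100 + weekly_pattern[day % 7] for day in range(28)]

smoothed_7 = rolling_average(series, 7)  # window matches the weekly cycle
smoothed_3 = rolling_average(series, 3)  # window shorter than the cycle

# Every 7-day window covers exactly one full cycle, so the pattern cancels out.
print(min(smoothed_7), max(smoothed_7))
# A 3-day window covers only part of the cycle, so ups and downs remain.
print(min(smoothed_3), max(smoothed_3))
```

Because each 7-day window contains every day of the week exactly once, the weekly swings cancel and only the underlying level remains. The 3-day window still rises and falls with the weekly pattern.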

How to Calculate a Rolling Average

The following steps show you how to calculate a rolling average:

  1. Choose a window size such as 3, 7, or 30.
  2. For each position in the data series that has a full window of data, average the current value and the values immediately before it (a trailing average), dividing their sum by the window size.
  3. Slide the window forward by one data point and repeat.

For example, suppose we have daily sales data for seven days:
5, 8, 6, 7, 10, 12, 9

The 3-day rolling average values are:

  • Day 3: (5 + 8 + 6) ÷ 3 = 6.33
  • Day 4: (8 + 6 + 7) ÷ 3 = 7.00
  • Day 5: (6 + 7 + 10) ÷ 3 = 7.67
  • Day 6: (7 + 10 + 12) ÷ 3 = 9.67
  • Day 7: (10 + 12 + 9) ÷ 3 = 10.33

Each value reflects the local trend over a short window of time.
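
The calculation above can be sketched in a few lines of Python. The function name `rolling_average` is an assumption for illustration, and the sketch handles only the trailing version described in the steps.

```python
def rolling_average(values, window):
    """Trailing rolling average: one mean per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

daily_sales = [5, 8, 6, 7, 10, 12, 9]
three_day = rolling_average(daily_sales, 3)

# The first average lines up with Day 3 because Days 1 and 2 lack a full window.
for day, avg in enumerate(three_day, start=3):
    print(f"Day {day}: {avg:.2f}")
```

Note that a series of seven values yields only five 3-day averages, because the first two days do not have enough preceding data to fill the window.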

In practice, a 7-day rolling average of daily new COVID-19 deaths helped public health officials understand whether deaths were rising or falling without being misled by single-day spikes or dips due to reporting delays.

Time series plot with a rolling average of daily COVID-19 deaths in Florida.

Stationarity

By Jim Frost

In time series analysis, stationarity refers to a condition where the statistical properties of a time series (such as its mean, variance, and autocorrelation) remain constant over time. A stationary time series does not have trends, changing variability, or evolving seasonal patterns. This stability makes it easier to model and forecast, which is why many time series methods require stationary data. ARIMA, for instance, differences the series until it is stationary before modeling it.

There are two main types:

  • Strict stationarity: The entire distribution of the process remains unchanged over time.
  • Weak (or second-order) stationarity: Only the mean, variance, and autocorrelation structure are constant. This form is more commonly used in practice.

The graphs below show two time series:

  • The left panel displays a stationary time series (constant mean and variance).
  • The right panel shows a non-stationary time series (a drifting mean).

Time series plot that displays stationarity and a non-stationary series.

Most real-world time series data are not stationary. Trends, seasonal cycles, or changing volatility can make a series non-stationary. To address this, analysts often apply a process called differencing, which subtracts the previous value from each value, to remove trends or seasonality and transform the data into a stationary form. In some cases, multiple rounds of differencing may be needed.

For example, monthly sales data that steadily increase over time are not stationary due to the upward trend. But by applying differencing, the resulting series may fluctuate around a constant mean, making it suitable for models like ARIMA that assume stationarity. Identifying and correcting non-stationarity is a critical first step in reliable time series analysis.
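
Here is a minimal sketch of differencing, using invented numbers. The `difference` helper and both data series are assumptions for illustration. The second series has a curved trend to show a case where one round of differencing is not enough.

```python
def difference(values, rounds=1):
    """Apply differencing `rounds` times: each value minus the previous one."""
    for _ in range(rounds):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values

# Hypothetical monthly sales with a steady upward trend.
sales = [100, 112, 119, 133, 140, 151, 163, 170]
print(difference(sales))             # month-to-month changes, no upward drift

# Hypothetical series with a curved (quadratic) trend.
curved = [t * t for t in range(1, 8)]
print(difference(curved))            # still trending upward after one round
print(difference(curved, rounds=2))  # constant after a second round
```

Each round of differencing shortens the series by one observation, which is worth keeping in mind for short data sets.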

Autoregressive Model [AR Model]

By Jim Frost

An autoregressive model (AR model) is a statistical model that analyzes and forecasts time series data. These models express the current value of a time series as a weighted sum of its previous values. The term “autoregressive” reflects this idea of self-reference—each value is regressed on past values of the same variable. After fitting the model, analysts can use it to forecast future values by applying the same structure to the most recent data.

Analysts commonly use AR models in economics, environmental science, and engineering for forecasting trends in time series where past behavior carries forward. For example, an economist might use an AR model to forecast next month’s unemployment rate based on previous months’ rates, assuming the pattern persists over time.

How an AR Model Works

An AR model is written as AR(p), where p is the number of lagged observations included in the model. For example, an AR(2) model uses the two previous time points to predict the current one. The general form looks like this:

Yₜ = ϕ₁Yₜ₋₁ + ϕ₂Yₜ₋₂ + … + ϕₚYₜ₋ₚ + εₜ

Where:

  • Yₜ is the value at time t,
  • ϕ₁ through ϕₚ are the model coefficients (weights),
  • εₜ is the random error at time t.

Autoregressive models assume the data are stationary, meaning their statistical properties, like mean and variance, don’t change over time. If the data aren’t stationary, they often need to be transformed (e.g., by differencing) before fitting an AR model.

When fitting an autoregressive model, the goal is to estimate the coefficients (ϕ-values) that determine how much weight to assign to each previous value in the series. The model chooses these coefficients by minimizing the prediction errors—the differences between the actual values and the values predicted by the model. This process ensures that the model closely reflects the time-dependent structure in the data.
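
To make the idea of minimizing prediction errors concrete, the sketch below fits the simplest case, an AR(1) model with no constant, by ordinary least squares. Statistical software may use other estimation methods, so treat this as one simple approach rather than the standard one. The simulated series, the true coefficient of 0.6, and the function name `fit_ar1` are all assumptions for illustration.

```python
import random

def fit_ar1(series):
    """Least-squares estimate of phi in Y_t = phi * Y_(t-1) + error (no constant)."""
    prev, curr = series[:-1], series[1:]
    return sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)

# Simulate a hypothetical stationary AR(1) series with a known coefficient.
rng = random.Random(0)
true_phi = 0.6
series = [0.0]
for _ in range(2000):
    series.append(true_phi * series[-1] + rng.gauss(0, 1))

phi_hat = fit_ar1(series)
forecast = phi_hat * series[-1]  # one-step-ahead forecast from the latest value
print(round(phi_hat, 3), round(forecast, 3))
```

With a long enough series, the estimate should land close to the coefficient used in the simulation. The forecast then applies that estimated weight to the most recent observation.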

Example

For example, suppose an economist fits an AR(2) model to forecast monthly unemployment rates and the estimated model is:

Example formula for an autoregressive model.

If last month’s unemployment rate was 7.2% and the month before was 7.5%, the model predicts the current month’s rate as:

Example calculations using the autoregressive model formula to predict the next value.

This predicted value reflects how much influence recent months have on the current estimate, based on the learned weights from past data.
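
The sketch below shows the arithmetic behind an AR(2) forecast like this one. The constant and coefficients are hypothetical placeholders chosen for illustration and are not the estimates from the economist's model. Only the two input rates (7.2% and 7.5%) come from the example.

```python
def ar2_forecast(constant, phi1, phi2, last, second_last):
    """AR(2) prediction: constant + phi1 * Y_(t-1) + phi2 * Y_(t-2)."""
    return constant + phi1 * last + phi2 * second_last

# Hypothetical values (NOT the fitted coefficients from the example above).
constant, phi1, phi2 = 0.5, 0.6, 0.3

prediction = ar2_forecast(constant, phi1, phi2, last=7.2, second_last=7.5)
print(f"Predicted unemployment rate: {prediction:.2f}%")
```

With these made-up weights, last month's rate contributes twice as much to the forecast as the rate from two months ago.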

ARIMA

By Jim Frost

What is ARIMA?

An ARIMA model, short for AutoRegressive Integrated Moving Average, is a statistical method for modeling and forecasting time series data. It combines three components to model patterns in a dataset over time: autoregression (AR), differencing for stationarity (I), and moving averages (MA).

  • Autoregressive (AR): Models the current value as a function of its past values.
  • Integrated (I): Refers to differencing the data one or more times to remove trends and make the series stationary—meaning its properties stay consistent over time.
  • Moving average (MA): Models the current value as a function of past forecast errors.

ARIMA models are specifically designed to handle time series data, which often violate the assumptions of ordinary linear regression. Unlike regression, ARIMA accounts for autocorrelation, where past values influence future ones, and it can model non-stationary data by applying differencing to remove trends. It also incorporates moving average components to account for patterns in the residuals—something standard regression can’t handle. These capabilities make ARIMA well-suited for forecasting problems where a series of observations has an internal structure shaped by time order effects.

ARIMA models are useful for analyzing and forecasting data where values depend on previous values, such as monthly sales figures, stock prices, or temperature records. They are especially powerful when the data do not follow a clear seasonal pattern (non-seasonal ARIMA). If seasonality is present, analysts often use an extension called SARIMA (Seasonal ARIMA).

ARIMA Model

An ARIMA model is typically written as ARIMA(p, d, q), where:

  • p is the number of autoregressive terms.
  • d is the number of times the data must be differenced to become stationary.
  • q is the number of lagged forecast errors in the model.

An ARIMA(p, d, q) model is like a custom recipe for forecasting a time series.

  • Differencing (d) first removes trend-driven non-stationarity, giving you stable “ingredients.”
  • The autoregressive part (p) folds in lagged values of the series (how yesterday influences today).
  • The moving average part (q) stirs in lagged forecast errors to mop up leftover serial correlation.

The model estimates constant coefficients (ϕ’s and θ’s) that best fit the data. Ideally, the finished model captures the series’ dependence structure and can project it forward to generate forecasts.
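
To show how the three parts fit together, here is a minimal sketch of a one-step-ahead forecast from an ARIMA(1, 1, 1) model whose coefficients are already known. It does not estimate the coefficients. The function name, the example data, and the coefficient values are assumptions for illustration. The sketch also assumes no constant term, a first forecast error of zero, and a plus sign on the moving average term (some textbooks write that term with a minus sign).

```python
def arima_111_forecast(series, phi, theta):
    """One-step-ahead forecast from ARIMA(1, 1, 1) with given coefficients."""
    # d = 1: difference once to remove the trend.
    diffs = [curr - prev for prev, curr in zip(series, series[1:])]

    # Rebuild past forecast errors, assuming the first error is zero.
    errors = [0.0]
    for t in range(1, len(diffs)):
        predicted = phi * diffs[t - 1] + theta * errors[t - 1]
        errors.append(diffs[t] - predicted)

    # p = 1 and q = 1: combine the last change and the last forecast error.
    next_diff = phi * diffs[-1] + theta * errors[-1]

    # Undo the differencing to return to the original scale.
    return series[-1] + next_diff

revenue = [10.0, 12.0, 13.0, 15.0, 16.0]  # hypothetical quarterly revenue
print(arima_111_forecast(revenue, phi=0.5, theta=0.3))
```

Setting both coefficients to zero reduces the forecast to the last observed value, which is a handy sanity check on the differencing step.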

Checking Assumptions

After fitting an ARIMA model, checking the residual plots is crucial. The model must capture all the structure in the data, leaving behind only white noise (i.e., random variation with no patterns) in the residuals. Ideally, the residuals should have no autocorrelation, a constant variance, a mean near zero, and a reasonably normal shape, especially if you plan to use confidence intervals or prediction intervals.

To assess these assumptions, analysts typically inspect residual plots, examine ACF and PACF plots, and run a Ljung–Box test to check for independence. A Q-Q plot can help evaluate the normality of the residuals. If any of these diagnostics show a problem, you may need to adjust the model—such as changing the number of AR or MA terms, applying additional differencing, or using a variance-stabilizing transformation. After modifying the model, you refit it and recheck the diagnostics until the residuals behave appropriately.
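
As a rough sketch of what these diagnostics compute, the code below calculates the sample autocorrelation of a set of residuals and the Ljung–Box Q statistic, Q = n(n + 2) Σ ρ̂ₖ² / (n − k), summed over lags 1 through h. The function names and the residual values are assumptions for illustration. In practice, analysts compare Q to a chi-square distribution to obtain a p-value, a step this sketch leaves out.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    return sum(centered[t] * centered[t + lag] for t in range(n - lag)) / denom

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1 through max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )

# Hypothetical residuals that alternate in sign: clearly not white noise.
patterned = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
print(round(autocorrelation(patterned, 1), 2))
print(round(ljung_box_q(patterned, max_lag=5), 1))
```

Residuals from a well-fitting model should produce autocorrelations near zero and a small Q. The alternating pattern here produces a lag-1 autocorrelation close to −1 and a very large Q, signaling that structure remains in the residuals.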

ARIMA Example

For example, a company might use an ARIMA model to forecast next quarter’s revenue based on past trends and fluctuations in previous quarters. By capturing both trend and autocorrelation, ARIMA models help generate more accurate, data-driven forecasts.

    Copyright © 2026 · Jim Frost · Privacy Policy