Exponential smoothing is a forecasting method for univariate time series data. This method produces forecasts that are weighted averages of past observations where the weights of older observations exponentially decrease. Forms of exponential smoothing extend the analysis to model data with trends and seasonal components.
Statisticians began developing exponential smoothing back in the 1950s. Since then, it has enjoyed a very successful presence among analysts as a quick way to generate accurate forecasts in diverse fields, particularly in industry. It’s also used in signal processing to smooth signals by filtering high-frequency noise.
In this post, I show you how to use various exponential smoothing methods, including those that can model trends and seasonality. These methods include simple, double, and triple (Holt-Winters) exponential smoothing. Additionally, I help you specify parameter values to improve your models. We’ll work through example data sets and make forecasts!
Benefits of Exponential Smoothing
By adjusting parameter values, analysts can change how quickly older observations lose their importance in the calculations. Consequently, analysts can tweak the relative importance of new observations to older observations to meet their subject area’s requirements.
In contrast, the moving average method weights all past observations equally when they fall within the moving average window and gives observations outside the window zero weight. Statisticians refer to exponential smoothing as an ETS model because it models error, trend, and seasonality in time series data. It is a widely used alternative to the Box-Jenkins ARIMA methodology.
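To make the contrast concrete, here is a minimal Python sketch that lists the weight each method places on the most recent observations. The function names and the example values (α = 0.2, a window of 3) are illustrative assumptions, not part of any particular statistics package.

```python
def exponential_weights(alpha, n):
    """Weights on the n most recent observations, newest first.

    The observation k steps back gets weight alpha * (1 - alpha) ** k,
    so each step back shrinks the weight by the same factor.
    """
    return [alpha * (1 - alpha) ** k for k in range(n)]


def moving_average_weights(window, n):
    """Equal weights inside the window and zero outside, newest first."""
    return [1 / window if k < window else 0.0 for k in range(n)]


# Illustrative values only (assumptions for this sketch).
exp_w = exponential_weights(0.2, 6)
ma_w = moving_average_weights(3, 6)
print([round(w, 4) for w in exp_w])
print([round(w, 4) for w in ma_w])
```

The exponential weights fade gradually and never reach exactly zero, while the moving average weights drop to zero abruptly at the edge of the window.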
Related post: Time Series Analysis Introduction
Simple Exponential Smoothing (SES)
Use simple exponential smoothing for univariate time series data that do not have a trend or seasonal cycle. Analysts also refer to it as single exponential smoothing. It’s the simplest form of exponential smoothing and a great place to start!
Simple exponential smoothing estimates only the level component. Think of the level component as the typical value or average. This method updates the level component for each observation. Because it models one component, it uses only one weighting parameter, alpha (α). This value determines the degree of smoothing by changing how quickly the level component adjusts to the most recent data.
Alpha values can range from 0 to 1, inclusive. Lower values produce smoother fitted lines because they give more weight to past observations, averaging out fluctuations over time. Higher values create a more jagged line because they weight recent data more heavily, which reduces the degree of averaging by the older data.
While reacting quickly to changing conditions sounds like a positive attribute, setting an overly high alpha smoothing constant can produce erratic forecasts because the model responds to random fluctuations (noise). Conversely, an alpha that is too low causes a lag between changing conditions and when they impact the forecasts.
A weight of 1 causes the most recent observation to have all the weight, while all previous observations have no impact on the model. Consequently, the forecast values for α = 1 are simply the current value, which analysts refer to as naïve forecasting.
In terms of forecasting, simple exponential smoothing generates a constant set of values. All forecasts equal the last value of the level component. Consequently, these forecasts are appropriate only when your time series data have no trend or seasonality.
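The update rule described above can be sketched in a few lines of Python. This is a simplified illustration rather than what any specific software does: it assumes the level starts at the first observation (packages use various initializations), and the demand values are made up.

```python
def simple_exp_smoothing(series, alpha, horizon=3):
    """Return (fitted levels, flat forecasts) for simple exponential smoothing.

    Assumption: the level is initialized at the first observation.
    """
    level = series[0]
    levels = [level]
    for y in series[1:]:
        # New level = weighted average of the new observation and the old level.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    # Every forecast equals the final level, so the forecasts are constant.
    forecasts = [level] * horizon
    return levels, forecasts


demand = [20.0, 24.0, 19.0, 23.0, 27.0, 22.0]  # hypothetical data
levels, forecasts = simple_exp_smoothing(demand, alpha=0.2)
_, naive = simple_exp_smoothing(demand, alpha=1.0)
print(forecasts)
print(naive)
```

With alpha set to 1, the level simply copies each new observation, so the forecasts equal the last observed value, which is the naïve forecast.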
Use the autocorrelation and partial autocorrelation functions to help you understand the trend, seasonality, and other patterns in your data.
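As a rough illustration of what the autocorrelation function measures, the sketch below computes the sample autocorrelation at a single lag using the standard formula. The function name and the toy series are assumptions for this example. In practice, your statistics software will plot the full function for you.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [y - mean for y in series]
    denominator = sum(d * d for d in deviations)
    # Pair each deviation with the deviation `lag` steps earlier.
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator


# Toy series (hypothetical) that repeats every 3 observations.
toy = [1.0, 2.0, 3.0] * 4
acf_lag1 = autocorrelation(toy, 1)
acf_lag3 = autocorrelation(toy, 3)
print(acf_lag1, acf_lag3)
```

A large positive autocorrelation at a lag equal to the cycle length, as at lag 3 here, is one sign of seasonality.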
Selecting Your Alpha Value
Analysts can use their judgment to select the value for alpha. Typically, you want to smooth the observations to reduce the irregular fluctuations (noise) and capture the underlying pattern. However, you don’t want to smooth too much and lose relevant information! Use subject-area knowledge and industry standards when choosing alpha. A common default value is α = 0.2.
Alternatively, allow your statistics program to optimize the parameter value by estimating it from your data while minimizing the sum of squared errors, similar to regression methodology. Using this method, you’ll obtain the value that best fits the entirety of your dataset. I recommend using the optimization method unless your study area, previous studies, or industry use a specific smoothing value.
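To show the idea behind the optimization, here is a hedged sketch that searches a grid of alpha values and keeps the one with the smallest sum of squared one-step-ahead errors. Real software typically uses a numerical optimizer and its own initialization. The grid search, the function names, and the choice to start the level at the first observation are all assumptions made for this illustration.

```python
def sse_for_alpha(series, alpha):
    """Sum of squared one-step-ahead forecast errors for a given alpha."""
    level = series[0]  # assumption: initialize the level at the first value
    sse = 0.0
    for y in series[1:]:
        error = y - level  # the forecast for this period is the prior level
        sse += error ** 2
        level = alpha * y + (1 - alpha) * level
    return sse


def best_alpha(series, step=0.01):
    """Grid-search alpha over [0, 1] and return the value with the lowest SSE."""
    candidates = [i * step for i in range(int(round(1 / step)) + 1)]
    return min(candidates, key=lambda a: sse_for_alpha(series, a))


demand = [20.0, 24.0, 19.0, 23.0, 27.0, 22.0, 25.0, 21.0]  # hypothetical data
alpha_hat = best_alpha(demand)
print(alpha_hat, sse_for_alpha(demand, alpha_hat))
```

Keep in mind that the best-fitting alpha describes the data you already have. It is worth checking the forecasts against your subject-area knowledge as well.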
Note: Alpha in the exponential smoothing context has no relationship to alpha in hypothesis testing.
Example of Simple Exponential Smoothing
For this example, we’ll use simple exponential smoothing to model the demand for a product. To start, I’ll illustrate how changing alpha affects your results. In the time series plots below, I use an alpha of 0.2 in the top graph and 0.8 in the lower chart.
Download the CSV file that contains all the time series data for the examples in this post: ExponentialSmoothing.
Notice how the time series plot using 0.8 has a more jagged fitted line (red) than the other graph. This line adjusts to the changing conditions more rapidly. Similarly, forecasts from this model place more weight on recent observations than the other model.
Let’s look at the Accuracy Measures. I’ll define these measures in a later post, but lower values represent a better fitting model.
Based on the accuracy measures, the model using an alpha of 0.8 provides a better fit.
Now, let’s allow the software to find the optimized alpha parameter and generate forecasts.
The software estimates that the optimal alpha smoothing constant is 0.834. Unsurprisingly, the accuracy measures are even lower (better) for this model than for the previous two models. The forecasts (green diamonds) are all constant values at the final estimate of the level component. The prediction intervals indicate the uncertainty surrounding the predictions.
As we move on to double and triple exponential smoothing, notice how each method adds components to the model, extending its functionality.
Double Exponential Smoothing (DES)
Double exponential smoothing can model trend components and level components for univariate time series data. Trends are slopes in the data. This method models dynamic gradients because it updates the trend component for each observation. To model trends, DES includes an additional parameter, beta (β*). Double exponential smoothing is also known as Holt’s Method.
As with alpha, beta can be between 0 and 1, inclusive. Higher values place more weight on recent observations, allowing the trend component to react more quickly to changes in the trend.
Forecasts for this method change at a constant rate equal to the final value of the trend component. A popular extension for this method adds a damping component to the forecasts, causing the forecasts to level out over time to avoid overly optimistic long-term forecasts.
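The sketch below shows one way to write the level and trend updates, with an optional damping parameter. It is a simplified illustration under stated assumptions: the level starts at the first observation, the trend starts at the first difference, and the damping parameter (named phi here, following the common damped-trend formulation) defaults to 1, which means no damping. The sales figures are made up.

```python
def holt_forecast(series, alpha, beta, horizon=3, phi=1.0):
    """Forecasts from Holt's linear trend method with optional damping.

    Assumptions: level starts at the first observation and trend at the
    first difference. phi = 1 gives the undamped method.
    """
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        # Update the level, then update the trend from the change in level.
        level = alpha * y + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # equals h when phi = 1
        forecasts.append(level + damp_sum * trend)
    return forecasts


sales = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical, perfectly linear data
linear = holt_forecast(sales, alpha=0.5, beta=0.3)
damped = holt_forecast(sales, alpha=0.5, beta=0.3, phi=0.8)
print(linear)
print(damped)
```

Without damping, each forecast rises by the final trend estimate. With phi below 1, each successive step is smaller than the last, so the forecasts flatten out.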
In the example below, we’re using double exponential smoothing to model monthly computer sales. As you can see in the chart, the time series data have a trend. I’ve allowed the software to estimate the level (0.599) and trend (0.131) smoothing constants from the data to optimize the fit.
The forecasts (green diamonds) increase at a rate equal to the final trend estimate. Notice how the prediction intervals widen with subsequent forecasts.
Note: For unknown reasons, my software creates graphs displaying symbols for the trend and seasonality constants that are not consistent with other sources.
Triple Exponential Smoothing (Holt-Winters Method)
Triple exponential smoothing can model seasonality, trend, and level components for univariate time series data. Seasonal cycles are patterns in the data that repeat over a fixed number of observations. Triple exponential smoothing is also known as Holt-Winters Exponential Smoothing.
This method adds the gamma (γ) parameter to account for the seasonal component. For this method, you must specify the period for the seasonal cycle, which is the number of observations in one full cycle. For example, common periods include 7 for daily data with a weekly cycle, 12 for monthly data with a yearly cycle, and 4 for quarterly data with a yearly cycle.
In triple exponential smoothing, seasonality can be multiplicative or additive. Multiplicative seasonality has a pattern where the magnitude of the seasonal swings increases as the level of the data increases. Additive seasonality reflects a seasonal pattern that has a constant scale even as the observations change.
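Here is a minimal sketch of the additive form, which keeps track of a level, a trend, and one seasonal value for each position in the cycle. The initialization is a simple assumption for illustration (level and seasonal values from the first cycle, trend from the change between the first two cycles), and statistical packages generally use more refined starting values. The function name and the toy data are hypothetical.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Forecasts from additive Holt-Winters smoothing (illustrative sketch)."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal cycles")
    first = series[:period]
    second = series[period:2 * period]
    # Assumed initialization: averages of the first two cycles.
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonals = [y - level for y in first]

    for t in range(period, len(series)):
        y = series[t]
        s_old = seasonals[t % period]
        prev_level, prev_trend = level, trend
        # Remove the seasonal effect before updating the level.
        level = alpha * (y - s_old) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonals[t % period] = (
            gamma * (y - prev_level - prev_trend) + (1 - gamma) * s_old
        )

    n = len(series)
    # Each forecast = level + h steps of trend + the matching seasonal value.
    return [
        level + h * trend + seasonals[(n + h - 1) % period]
        for h in range(1, horizon + 1)
    ]


toy = [1.0, 2.0, 3.0] * 3  # hypothetical data: a cycle of 3, no trend
hw_forecasts = holt_winters_additive(toy, 3, 0.3, 0.1, 0.2, horizon=4)
print(hw_forecasts)
```

For multiplicative seasonality, the seasonal values act as ratios rather than differences, so the seasonal adjustment divides and multiplies instead of subtracting and adding.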
In the example below, we’re using triple exponential smoothing to model daily food sales. The time series plot displays an upward trend and a weekly seasonality.
Notice how the forecasts (green diamonds) increase at a rate equal to the final trend estimate and contain the shape of the data’s seasonality.
When you need to generate forecasts for time series data using an easy-to-use model, consider one of these exponential smoothing methods!
Agbodah Kobina says
very good writeup. Keep it up Jim.
This could be very useful for forecasting COVID-19 cases, hospitalizations, or deaths. Look at some of the graphs at tmc.edu which combines COVID data from several hospitals. The rates appear to have a daily, weekly and perhaps seasonal pattern.
Do you discuss these techniques in any of your books?
Jim Frost says
Currently, no, I don’t discuss them in one of my books. I do plan to write a time series book eventually, and you can bet these methods will be in it, along with a bunch of others!
Hey Jim, your article was very helpful for me. Thank you! One more question:
What to do if you have a multivariate time series?