Time series analysis tracks characteristics of a process at regular time intervals. It’s a fundamental method for understanding how a metric changes over time and forecasting future values. Analysts use time series methods in a wide variety of contexts.
|Business|Manufacturing output, sales, prices|
|Economics|GDP, stock market, and other indicators|
|Social Sciences|Population trends, political changes|
|Medicine|Disease, birth, and death rates|
|Physical Sciences|Climate and environmental changes|
In this post, I cover the basics of time series analysis. I’ll define this type of data, explain what we can learn from it, and touch on more advanced methods that I will explore in future posts. For this introduction, I focus on using time series plots to highlight what you can learn from these data.
Time Series Data
A time series is a set of measurements that occur at regular time intervals. For this type of analysis, you can think of time as the independent variable, and the goal is to model changes in a characteristic (the dependent variable).
For example, you might measure the following:
- Hourly consumption of energy
- Daily sales
- Quarterly profits
- Annual changes in a country’s population
Each of these examples tracks a single metric at regular time points. Use subject-area knowledge to choose the appropriate time interval that allows you to answer your research questions.
Compared to Panel Data
Panel data contain observations of multiple characteristics measured over time for the same set of subjects, such as people, businesses, or countries.
For panel data, you need a time value and an additional characteristic to identify a particular observation. For example, if you have panel data that tracks sales for a group of companies over time, you’ll need a time value and a company identifier to find an individual observation.
Time series data are a special case of the broader class of panel data. A time series tracks a single characteristic for a single subject. Consequently, you only need the time value to identify an observation. For example, in a time series of the annual number of breast cancer cases, you just need to know the year to find the observation.
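To make the difference in lookup keys concrete, here is a minimal Python sketch. The company names and sales figures are invented for illustration and do not come from any real dataset.

```python
# Hypothetical sales figures, invented purely for illustration.
# Panel data: each observation needs a subject identifier AND a time value.
panel_sales = {
    ("Acme", 2021): 120,
    ("Acme", 2022): 135,
    ("Birch", 2021): 80,
    ("Birch", 2022): 95,
}

# Time series: one subject and one metric, so the time value alone is the key.
acme_sales = {
    year: value
    for (company, year), value in panel_sales.items()
    if company == "Acme"
}

print(panel_sales[("Birch", 2022)])  # lookup needs company and year
print(acme_sales[2022])              # lookup needs only the year
```

Filtering the panel down to one company is what turns it into a single time series.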
Compared to Cross-Sectional Data
Cross-sectional data describe a set of people, items, companies, etc. at a single point in time. The goal is to determine the differences between the subjects at that one time. For example, a cross-sectional study might assess wages by education level to understand the impact of education. Time does not play a role in this type of analysis.
Related post: Guide to Data Types and How to Graph Them
Goals of Time Series Analysis
Time series analysis seeks to understand patterns in changes over time. Statisticians refer to these patterns as the components of a time series, which include trends, cycles, and irregular movements. When these components exist in a time series, the model must account for them to generate accurate forecasts of quantities such as future sales, GDP, and global temperatures.
In addition to these patterns, time series models typically incorporate the fact that time flows in one direction. Past events can influence future observations but not the other way around. Additionally, events close together in time often have a stronger association than more distant observations. While these ideas are obvious to us, statisticians had to build them into how these models work.
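One common way to quantify the idea that nearby observations are more strongly associated is the sample autocorrelation at different lags. The sketch below is a plain-Python illustration on made-up numbers, not the method of any particular statistics package. The function name `autocorrelation` and the data are assumptions for the example.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total squared deviation from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Numerator: co-movement between each value and the value `lag` steps earlier.
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, n)
    )
    return numer / denom


# Made-up, slowly drifting series (illustrative values only).
values = [10, 11, 13, 12, 14, 16, 15, 17, 19, 18, 20, 22]

r1 = autocorrelation(values, 1)  # neighbors one step apart
r5 = autocorrelation(values, 5)  # observations five steps apart
print(round(r1, 3), round(r5, 3))
```

For this series, the lag-1 value is larger than the lag-5 value, which matches the intuition that observations close together in time tend to be more alike.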
Like all data, time series data contain random fluctuations. This randomness can obscure the underlying patterns. Smoothing techniques dampen these fluctuations to reveal the trends and cycles more clearly.
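As a small taste of smoothing, here is a sketch of a simple moving average, one of the most basic smoothing techniques. The helper `moving_average`, the window size, and the data are assumptions chosen for illustration. Other techniques may suit your data better.

```python
def moving_average(series, window):
    """Simple moving average; returns len(series) - window + 1 values."""
    return [
        sum(series[i : i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Made-up series that zigzags from point to point while drifting upward.
noisy = [5, 9, 4, 10, 6, 11, 7, 12, 8, 13]

smoothed = moving_average(noisy, window=4)
print(smoothed)
```

The raw values bounce up and down at every step, while the averaged values rise steadily, making the upward drift easier to see. A wider window smooths more but also reacts more slowly to real changes.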
In other posts, I cover the modeling and smoothing techniques.
Using Graphs to Understand the Components of a Time Series
For this introductory post, we’ll stick with the simple time series plot, and save the smoothing and modeling for later posts. Despite their simplicity, these graphs effectively illustrate how metrics change over time.
You can improve your understanding of the following components by assessing additional graphs, such as the autocorrelation and partial autocorrelation functions.
Time series plots are a specialized type of line chart.
A trend is the long-term tendency of a time series to either increase or decrease. For example, the two plots below show trends over time.
At a glance, we can determine that air pollution deaths are decreasing over the years. On the other hand, cases of breast cancer in men and women are increasing over the years. If we need to generate forecasts for future years, our model would include these trends.
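If you want a number to go with the visual impression, one simple option is to fit a straight line to the series and look at the sign and size of its slope. The sketch below uses ordinary least squares in plain Python. The counts are invented and are not the air pollution or breast cancer data from the plots. A straight line is only one possible trend model.

```python
def linear_trend(values):
    """Least squares fit of values against time steps 0, 1, 2, ..."""
    n = len(values)
    times = range(n)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    # Slope = covariance of time and value divided by variance of time.
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    s_tt = sum((t - t_mean) ** 2 for t in times)
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Hypothetical annual counts (illustration only).
annual_counts = [100, 104, 103, 109, 112, 115]

slope, intercept = linear_trend(annual_counts)
# Extend the fitted line one step past the last observation.
next_year_forecast = intercept + slope * len(annual_counts)
print(slope, next_year_forecast)
```

A positive slope indicates an increasing trend and a negative slope a decreasing one. Extending the fitted line forward gives a naive forecast that accounts for the trend and nothing else.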
Seasonal cycles are patterns that repeat within a time series. Typically, they have a fixed length. Understanding seasonal cycles requires using your subject-area knowledge. For example, an analyst might note a weekly cycle where weekends tend to have higher retail sales than weekdays due to a higher number of store patrons. Alternatively, sales of summer products will tend to peak in the summer months and decline in the other months. To model these data accurately, analysts need to understand these cycles.
The examples below illustrate two datasets with seasonal cycles.
In the plot above, food production by month has a repeating cycle. However, in these data, there is no overall trend. The cycle repeats but there is no long-term tendency. To generate forecasts for these data, our model would need to account for the cycle but not worry about a longer-term increase or decrease.
The graph for trade activity by month displays both a seasonal cycle and a long-term trend. There’s a repeating pattern and a tendency to increase over time. For these data, our model would need to account for both components.
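To see how a trend and a seasonal cycle can be pulled apart, here is a rough sketch on made-up quarterly data. It assumes the cycle length is known (four quarters) and that the series contains only complete cycles. The values and variable names are hypothetical, and this is a simplification rather than a full decomposition method.

```python
PERIOD = 4  # assumed cycle length: four quarters per year

# Made-up quarterly values for three years (illustration only).
trade = [10, 14, 18, 14,
         14, 18, 22, 18,
         18, 22, 26, 22]

# Split the series into complete cycles (one list per year).
years = [trade[i : i + PERIOD] for i in range(0, len(trade), PERIOD)]

# Trend: the average level of each cycle.
yearly_means = [sum(year) / PERIOD for year in years]

# Seasonal cycle: average deviation of each quarter from its own year's mean.
seasonal_effects = [
    sum(year[q] - mean for year, mean in zip(years, yearly_means)) / len(years)
    for q in range(PERIOD)
]

print(yearly_means)      # rising values point to a long-term increase
print(seasonal_effects)  # the within-year pattern that repeats each cycle
```

In a series with a seasonal cycle but no trend, the yearly means would stay roughly flat while the seasonal effects would still show the repeating pattern.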
While time series plots are straightforward, they can yield a great deal of information about how a metric changes over time.