Time series analysis tracks characteristics of a process at regular time intervals. It’s a fundamental method for understanding how a metric changes over time and forecasting future values. Analysts use time series methods in a wide variety of contexts.
| Area | Examples |
|---|---|
| Business | Manufacturing output, sales, prices |
| Economics | GDP, stock market, and other indicators |
| Social Sciences | Population trends, political changes |
| Medicine | Disease, birth, and death rates |
| Physical Sciences | Climate and environmental changes |
In this post, I cover the basics of time series analysis. I’ll define this type of data, explain what we can learn from it, and touch on more advanced methods that I will explore in future posts. For this introduction, I focus on using time series plots to highlight what you can learn from these data.
Time Series Data
A time series is a set of measurements that occur at regular time intervals. For this type of analysis, you can think of time as the independent variable, and the goal is to model changes in a characteristic (the dependent variable).
For example, you might measure the following:
- Hourly consumption of energy
- Daily sales
- Quarterly profits
- Annual changes in a country’s population
Each of these examples tracks a single metric at regular time points. Use subject-area knowledge to choose the appropriate time interval that allows you to answer your research questions.
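To make the "regular time points" idea concrete, here is a minimal sketch that checks whether a series is evenly spaced. The daily sales numbers and the `is_regular` helper are made up for illustration; they are not from any dataset in this post.

```python
from datetime import date, timedelta

# Hypothetical daily sales figures (made-up numbers for illustration only).
start = date(2024, 1, 1)
daily_sales = [(start + timedelta(days=i), value)
               for i, value in enumerate([120, 135, 128, 150, 161, 190, 185])]


def is_regular(series):
    """Return True if consecutive time points are separated by one constant gap."""
    times = [t for t, _ in series]
    gaps = {later - earlier for earlier, later in zip(times, times[1:])}
    return len(gaps) == 1


print(is_regular(daily_sales))  # every gap is exactly one day

# Dropping one observation breaks the regular spacing.
with_gap = daily_sales[:3] + daily_sales[4:]
print(is_regular(with_gap))  # one gap is now two days
```

A check like this is a quick way to spot skipped measurements before you start plotting or modeling.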
Compared to Panel Data
Panel data contain observations of multiple characteristics measured over time for the same set of subjects, such as people, businesses, or countries.
For panel data, you need a time value and an additional characteristic to identify a particular observation. For example, if you have panel data that tracks sales for a group of companies over time, you’ll need a time value and a company identifier to find an individual observation.
Time series data are a sub-type of the broader class of panel data. These series track only a single characteristic for a single subject. Consequently, you only need the time value. For example, in a time series of the annual number of breast cancer cases, you just need to know the year to identify the observation.
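The difference in how you look up an observation can be sketched with plain dictionaries. The company names and all the numbers below are hypothetical; the point is only that panel data need two keys while a time series needs one.

```python
# Hypothetical panel data: sales by company and year (made-up numbers).
# Each observation needs two keys: the company identifier and the time value.
panel_sales = {
    ("Acme", 2021): 410,
    ("Acme", 2022): 455,
    ("Birch", 2021): 230,
    ("Birch", 2022): 260,
}

# Hypothetical time series: one metric tracked annually,
# so the year alone identifies an observation.
annual_cases = {2019: 51, 2020: 54, 2021: 58, 2022: 61}

print(panel_sales[("Birch", 2022)])  # company + year
print(annual_cases[2021])            # year only

# Fixing one subject in the panel leaves a single time series.
acme_series = {year: value
               for (company, year), value in panel_sales.items()
               if company == "Acme"}
print(acme_series)
```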
Compared to Cross-Sectional Data
Cross-sectional data describe a set of people, items, companies, etc. at a single point in time. The goal is to determine the differences between the subjects at that one time. For example, a cross-sectional study might assess wages by education level to understand the impact of education. Time does not play a role in this type of analysis.
Related post: Guide to Data Types and How to Graph Them
Goals of Time Series Analysis
Time series analysis seeks to understand patterns in changes over time. Statisticians refer to these patterns as the components of a time series and they include trends, cycles, and irregular movements. When these components exist in a time series, the model must account for these patterns to generate accurate forecasts, such as future sales, GDP, and global temperatures.
In addition to these patterns, time series models typically incorporate the fact that time flows in one direction. Past events can influence future observations but not the other way around. Additionally, events close together in time often have a stronger association than more distant observations. While these ideas are obvious to us, statisticians had to build them into how these models work.
Like all data, time series data contain random fluctuations. This randomness can obscure the underlying patterns. Smoothing techniques cancel out these fluctuations to more clearly unveil the trends and cycles.
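As a rough illustration of smoothing, here is a minimal sketch of a centered moving average, one common smoothing technique. The function name and the noisy values are my own assumptions for the example, not output from a real dataset.

```python
def centered_moving_average(values, window):
    """Average each point with its neighbors; `window` must be a positive odd number."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    # Points within `half` positions of either end lack a full window and are skipped.
    for i in range(half, len(values) - half):
        chunk = values[i - half:i + half + 1]
        smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical noisy series with an underlying upward drift.
noisy = [10, 14, 9, 15, 12, 18, 13, 19, 16, 22]
smoothed = centered_moving_average(noisy, 3)
print(smoothed)
```

Note that the smoothed series is shorter than the original because the first and last points do not have neighbors on both sides. Wider windows smooth more but lose more points at the ends.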
In other posts, I cover the modeling and smoothing techniques.
Related posts: Using Moving Averages to Smooth Time Series Data and Exponential Smoothing for Time Series Forecasting
Using Graphs to Understand the Components of a Time Series
For this introductory post, we’ll stick with the simple time series plot, and save the smoothing and modeling for later posts. Despite their simplicity, these graphs effectively illustrate how metrics change over time.
You can improve your understanding of the following components by assessing additional graphs, such as the autocorrelation and partial autocorrelation functions.
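For readers curious about what an autocorrelation function measures, here is a minimal sketch of the usual sample autocorrelation at a single lag. It correlates the series with a copy of itself shifted by `lag` time points. The series is a made-up example with a repeating cycle of length four.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation at `lag`: lagged covariance divided by the overall variance."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[t] - mean) * (values[t + lag] - mean)
                for t in range(n - lag))
    return numer / denom


# Hypothetical series that repeats every four time points.
series = [5, 8, 5, 2] * 6

for lag in range(5):
    print(lag, round(autocorrelation(series, lag), 3))
```

In this toy series, the autocorrelation is strongly positive at lag 4 (the cycle length) and strongly negative at lag 2 (half a cycle), which is the kind of signature that helps reveal cycles that are hard to see in a raw plot.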
Time series plots are a specialized type of line chart.
Trends
A trend is the long-term tendency of a time series to either increase or decrease. For example, the two plots below show trends over time.
At a glance, we can determine that air pollution deaths are decreasing over the years. On the other hand, cases of breast cancer in men and women are increasing over the years. If we need to generate forecasts for future years, our model would include these trends.
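One simple way a model can include a trend is to fit a straight line to the series and extend it. The sketch below uses ordinary least squares with time coded as 0, 1, 2, and so on. The annual counts are hypothetical, and a straight line is an assumption that real trends do not always satisfy.

```python
def fit_linear_trend(values):
    """Least squares fit of value = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


# Hypothetical annual counts with an upward trend (made-up numbers).
annual_counts = [100, 104, 109, 111, 118, 121]
intercept, slope = fit_linear_trend(annual_counts)

# Extend the fitted line one step past the data for a naive forecast.
next_year_forecast = intercept + slope * len(annual_counts)
print(round(slope, 2), round(next_year_forecast, 1))
```

A positive slope corresponds to an increasing trend and a negative slope to a decreasing one.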
Seasonal Cycles
Seasonal cycles are patterns that repeat within a time series. Typically, they have a fixed length. Understanding seasonal cycles requires using your subject-area knowledge. For example, an analyst might note a weekly cycle where weekends tend to have higher retail sales than weekdays due to a higher number of store patrons. Alternatively, sales of summer products will tend to peak in the summer months and decline in the other months. To model these data accurately, analysts need to understand these cycles.
The examples below illustrate two datasets with seasonal cycles.
In the plot above, food production by month has a repeating cycle. However, in these data, there is no overall trend. The cycle repeats, but there is no long-term tendency. To generate forecasts for these data, our model would need to account for the cycle but not worry about a longer-term increase or decrease.
The graph for trade activity by month displays both a seasonal cycle and a long-term trend. There’s a repeating pattern and a tendency to increase over time. For these data, our model would need to account for both components.
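To show what accounting for both components can look like, here is a simplified sketch of an additive decomposition: fit a linear trend, subtract it, and then average what is left at each position in the cycle. This is a bare-bones illustration under the assumption of a linear trend and a fixed cycle length, not a full decomposition method. The quarterly series is constructed from made-up numbers.

```python
def linear_trend(values):
    """Least squares intercept and slope with time coded as 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return y_mean - slope * t_mean, slope


def seasonal_effects(values, period):
    """Average detrended value at each position within the cycle."""
    intercept, slope = linear_trend(values)
    detrended = [y - (intercept + slope * t) for t, y in enumerate(values)]
    effects = []
    for pos in range(period):
        same_position = detrended[pos::period]  # e.g. every first quarter
        effects.append(sum(same_position) / len(same_position))
    return effects


# Hypothetical quarterly series: rises by 2 per quarter plus a repeating pattern.
pattern = [5, -3, -6, 4]
quarterly = [50 + 2 * t + pattern[t % 4] for t in range(16)]

effects = seasonal_effects(quarterly, period=4)
print([round(e, 2) for e in effects])
```

The estimated effects land close to the pattern used to build the series, but not exactly on it, because the fitted trend line absorbs a little of the seasonal variation in a short series.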
While time series plots are straightforward, they can yield a great deal of information about how a metric changes over time.
Saumya says
Hi Jim, I have learnt a lot from these posts of yours. I have a question. Say you have to choose the frequency of the time series to use for your linear regression model (daily, weekly, or monthly). Would you say your choice can affect the performance of your model? What is the relevance of the time granularity of time series data to the performance of, say, a multiple linear regression model?
Berns Buenaobra says
What would also be interesting is whether there is some way to extract a not-so-obvious seasonality when the data are obscured by random noise.
Petra Valickova says
Hi Jim, I wish I had known you before!
One question, thinking about a classical growth regression… In this case a classical growth regression augmented for a measure of financial development (G = α+ βF+Xγ+ δ+η+ϵ), where G stands for GDP growth rate or per capita GDP growth rate, F represents a measure of financial development (some measure of the size or activity of financial intermediaries or some measure of stock market development) and X is a vector of control variables to account for other factors considered important in the growth process. Regarding model specification, I’ve seen some researchers using nominal GDP growth rate – i.e. these researchers did not adjust for inflation to derive real GDP growth rate. Can you think about a reason why this is the case? Could it be for example if you have the measure of financial development also in nominal terms, then your dependent variable should also be in nominal terms (i.e. growth rate of nominal GDP)? Or maybe for some econometric techniques nominal terms are needed?
Thank you!!
Petra
Jim Frost says
Hi Petra,
That is a bit puzzling. Perhaps they incorporate inflation into the model using some other variable? Economic analysis isn’t my area of expertise, so I don’t have much insight.
Khyati says
Hey Jim, great article. I love how you explain difficult concepts in a lucid manner.
Could you please explain Time series models such as AR, MA, ARMA, ARCH, GARCH and ARIMA?
Would be really helpful, thanks.
Jim Frost says
Hi Khyati,
Thanks so much! I will explain the other time series models. I’m building up the pieces as I go forward. Note that I do explain MA (moving average). Stay tuned for the other types of models!
Tony says
Hi Jim,
Yes, it helps. I appreciate it.
Thanks,
Tony
jeremy says
Hi Jim, what if you’re measuring, say, cholesterol levels in the same group of people over time, but some of them have missing data at some observation points because they didn’t attend their appointments. With some methods (like ANOVA I believe) you’d have to throw out all the observations for persons with any missing data. Would missing data be a problem with the methods you describe here?
Jim Frost says
Hi Jeremy,
Yes, you can’t have any missing values for these methods. However, if you don’t have too many missing, you can use moving averages or exponential smoothing to estimate the missing values. It’s not ideal but it can help! There are even some specialized missing value imputation methods that you can use. I haven’t used those specifically but I have used averages and even regression analysis to estimate missing values.
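As a rough illustration of the averaging idea in this reply, here is a minimal sketch that fills each missing value with the mean of the observed values near it. The function name, the `half_window` parameter, and the readings are all hypothetical, and this is only one simple option among the imputation methods mentioned.

```python
def fill_missing_with_local_mean(values, half_window=1):
    """Replace each None with the mean of observed values within `half_window` positions."""
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            lo = max(0, i - half_window)
            hi = min(len(values), i + half_window + 1)
            neighbors = [x for x in values[lo:hi] if x is not None]
            if neighbors:  # leave the gap as None if nothing nearby was observed
                filled[i] = sum(neighbors) / len(neighbors)
    return filled


# Hypothetical readings with two missed appointments.
readings = [200, 204, None, 210, 207, None, 215]
filled = fill_missing_with_local_mean(readings)
print(filled)
```

This works best when only a few values are missing and the gaps are isolated. Long runs of missing values would need a wider window or a different method.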
Tony says
Hi Jim,
This was a great read. Could exponential smoothing or a regression analysis be used to forecast if a client is going to default on invoice payment terms? If so, how would the data be set up for both exponential smoothing and a regression analysis?
Jim Frost says
Hi Tony,
That sounds like a binary logistic regression model to me. You’d use default/non-default as the binary dependent variable. Then include whatever independent variables you have, allowing you to predict the probability that a client will default. For exponential smoothing, you could track defaults over time and predict the number of defaults overall, but it wouldn’t be client specific. The type of exponential smoothing depends on the characteristics of the data, as I show in this post.
I hope that helps!
Priyalal says
Please also include practical examples of EViews software analysis for time series models (ARIMA, GARCH, VAR, and VEC).
Perry Gonen says
I second everyone who requested a book on time series. ARIMA, GARCH, EGARCH. Dynamic models and Stationary, AR, ARDL model, VAR and VEC models
Jim Frost says
Hi Perry, thanks for your enthusiasm! I think it’s safe to say that I’ll be writing a time series book covering a number of those models.
Jerry says
I second that emotion !
Jacob Willie says
Great post, thanks!
Oladayo says
Thanks Jim. Very helpful, I got wide knowledge now.
Thin Thin Swe says
Thank you. Nice post!!
hayet says
Hello. Regarding daily series and the ARCH and GARCH models, and their application in the EViews software.
Ramesh Chandra Das says
Nice post!! Kindly write a time series book using ARIMA, GARCH, EGARCH. It would be more beneficial for us.
Janine Zitianellis says
I second that!
Fedor Tarasov says
Hi, can you please publish a book on time series analysis? ARIMA, GARCH, etc.
Otieno Joyce Akinyi says
Informative post. Looking forward to more
Otieno Joyce Akinyi says
I think in time series data, the characteristic is measured at equal time gaps, while in longitudinal analysis you have repeated measures of the characteristic.
Carlos Mora says
What software do you recommend for time series analysis? I use Statgraphics Centurion because it has an interactive interface.
Jane says
Hi, Jim,
I am often asked about using time series vs. longitudinal analysis when there are measurements over time (maybe 3-6 measurements). I was taught that time series data usually covered many measurements over a short period, as in stock market data that may change minute by minute or day by day so that trends can be forecast over a few weeks or months, while longitudinal analysis was more appropriate for fewer measurements over a longer period (growth, for example). Your examples extend over 5 years. Obviously my understanding is not right. How would you distinguish the two approaches? Thanks in advance for straightening me out on that question.
Guruprasad says
Can you bring out a book on analysing time series? It would be a great help.
Ernest Ng'ang'a says
Great post. How do you design a data collection sheet to allow one to collect a time series data?