Introduction to Time Series Forecasting in Python
Introduction
Time series data is data collected on the same subject at successive points in time, such as a country’s annual GDP, a company’s stock price, or your own heartbeat recorded every second. In fact, anything you can measure repeatedly at successive time intervals forms a time series.
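As a minimal sketch of how this looks in code, a time series can be represented in pandas as a Series whose index is a sequence of timestamps. The heart-rate readings below are made up for illustration, and the variable names are assumptions rather than part of any real dataset.

```python
import pandas as pd

# Hypothetical heart-rate readings (beats per minute), one per second.
timestamps = pd.date_range(
    start="2021-04-12 09:00:00",
    periods=5,
    freq=pd.Timedelta(seconds=1),
)
heart_rate = pd.Series([72, 74, 73, 75, 76], index=timestamps, name="bpm")

# Each value is tied to a timestamp, and the timestamps are in order.
print(heart_rate)
print(heart_rate.index.is_monotonic_increasing)
```

The time-ordered index is what separates this from an ordinary list of numbers: the order of the observations carries information.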
The chart below shows the daily stock price of Tesla Inc. (ticker symbol: TSLA) over the past year as an example of time series data. The y-axis on the right-hand side shows the value in US$; the last point on the chart, $701.91, is the stock price at the time of writing, April 12, 2021.
Cross-sectional data, on the other hand, refers to datasets that hold information about many subjects at a single point or period in time, such as customer information, product information, company information, and so on.
The chart below shows an example: America’s best-selling electric cars in the first half of 2020. Rather than tracking the number of cars sold over time, it compares the sales of different makes such as Tesla, Chevy, and Nissan over the same period.
The distinction between cross-sectional and time-series data is easy to spot because the analytic goals differ. In the first example, we were interested in following Tesla’s stock price through time; in the second, we wanted to compare different companies within the same period, i.e. the first half of 2020.
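The same contrast can be sketched with two small pandas objects. All numbers here are placeholders invented for illustration, not actual prices or sales figures: the first object follows one subject across several dates, while the second compares several subjects within a single period.

```python
import pandas as pd

# Time series: one subject (a single stock), several points in time.
# Prices are placeholder values, not real TSLA quotes.
dates = pd.to_datetime(["2021-04-08", "2021-04-09", "2021-04-12"])
stock_price = pd.Series([100.0, 102.5, 101.0], index=dates, name="price_usd")

# Cross-section: several subjects, one period.
# Unit counts are placeholder values, not real sales data.
units_sold = pd.Series(
    {"Tesla": 300, "Chevy": 200, "Nissan": 100}, name="units_h1_2020"
)

# Time-series question: how did the value change from the first date to the last?
change = stock_price.iloc[-1] - stock_price.iloc[0]

# Cross-sectional question: which subject ranks highest within the period?
top_seller = units_sold.idxmax()

print(change, top_seller)
```

Note that the time series is indexed by date and the cross-section by subject, so the index alone usually shows which kind of data you are holding.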
A typical real-world dataset, however, is likely to be a hybrid. Consider a retailer such as Walmart, which sells thousands of products each day. Looking at sales by product on a particular day, for example to find the best-selling item on Christmas Eve, is a cross-sectional analysis. In contrast, if you want to know the sales of a single item, such as the PS4, over a…