Time Series Analysis Lecture 3
Summary
This document is a lecture on time series analysis focusing on seasonality modeling and forecasting. It details the nature of seasonality, including deterministic and stochastic components, and methods to model seasonal patterns using dummy variables. The lecture also explores applications like housing starts, highlighting seasonal effects and their impact on forecasting.
Full Transcript
EIE3002/EII3002 Time Series Analysis
Lecture 3: Modeling and forecasting seasonality

1. The Nature and Sources of Seasonality

A seasonal pattern is one that repeats itself every year. The annual repetition can be exact, in which case we speak of deterministic seasonality, or approximate, in which case we speak of stochastic seasonality. In this topic, we focus on deterministic seasonality.

Seasonality arises from links of technologies, preferences, and institutions to the calendar. The weather is a trivial but very important seasonal series, as it's always hotter in the summer than in the winter. Any technology that involves the weather, such as production of agricultural commodities, is likely to be seasonal as well.

Preferences may also be linked to the calendar. An example is gasoline sales, shown in Figure 6.1, which plots monthly U.S. current-dollar gasoline sales, Jan 1980 - Jan 1992. People want to do more vacation travel in the summer, which tends to increase both the price and the quantity of summertime gasoline sales, both of which feed into higher current-dollar sales.

Figure 6.1: Gasoline Sales, Jan 1980 - Jan 1992

Finally, social institutions that are linked to the calendar, such as holidays, are responsible for seasonal variation in a variety of series. Purchases of retail goods skyrocket, for example, every Christmas season in some western countries. Figure 6.2 shows monthly U.S. current-dollar liquor sales, Jan 1980 - Jan 1992, which are very high in November and December.

Figure 6.2: Liquor Sales, Jan 1980 - Jan 1992

Another example is sales of durable goods, which fall in December, as holiday purchases tend to be nondurables. This is shown in Figure 6.3, which plots monthly U.S. current-dollar durable goods sales, Jan 1980 - Jan 1992.

Figure 6.3: Durable Goods Sales, Jan 1980 - Jan 1992

One way to deal with seasonality in a series is simply to remove it and then to model and forecast the seasonally adjusted time series. This strategy is appropriate in certain situations, such as when interest centers explicitly on forecasting non-seasonal fluctuations. However, seasonal adjustment is often inappropriate in business forecasting situations, precisely because interest typically centers on forecasting all the variation in a series, not just the non-seasonal part.

2. Modeling Seasonality

A key technique for modeling seasonality is regression on seasonal dummies. Let s be the number of seasons in a year. Normally we'd think of four seasons in a year, but that notion is too restrictive for our purposes. Instead, think of s as the number of observations on a series in each year. Thus s = 4 if we have quarterly data, s = 12 if we have monthly data, s = 52 if we have weekly data, and so forth.

Let's construct seasonal dummy variables, which indicate which season we're in. If, for example, there are four seasons, we create

D1 = (1,0,0,0, 1,0,0,0, 1,0,0,0, ...)
D2 = (0,1,0,0, 0,1,0,0, 0,1,0,0, ...)
D3 = (0,0,1,0, 0,0,1,0, 0,0,1,0, ...)
D4 = (0,0,0,1, 0,0,0,1, 0,0,0,1, ...)

D1 indicates whether we're in the first quarter (it's 1 in the first quarter and 0 otherwise), D2 indicates whether we're in the second quarter (it's 1 in the second quarter and 0 otherwise), and so on.
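To make the construction concrete, here is a minimal sketch (not from the lecture) of how quarterly seasonal dummies like D1-D4 could be built with pandas; the date range and variable names are illustrative assumptions.

```python
# Sketch: build seasonal dummies D1..D4 for quarterly data (illustrative only).
import pandas as pd

# Hypothetical quarterly date index covering three years.
dates = pd.period_range("2020Q1", periods=12, freq="Q")
quarter = dates.quarter  # 1, 2, 3, 4, 1, 2, ...

# One dummy per season: D_i equals 1 in season i and 0 otherwise.
dummies = pd.DataFrame(
    {f"D{i}": (quarter == i).astype(int) for i in range(1, 5)},
    index=dates.to_timestamp(),
)
print(dummies.head(8))
```

The same pattern extends to monthly data (s = 12) or weekly data (s = 52) by replacing the quarter index with the month or week number.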
The pure seasonal dummy model is

y_t = \sum_{i=1}^{s} \gamma_i D_{it} + \varepsilon_t

Effectively, we're just regressing on an intercept, but we allow for a different intercept in each season; these seasonal intercepts, the \gamma_i, are called the seasonal factors. In the absence of seasonality, the \gamma_i are all the same, so we can drop all the seasonal dummies and instead simply include an intercept in the usual way.

Instead of including a full set of s seasonal dummies, we can include any (s - 1) seasonal dummies and an intercept. Then the constant term is the intercept for the omitted season, and the coefficients on the seasonal dummies give the seasonal increase or decrease relative to the omitted season. However, we should not include s seasonal dummies and an intercept: because the s dummies sum to 1 in every period, inclusion of an intercept and a full set of s seasonal dummies produces perfect multicollinearity.

Trend may be included as well, and it could be linear, quadratic, exponential, and so on. For example, the model with a linear trend is

y_t = \beta_1 t + \sum_{i=1}^{s} \gamma_i D_{it} + \varepsilon_t

Here we want our model to account for trend, if it's present, while also accounting for seasonality.

The idea of seasonality may be extended to allow for more general calendar effects. Standard seasonality is just one type of calendar effect; two additional calendar effects are holiday variation and trading-day variation.

Holiday variation refers to the fact that the dates of some holidays change over time. Easter, for example, arrives at approximately the same time each year, but the exact date differs. The behaviour of many series, such as sales, shipments, inventories, and so on, depends in part on the timing of such holidays, so we may want to keep track of them in our forecasting models. Holiday effects may be handled with dummy variables. In a monthly model, for example, in addition to a full set of seasonal dummies, we might include an "Easter dummy", which is 1 if the month contains Easter and 0 otherwise.

Trading-day variation refers to the fact that different months contain different numbers of trading days or business days, which is an important consideration when modeling and forecasting certain series. For example, in a monthly forecasting model of volume traded on the London Stock Exchange, in addition to a full set of seasonal dummies, we might include a trading-day variable, whose value each month is the number of trading days that month.

Inclusion of holiday and trading-day variation gives the complete model

y_t = \beta_1 t + \sum_{i=1}^{s} \gamma_i D_{it} + \sum_{i=1}^{v_1} \delta_i^{HD} HDV_{it} + \sum_{i=1}^{v_2} \delta_i^{TD} TDV_{it} + \varepsilon_t

where the HDVs are the relevant holiday variables and the TDVs are the relevant trading-day variables. In most applications, v_2 = 1 will be adequate. This is a standard regression equation and can be estimated by ordinary least squares.

3. Forecasting Seasonal Series

The full model is

y_t = \beta_1 t + \sum_{i=1}^{s} \gamma_i D_{it} + \sum_{i=1}^{v_1} \delta_i^{HD} HDV_{it} + \sum_{i=1}^{v_2} \delta_i^{TD} TDV_{it} + \varepsilon_t

so that at time T + h (where T is the sample size),

y_{T+h} = \beta_1 (T+h) + \sum_{i=1}^{s} \gamma_i D_{i,T+h} + \sum_{i=1}^{v_1} \delta_i^{HD} HDV_{i,T+h} + \sum_{i=1}^{v_2} \delta_i^{TD} TDV_{i,T+h} + \varepsilon_{T+h}

Point forecast:

\hat{y}_{T+h|T} = \hat{\beta}_1 (T+h) + \sum_{i=1}^{s} \hat{\gamma}_i D_{i,T+h} + \sum_{i=1}^{v_1} \hat{\delta}_i^{HD} HDV_{i,T+h} + \sum_{i=1}^{v_2} \hat{\delta}_i^{TD} TDV_{i,T+h}

95% interval forecast:

\hat{y}_{T+h|T} \pm 1.96 \hat{\sigma}

where \hat{\sigma} is the forecast standard error, under the assumption that \varepsilon_t \sim N(0, \sigma^2).

4. Application: Forecasting Housing Starts

Figure 6.4 shows monthly data on U.S. housing starts between Jan 1946 and Nov 1994. The series is seasonal because it's usually preferable to start houses in the spring, so that they're completed before winter arrives. We'll use Jan 1946 - Dec 1993 for estimation and Jan 1994 - Nov 1994 for out-of-sample forecasting.
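As an illustration of the estimation and forecasting steps above, the following sketch (not part of the lecture; the simulated data, coefficient values, and variable names are assumptions) fits a linear trend plus a full set of monthly seasonal dummies by OLS with statsmodels and forms point forecasts with the plus-or-minus 1.96 sigma-hat interval of Section 3. Holiday and trading-day variables are omitted for brevity.

```python
# Sketch: trend + seasonal-dummy regression and h-step forecasts (illustrative data).
import numpy as np
import statsmodels.api as sm

s, n_years = 12, 10
T = s * n_years
rng = np.random.default_rng(0)

# Simulated monthly series: linear trend + seasonal factors + noise (values assumed).
t = np.arange(1, T + 1)
season = np.tile(np.arange(s), n_years)                 # month index 0..11, repeated
gamma = 10 + 5 * np.sin(2 * np.pi * np.arange(s) / s)   # "true" seasonal factors
y = 0.05 * t + gamma[season] + rng.normal(0, 1, T)

# Regressors: linear trend plus a full set of s seasonal dummies (no separate
# intercept, to avoid the perfect multicollinearity discussed above).
D = np.zeros((T, s))
D[np.arange(T), season] = 1.0
X = np.column_stack([t, D])
fit = sm.OLS(y, X).fit()

# Point and 95% interval forecasts for h = 1..12 steps ahead.
h = np.arange(1, s + 1)
season_future = (season[-1] + h) % s
D_future = np.zeros((s, s))
D_future[np.arange(s), season_future] = 1.0
X_future = np.column_stack([T + h, D_future])
point = fit.predict(X_future)
sigma_hat = np.sqrt(fit.scale)                           # residual standard error
print(np.column_stack([point, point - 1.96 * sigma_hat, point + 1.96 * sigma_hat]))
```

As in the lecture's formula, the interval uses only the estimated disturbance standard deviation and ignores parameter-estimation uncertainty.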
Figure 6.4: Housing Starts, 1946.01-1994.11

Here we zoom in on Jan 1990 - Nov 1994 to inspect the seasonal pattern.

Figure 6.5: Housing Starts, 1990.01-1994.11

The figures reveal that there is no trend, so it is adequate to model the series using the pure seasonal model,

y_t = \sum_{i=1}^{s} \gamma_i D_{it} + \varepsilon_t

Table 6.1 (Regression results: seasonal dummy variable model, housing starts) shows the estimation results. The 12 seasonal dummies account for more than a third of the variation in housing starts, as R^2 = 0.38. At least some of the remaining variation is cyclical, which the model is not designed to capture. Notice also that the regression has a very low Durbin-Watson statistic (see the appendix).

The residual plot in Figure 6.6 makes clear the strengths and limitations of the model. There is nothing in the model other than a deterministic seasonal pattern that repeats every year; it picks up a lot of the variation in housing starts, but it doesn't pick up all of the variation, as evidenced by the serial correlation that's apparent in the residuals.

Figure 6.6: Housing Starts, Pure Seasonal Model, Residual Plot (residual, actual, and fitted values)

The estimated seasonal factors are just the 12 estimated coefficients on the seasonal dummies, shown in Figure 6.7. The seasonal effects are very low in January and February, then rise quickly and peak in May, after which they decline, at first slowly and then abruptly in November and December.

Figure 6.7: Estimated Seasonal Factors, Housing Starts (by month, 1-12)

Figure 6.9 gives the history of housing starts through 1993, together with the out-of-sample point and 95% interval extrapolation forecasts for the first 11 months of 1994. The forecasts look reasonable, as the model has evidently done a good job of capturing the seasonal pattern. The forecast intervals are quite wide, reflecting the fact that the seasonal effects captured by the forecasting model are responsible for only about a third of the variation in the variable being forecast. The forecasts appear highly accurate, as the realizations and forecasts are quite close.

Figure 6.9: Housing Starts: History (1990.01-1993.12), Forecast and Realization (1994.01-1994.11)

Appendix: The Durbin-Watson statistic tests for correlation over time, called serial correlation, in regression disturbances. The Durbin-Watson test works within the context of the regression model

y_t = \beta_0 + \beta_1 x_{1,t} + \cdots + \beta_k x_{k,t} + \varepsilon_t
\varepsilon_t = \phi \varepsilon_{t-1} + v_t, \quad v_t \sim N(0, \sigma^2)

The regression disturbance is serially correlated when \phi \neq 0.

Hypotheses:
H_0: \phi = 0
H_1: \phi \neq 0

The Durbin-Watson statistic is

DW = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}

where the e_t are the regression residuals. DW takes values in the interval [0, 4], and if all is well, DW should be around 2. If DW is substantially less than 2, there is evidence of positive autocorrelation; if DW is substantially greater than 2, there is evidence of negative autocorrelation.
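For completeness, here is a small sketch (assumed, not from the lecture) of computing the Durbin-Watson statistic directly from a vector of regression residuals; statsmodels provides an equivalent durbin_watson function, used here as a cross-check.

```python
# Sketch: compute DW = sum_{t=2..T}(e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2.
import numpy as np
from statsmodels.stats.stattools import durbin_watson

def dw_statistic(e):
    """Durbin-Watson statistic for a 1-D array of residuals."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Example: positively autocorrelated residuals push DW well below 2.
rng = np.random.default_rng(1)
v = rng.normal(size=200)
e = np.empty_like(v)
e[0] = v[0]
for t in range(1, len(v)):
    e[t] = 0.7 * e[t - 1] + v[t]     # AR(1) disturbance with phi = 0.7 (assumed)

print(dw_statistic(e), durbin_watson(e))   # both values should be well below 2
```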