Description
This presentation explains time series forecasting and autocorrelation.
Business Forecasting
for Actionable Information
Forecasting
• Why forecast? The future is uncertain.
– Reduce uncertainty to allow better decisions by management
– The FUTURE as an EXTENSION of the PAST
• Primary rules of forecasting: a forecast should be
– Technically correct and accurate enough for the purpose
– Justifiable on a cost-benefit basis
– Sellable to management
– Usable with ease by management
• Forecasting is the art or science of predicting the future.
• It is used in the decision-making process to help analysts reach conclusions about buying, selling, producing, hiring and many other options.
Forecasting
Time Series
• A time series is made up of the values of a variable recorded at regular time intervals.
• The time interval can be years, quarters, months, weeks, days, or any other length of time that is important.
• For example, the marketing research department for a company might record the company's sales of a product on a daily basis. These daily time series values could then be combined for two-week periods to create a biweekly time series, and so on.
Components of Time Series
• Most time series techniques consider the time series to be made up of four components.
– Trend: a long-term upward or downward change in the time series.
– Seasonal: periodic increases or decreases that occur within a year.
– Cyclical: increases and decreases that occur over more than a one-year period.
– Irregular: changes in the time series not attributable to the other three components.
• A time series that does not include a trend, seasonal, or cyclical component is called a stationary time series.
Time Series Plots
These time series suggest the following model: Yt = Tt + St + Ct + It
Steps in Time Series Forecasting
• Step 1: Identify Time Series Form
• Step 2: Select Potential Methods
• Step 3: Evaluate Potential Methods
• Step 4: Make Required Forecasts
Step 1 : Identify Time Series Form
• Irregular Component: All business time series data is considered to have an irregular component. This is the unpredictable component caused by random influences on the time series values. Since it cannot be predicted, forecasting methods attempt to eliminate its effect by averaging or smoothing.
• Cyclical Component: To identify the presence of this component, it is necessary to have a minimum of two repetitions of the cycle, although three or more would be preferred. (Note that cycles may not recur at a fixed interval, e.g. inflation or a bull period in the stock market, and hence other variables may be needed to predict them.)
Identify Time Series Form
• Trend Component: To determine the presence or absence of a trend component, one should always plot the time series data. A more formal method of investigating the presence or absence of a trend is to use regression analysis to fit a trend line to the data.
Regression Analysis to Fit Trend Line
Regression Analysis to Fit Quadratic Trend
The plot suggests that a nonlinear trend may be present. A quadratic form of the regression can then be fitted.
Identify Time Series Form
• Seasonal Component: Detecting the presence of a seasonal component requires at least two years' worth of data. Also, the data must be recorded in time periods of less than a year, such as quarters, months, weeks or days.
• For each year, the plot is made separately.
Autocorrelation
• The second means of investigating the presence or absence of seasonal variation is an aspect of regression and correlation analysis called autocorrelation.
• Data values for a variable over time are often correlated with prior values of that variable. This condition is called autocorrelation or serial correlation.
• The correlation between time series values that are k periods apart is called the autocorrelation of lag k.
• To detect a seasonal component, we can check the autocorrelation for:
– lags of 7 for daily data,
– of 52 for weekly data,
– of 12 for monthly data, and
– of 4 for quarterly data.
• In addition, a significant autocorrelation at lag 1 is an indication of the presence of a trend component.
• The autocorrelation of lag k is computed with the standard formula
rk = Σ (Yt − Ȳ)(Yt+k − Ȳ) / Σ (Yt − Ȳ)²
where the numerator sums over t = 1, …, n − k and the denominator over t = 1, …, n.
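As a concrete illustration, the lag-k autocorrelation can be computed directly from its definition. The quarterly series below is invented for illustration (it is not the textbook's data); it has a strong period-4 pattern, so r4 comes out large while r1 stays near zero.

```python
def autocorr(y, k):
    """Autocorrelation of lag k:
    r_k = sum((y_t - ybar)(y_{t+k} - ybar)) / sum((y_t - ybar)^2)."""
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k))
    den = sum((y[t] - ybar) ** 2 for t in range(n))
    return num / den

# A strongly seasonal quarterly series (period 4, illustrative values)
sales = [10, 31, 43, 16, 11, 33, 45, 17, 12, 34, 46, 18]
r4 = autocorr(sales, 4)   # large positive: seasonal component at lag 4
r1 = autocorr(sales, 1)   # near zero: little evidence of trend
```

Comparing |r_k| against 2/√n gives the usual rough significance check used in the example that follows.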
Autocorrelation Example
Note that r4 is statistically significant since |r4| = 0.6988 > 2/√20 = 0.447. This suggests that a seasonal component is present in the series. Also, the low and non-significant value of r1 indicates that a trend component is not present.
[ACF Plot for Value: ACF vs. lags 0–60, with upper and lower confidence intervals (UCI, LCI)]
The plots indicate:
• Trend is present in the TS
• Seasonality is NOT present
• Data is NOT random
[Time plot of Value, monthly, 1988-1 to 2001-1]
ACF and PACF Plots (Correlogram)
[PACF Plot for Value: PACF vs. lags 1–57, with confidence intervals (UCI, LCI)]
Pattern of No Trend but Seasonality
[ACF Plot for Sales: ACF vs. lags 0–45, with confidence intervals (UCI, LCI)]
[PACF Plot for Sales: PACF vs. lags 1–43, with confidence intervals (UCI, LCI)]
[Time plot of Sales, Jan-95 to Oct-98, with fitted linear trend line]
The plots indicate:
• Trend is NOT present in the TS
• Seasonality is present
• Data is NOT random
[Seasonal plot of Sales by month (Jan–Dec) for the years 1995–1998]
Step 2 : Select Potential Methods
• In this step, the knowledge about the form of the time series is used to select the forecasting methods that are potentially useful for forecasting the time series. • The schematic diagram summarizes the possible time series forms and the forecasting methods.
Step 3 : Evaluate Potential Methods
• How do we assess the performance of each of these methods in forecasting the time series?
• Each method is used to forecast the historical data for the time series, and an evaluation is made of how closely these forecasts fit the actual historical data.
• Error: The error of an individual forecast is the difference between the actual value and the forecast of that value, i.e.
et = Yt − Ft
Measures of Error
• Mean Error: ME
• Mean Absolute Deviation: MAD
• Mean Square Error: MSE
• Root Mean Square Error: RMSE
• Percentage Error: PE
• Mean Percentage Error: MPE
• Mean Absolute Percentage Error: MAPE
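The measures above can be sketched in a few lines of Python; the actual and forecast values below are purely illustrative.

```python
def error_measures(actual, forecast):
    """Standard forecast-error measures over paired actual/forecast series."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    me = sum(errors) / n                                 # Mean Error
    mad = sum(abs(e) for e in errors) / n                # Mean Absolute Deviation
    mse = sum(e * e for e in errors) / n                 # Mean Square Error
    rmse = mse ** 0.5                                    # Root Mean Square Error
    mpe = 100 * sum(e / a for e, a in zip(errors, actual)) / n    # Mean % Error
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n  # Mean Abs % Error
    return {"ME": me, "MAD": mad, "MSE": mse, "RMSE": rmse,
            "MPE": mpe, "MAPE": mape}

# Illustrative values
m = error_measures([100, 110, 120, 130], [98, 112, 118, 133])
```

Note that ME can be near zero even for a poor model (positive and negative errors cancel), which is why MAD, MSE and MAPE are usually preferred for model comparison.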
Step 4 : Make Required Forecasts
• Let Yi denote the value of the time series at time i = 1, 2, …, t.
• Let Ft+1 denote the forecast for the next time period; the forecast for two time periods into the future is denoted Ft+2, for three time periods Ft+3, and so on.
• Note that for a stationary forecasting method
Ft+1 = Ft+2 = …
• However, for a non-stationary time series, Ft+1 ≠ Ft+2 ≠ …
• Average Baseball Salary
Handling Seasonal Variation
Example: Number of Guests per quarter
• Step 0: Plot the series
• Step 1: Calculate the moving total over all seasons (e.g. for quarterly seasons, the moving total over a year)
• Step 2: Calculate the moving average
• Step 3: Center the moving average
• Step 3A: Plot the centered moving average
• Step 4: Calculate the percentage of the actual value to the centered moving average
Handling Seasonal Variation (Cont)
• Step 5: Calculate the modified mean for each season after discarding the maximum and minimum
• Step 6: Adjust the modified means (so the seasonal indices average to 100)
• Step 7: Use the seasonal indices (de-seasonalization / seasonalization)
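Steps 1–6 (the ratio-to-moving-average method) can be sketched as below. The quarterly "guests" series is synthetic: a linear trend multiplied by known seasonal factors, so the recovered indices can be checked against the factors used to build it.

```python
def seasonal_indices(y, s=4):
    """Ratio-to-moving-average seasonal indices (Steps 1-6).

    Assumes enough full seasons that each season keeps at least one
    ratio after the max and min are discarded.
    """
    n = len(y)
    # Steps 1-2: s-period moving totals -> moving averages
    ma = [sum(y[t:t + s]) / s for t in range(n - s + 1)]
    # Step 3: center the moving average (average adjacent MAs for even s)
    cma = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]
    # Step 4: percent of actual to centered moving average, grouped by season
    ratios = {}
    for i, c in enumerate(cma):
        t = i + s // 2          # cma[i] is centered on period i + s//2
        ratios.setdefault(t % s, []).append(100 * y[t] / c)
    # Step 5: modified mean per season (discard max and min)
    idx = {}
    for season, r in ratios.items():
        trimmed = sorted(r)[1:-1]
        idx[season] = sum(trimmed) / len(trimmed)
    # Step 6: adjust so the indices average to 100 (sum to 100 * s)
    scale = 100 * s / sum(idx.values())
    return [idx[k] * scale for k in sorted(idx)]

# Synthetic quarterly series: linear trend times seasonal factors
factors = [0.7, 1.1, 1.4, 0.8]
guests = [(100 + 2 * t) * factors[t % 4] for t in range(20)]
idx = seasonal_indices(guests, s=4)   # approximately [70, 110, 140, 80]
```

Step 7 then divides actual values by (index / 100) to de-seasonalize, and multiplies trend forecasts by (index / 100) to re-seasonalize.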
Handling Trend
Example: Number of Ships Loaded, year-wise
• Step 1: Plot and draw a trend line
• Step 2: Code the time variable
• Step 3: Fit Y = a + b*t (linear trend)
• Step 4: Use the fitted line to predict
• For a quadratic trend, fit Y = a + b*t + c*t**2 instead
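Steps 2–4 amount to an ordinary least-squares fit of Y = a + b*t with coded time. A minimal sketch (the ships-loaded figures are illustrative, not taken from the text):

```python
def fit_linear_trend(y):
    """Least-squares fit of Y = a + b*t with coded time t = 0, 1, 2, ..."""
    n = len(y)
    t = list(range(n))
    tbar, ybar = sum(t) / n, sum(y) / n
    b = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y)) / \
        sum((ti - tbar) ** 2 for ti in t)
    a = ybar - b * tbar
    return a, b

# Ships loaded per year (illustrative); predict the next year from the trend
ships = [98, 105, 116, 119, 135, 156, 177, 208]
a, b = fit_linear_trend(ships)
forecast_next = a + b * len(ships)
```

A quadratic trend would add a t**2 column and solve the corresponding normal equations (or use a library routine such as numpy.polyfit).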
Cyclical Variation
• Compute the cyclical relative: (Y / Yestimated) × 100, where Yestimated is the trend estimate
• The above describes past cycles but is not useful for prediction
Handling All Components
SALES
• Plot the series
• De-seasonalize the time series
• Develop a trend line
• Smooth the irregular component
• Forecast using the trend
• Re-seasonalize the forecast
• Do exercise 15-37 in Levin
Forecasting Methods
1. Stationary Forecasting Methods
– Naive Forecasting Method
– Moving Average Forecasting Method
– Weighted Moving Average Forecasting Method
– Exponential Smoothing Forecasting Method
2. Trend Forecasting Methods
– Linear Trend Projection Forecasting Method
– Non-Linear Trend Projection Forecasting Method
– Holt's Exponential Smoothing Forecasting Method
– Trend Autoregressive Forecasting Method
3. Seasonal Forecasting Methods
– Seasonal Multiple Regression Forecasting Method
– Seasonal Autoregressive Forecasting Method
– Winters' Exponential Smoothing Forecasting Method
4. The Box-Jenkins (ARIMA) Methodology
Forecasting Notations
• Yt: value of the time series at time t
• Ŷt: forecast value of Yt
• et = Yt − Ŷt: residual or forecast error
• Alpha (α): smoothing constant for the data level (value between 0 and 1)
• Beta (β): smoothing constant for the trend (value between 0 and 1)
• Gamma (γ): smoothing constant for seasonality (value between 0 and 1)
Which Technique to Use, When?
Technique              Pattern of Data   Time Horizon
Naive                  S, T, ST          S
Smoothing              ST                S
Linear Exponential     T                 S
Seasonal Exponential   S                 S
Exponential Trend      T                 I, L
Box-Jenkins            S, T, ST, C       S
ARIMA                  S, T, ST          S
Econometric            C                 S

Pattern: S=Seasonal; T=Trend; C=Cyclical. Time: S=Short; I=Intermediate; L=Long
How Much Data Required?
Technique              Non-Seasonal   Seasonal
Naive                  1              —
Smoothing              4-30           —
Linear Exponential     3-30           —
Seasonal Exponential   —              2×s
Exponential Trend      10             —
Box-Jenkins (ARIMA)    24             3×s
Econometric            30             —

(s = season length)
Which is the 'Best Fit' Model?
Run several relevant trials and evaluate the results:
'Best' Result:
• Least error
– Mean Error: ME
– Mean Absolute Deviation: MAD
– Mean Square Error: MSE
– Percentage Error: PE
– Mean Percentage Error: MPE
– Mean Absolute Percentage Error: MAPE
Validity:
• Pattern of residuals
– Are the residuals normally distributed?
– Do the parameter estimates have significant t-ratios?
– Are the autocorrelations of the residuals within the confidence intervals?
Time Series Forecasting
for Actionable Information
Naïve Model
• Ŷt+1 = Yt
• The forecast for the next time period is the same as the current actual value
Smoothing – Simple Average
• Takes the mean of all relevant historical observations as the forecast for the next time period.
Month    t     Value
2000-1   145   111791
2000-2   146   113179
2000-3   147   112529
2000-4   148   111202
2000-5   149   110805
2000-6   150   110718
2000-7   151   111700
2000-8   152   111268
2000-9   153   112186
2000-10  154   111647
2000-11  155   110315
2000-12  156   110349
2001-1   157   111209

Forecast for t = 158: 75447 (mean of the full history) or 111454 (mean of the thirteen values shown)
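The simple-average forecast for t = 158 from the thirteen values in the table is just their mean:

```python
# Simple-average forecast: the mean of the relevant history
# (here, the thirteen monthly values for t = 145..157)
values = [111791, 113179, 112529, 111202, 110805, 110718, 111700,
          111268, 112186, 111647, 110315, 110349, 111209]
forecast_158 = sum(values) / len(values)   # forecast for t = 158
```

This reproduces the 111454 figure on the slide (to the nearest whole unit).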
Smoothing – Exponential
Exponentially Weighted Moving Average: continually revises estimates in the light of more recent experience.
Ŷt+1 = Ŷt + α (Yt − Ŷt)

[Time plot of actual vs. forecast values for the training data]
Error Measures (Training)
MAPE: 1.24   MAD: 894   MSE: 1301998
Forecast
Time   Forecast   LCI      UCI
158    111209     108980   113438
159    111209     108980   113438
160    111209     108980   113438
161    111209     108980   113438
162    111209     108980   113438
This is a very popular scheme to produce a smoothed Time Series. Exponential Smoothing assigns exponentially decreasing weights as the observations get older. In other words, recent observations are given relatively more weight in forecasting than the older observations.
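The update rule Ŷt+1 = Ŷt + α(Yt − Ŷt) translates into a few lines; initializing the first forecast with the first observation, as below, is one common convention rather than the only one.

```python
def exp_smooth(y, alpha, f0=None):
    """Simple exponential smoothing: F_{t+1} = F_t + alpha * (Y_t - F_t).

    Returns the one-step-ahead forecasts, one per observation plus one
    final forecast beyond the data.
    """
    f = y[0] if f0 is None else f0   # common choice: start at the first value
    forecasts = []
    for yt in y:
        forecasts.append(f)          # forecast made before seeing yt
        f = f + alpha * (yt - f)     # revise toward the new observation
    forecasts.append(f)              # forecast for the next, unseen period
    return forecasts

fc = exp_smooth([10, 12, 11, 13], alpha=0.2)
```

A small α smooths heavily (weights spread far back); α near 1 tracks the latest observation, approaching the naive forecast.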
Exponential Smoothing – Adjusted for Trend – Holt's Method
• Ŷt+1 = Lt + Tt
• Lt = α Yt + (1 − α)(Lt−1 + Tt−1)
• Tt = β (Lt − Lt−1) + (1 − β) Tt−1
• α = smoothing constant for the data level (between 0 and 1)
• β = smoothing constant for the trend estimate (between 0 and 1)
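Holt's equations translate directly into code. The initialization used here (level = first value, trend = first difference) is one common choice; software packages may initialize differently.

```python
def holt(y, alpha, beta, horizon=1):
    """Holt's trend-adjusted exponential smoothing (equations above)."""
    level, trend = y[0], y[1] - y[0]   # a common initialization choice
    for yt in y[1:]:
        last_level = level
        level = alpha * yt + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
    # forecast m steps ahead: level + m * trend
    return [level + m * trend for m in range(1, horizon + 1)]

# On a perfectly linear series the method tracks the line exactly
f = holt([10, 12, 14, 16, 18], alpha=0.5, beta=0.3, horizon=3)
```

Unlike simple exponential smoothing, whose forecasts are flat, Holt's forecasts continue the estimated trend into the future.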
Smoothing – Double Exponential
[Time plot of actual vs. forecast values for the training data]
Error Measures (Training)
MAPE: 1.31   MAD: 905   MSE: 1617587
Double exponential smoothing is defined as Exponential smoothing of Exponential smoothing.
Forecast
Time   Forecast   LCI      UCI
158    111652     109159   114145
159    112105     109612   114598
160    112557     110065   115050
161    113010     110517   115503
162    113463     110970   115956
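One standard formulation of "smoothing of the smoothing" is Brown's double exponential smoothing, sketched below with a simple initialization (both smoothed values start at the first observation); the level/trend recovery formulas are Brown's.

```python
def brown_double_smooth(y, alpha, horizon=1):
    """Brown's double exponential smoothing: smooth the smoothed series,
    then recover level and trend from the two smoothed values."""
    s1 = s2 = y[0]                           # simple initialization
    for yt in y:
        s1 = alpha * yt + (1 - alpha) * s1   # first smoothing
        s2 = alpha * s1 + (1 - alpha) * s2   # smoothing of the smoothing
    a = 2 * s1 - s2                          # estimated level
    b = alpha / (1 - alpha) * (s1 - s2)      # estimated trend
    return [a + b * m for m in range(1, horizon + 1)]

f = brown_double_smooth([10, 12, 14, 16, 18], alpha=0.5, horizon=2)
```

Because the second smoothing lags the first on a trending series, the gap s1 − s2 carries the trend information that a single smoothing loses.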
Exponential Smoothing Adjusted for Trend and Season – Holt-Winters
Parameters/Options
Alpha (Level): 0.2
Beta (Trend): 0.15
Gamma (Seasonality): 0.05
Season length: 12
Number of seasons: 13
Forecast: Yes
#Forecasts: 5
Update Estimate Each Time: Yes
Error Measures (Training)
MAPE: 1.53   MAD: 1121   MSE: 2029959
Forecast
Time   Forecast   LCI      UCI
158    113106     110313   115898
159    113014     110221   115806
160    113177     110384   115969
161    113006     110213   115798
162    113523     110730   116315
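The multiplicative Holt-Winters (Winters') recursions can be sketched as below. The initialization here is deliberately naive (level = mean of the first season, zero trend, seasonal factors = first-season values over that mean); production packages initialize and optimize these more carefully.

```python
def holt_winters(y, alpha, beta, gamma, s, horizon):
    """Multiplicative Holt-Winters smoothing: a minimal sketch.

    s: season length; requires at least two full seasons of data.
    """
    level = sum(y[:s]) / s                  # naive initial level
    trend = 0.0                             # naive initial trend
    season = [yi / level for yi in y[:s]]   # naive seasonal factors
    for t in range(s, len(y)):
        last_level = level
        level = alpha * y[t] / season[t % s] + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        season[t % s] = gamma * y[t] / level + (1 - gamma) * season[t % s]
    n = len(y)
    # forecast m steps ahead: (level + m * trend) times the seasonal factor
    return [(level + m * trend) * season[(n + m - 1) % s]
            for m in range(1, horizon + 1)]

# A flat, purely seasonal series: forecasts simply repeat the pattern
f = holt_winters([10, 20, 30, 40] * 4, 0.2, 0.1, 0.1, s=4, horizon=4)
```

The three smoothing constants play the roles listed in the parameter table above: α updates the level, β the trend, γ the seasonal factors.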
ARIMA: Box-Jenkins
Auto-Regressive Integrated Moving Average: Identifying, Fitting and Checking
• Fit the model and review the various results.
• We can judge the quality of the model by looking at the time plot of actual values vs. forecasts.
• We can calculate MAPE, MAD, and MSE to compare the results with those of other models.
• If the two curves are close enough, then the model is good.
• The model should explain the trend and seasonality, if any.
• If the residuals are random, then the model is good. If they show some pattern, then we need to refine the model.
ARIMA Outputs
[ACF Plot: ACF vs. lags, with confidence intervals (UCI, LCI)]
[PACF Plot: PACF vs. lags, with confidence intervals (UCI, LCI)]
ARIMA Model

Term         Coeff          StErr          p-value
Const. term  76.0158844     31.47613907    0.01573383
AR1          1.04136586     0              0
MA1          -0.04237335    0.00478536     0
SAR1         0.14788581     0.02171841     0
SMA1         -0.11075392    0.02918595     0.00014778
Time Plot of Actual Vs Forecast (Training Data)
[Time plot of actual vs. forecast values for the training data]

Error Measures (Training)
MAPE: 0.01   MAD: 871.67   MSE: 1243126

Forecast
Time   Forecast       95% CI Lower   95% CI Upper
158    111089.4688    108904.1953    113274.7422
159    110887.8594    107957.3906    113818.3281
160    110630.2891    107063.5234    114197.0547
161    110462.6641    106314.7656    114610.5625
162    110360.5156    105662.9688    115058.0625
ARIMA: Box-Jenkins
• See pages 348 and 349 of “Business Forecasting”.
• The partial autocorrelation at lag k is the correlation between Yt and Yt−k after removing the effect of the intervening values Yt−1, Yt−2, …, Yt−k+1.
• Autoregressive model of pth order: see equation 9.1 on page 351.
• Moving average model of qth order: see page 352.
Model        Autocorrelation    Partial Autocorrelation
MA(q)        Cuts off after q   Dies out
AR(p)        Dies out           Cuts off after p
ARMA(p,q)    Dies out           Dies out
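The identification patterns in this table can be illustrated with the theoretical autocorrelations of AR(1) and MA(1) processes; these are standard textbook results, not tied to the slides' data.

```python
def ar1_acf(phi, k):
    """Theoretical ACF of an AR(1) process: rho_k = phi**k.
    It dies out geometrically rather than cutting off."""
    return phi ** k

def ma1_acf(theta, k):
    """Theoretical ACF of an MA(1) process: rho_1 = theta / (1 + theta**2),
    and exactly zero beyond lag 1 (the cutoff)."""
    if k == 0:
        return 1.0
    return theta / (1 + theta ** 2) if k == 1 else 0.0

# AR(1) autocorrelations die out; MA(1) autocorrelations cut off after lag 1
ar = [ar1_acf(0.8, k) for k in range(5)]   # 1, 0.8, 0.64, 0.512, ...
ma = [ma1_acf(0.5, k) for k in range(5)]   # 1, 0.4, 0, 0, 0
```

The mirror-image behavior holds for the PACF: an AR(p) process has a PACF that cuts off after lag p, while an MA(q) process has a PACF that dies out.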