STATGRAPHICS Rev. 1/10/2005
Data Input
The data input dialog box requests the name of the column containing the time series data and information about how it was sampled:
Data: numeric column containing n equally spaced numeric observations.
Sampling Interval: defines the interval between successive observations. For example, the baseball data were collected once every year, beginning in 1901.
Seasonality: the length of seasonality s, if any. The data is seasonal if there is a pattern that repeats at a fixed period. For example, monthly data typically have a seasonality of s = 12. Hourly data that repeat every day have a seasonality of s = 24. If no entry is made, the data is assumed to be nonseasonal (s = 1).
Trading Days Adjustment: a numeric variable with n observations used to normalize the original observations, such as the number of working days in a month. The observations in the Data column will be divided by these values before being plotted or analyzed. There must be enough entries in this column to cover both the observed data and the number of periods for which forecasts are requested.
Select: subset selection.
Number of Forecasts: number of periods following the end of the data for which forecasts are desired.
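The Trading Days Adjustment described above amounts to an element-wise division. The following is a minimal sketch in Python, not the program's own code; the function name, its arguments, and the sample values are hypothetical and chosen only for illustration.

```python
def adjust_for_trading_days(data, trading_days, n_forecasts=0):
    """Divide each observation by its trading-day count.

    The adjustment column must cover the observed data plus the
    forecast periods, as the dialog box description requires.
    """
    needed = len(data) + n_forecasts
    if len(trading_days) < needed:
        raise ValueError(
            f"need {needed} trading-day values, got {len(trading_days)}"
        )
    # zip stops at the end of the observed data; the extra entries
    # are reserved for the forecast periods.
    return [y / d for y, d in zip(data, trading_days)]


# Hypothetical monthly totals and working days per month.
monthly_totals = [2100.0, 1900.0, 2300.0]
working_days = [21, 19, 23, 20]  # last entry covers one forecast period
per_day = adjust_for_trading_days(monthly_totals, working_days, n_forecasts=1)
```

In this made-up example every month works out to the same value per working day, which is the kind of normalization the adjustment is meant to achieve.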
Output
The output from the SnapStat consists of a single page of graphs and numerical statistics.
SnapStat: Automatic Forecasting
Data variable: Leading average
RMSE = 17.7   MAE = 13.96   MAPE = 3.81%   ME = -1.077   MPE = -0.48%

Period   Forecast   Lower 95% Limit   Upper 95% Limit
2005     366.743    330.772           402.715
2006     365.715    327.642           403.788
2007     365.591    326.667           404.515
2008     365.58     325.925           405.235
2009     365.579    325.216           405.943
2010     365.579    324.519           406.64
[Figure: plot of Leading average; Residual Autocorrelations (autocorrelations versus lag); Residual Plot (percentage versus residual); Residual Periodogram (ordinate versus frequency)]
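The summary statistics in the output (RMSE, MAE, MAPE, ME, MPE) can be illustrated with a short Python sketch. It assumes the usual meanings of the abbreviations (root mean squared error, mean absolute error, mean absolute percentage error, mean error, mean percentage error) and divides by the number of errors; the exact divisor and conventions used by the program are not stated here, so treat this as an approximation. The function name and sample values are hypothetical.

```python
import math


def error_statistics(actual, fitted):
    """Summarize one-step errors (actual minus fitted) with common measures."""
    errors = [a - f for a, f in zip(actual, fitted)]
    # Percentage errors are relative to the actual values, which must be nonzero.
    pct_errors = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(p) for p in pct_errors) / n,
        "ME": sum(errors) / n,
        "MPE": sum(pct_errors) / n,
    }


# Hypothetical observations and fitted values.
stats = error_statistics([100.0, 200.0], [110.0, 190.0])
```

ME and MPE keep the sign of the errors, so values near zero (as in the output above) suggest that the forecasts are not systematically high or low.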
Analysis Options
Display: if desired, the plot may be limited to the specified number of most recent observations.
Transformation: the transformation to be applied to the data, if any. If Box-Cox is selected, the program will automatically determine an appropriate power transformation to normalize the data, after adding the specified Addend to each data value.
Note: the Box-Cox option can be very time-consuming if many models are being compared, since the program will fit every model at each iteration of the Box-Cox optimization algorithm.
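To make the Box-Cox option more concrete, here is a rough Python sketch of the textbook Box-Cox power transformation with an addend, together with a grid search over the power using the standard profile log-likelihood for independent normal data. This is an assumption-laden illustration: the program optimizes the power while refitting each forecasting model, which this sketch does not attempt, and the function names and the grid of candidate powers are hypothetical.

```python
import math


def box_cox(values, power, addend=0.0):
    """Apply the textbook Box-Cox transform after adding the addend."""
    shifted = [v + addend for v in values]
    if any(s <= 0 for s in shifted):
        raise ValueError("values plus addend must be positive")
    if power == 0:
        return [math.log(s) for s in shifted]
    return [(s ** power - 1.0) / power for s in shifted]


def profile_log_likelihood(values, power, addend=0.0):
    """Normal profile log-likelihood of the transformed data (up to a constant)."""
    z = box_cox(values, power, addend)
    n = len(z)
    mean = sum(z) / n
    variance = sum((v - mean) ** 2 for v in z) / n
    log_sum = sum(math.log(v + addend) for v in values)
    return -0.5 * n * math.log(variance) + (power - 1.0) * log_sum


def best_power(values, addend=0.0, grid=None):
    """Pick the grid power with the largest profile log-likelihood."""
    if grid is None:
        grid = [k / 10.0 for k in range(-20, 21)]  # powers -2.0 .. 2.0
    return max(grid, key=lambda p: profile_log_likelihood(values, p, addend))


# Hypothetical positive data.
chosen = best_power([1.0, 2.0, 4.0, 8.0, 16.0])
```

Because every candidate power requires a full pass over the data (and, in the program, a refit of every model), the cost grows with both the number of powers tried and the number of models compared, which is consistent with the note above.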
SnapStat Defaults
The defaults used by the Automatic Forecasting SnapStat are set on the Forecasting tab of the Preferences dialog box under the Edit menu:
Models Included: specify the models that should be fit to the data. These are the models from which the best model will be selected. Descriptions of each of the models are given in the Forecasting documentation. For several of the models, additional options are provided:
Random walk model: check Include constant to consider a model containing a constant as well as one without a constant.
Moving average model: select the maximum span to consider. Models will be fit with spans from 2 through the number indicated.
ARIMA AR Terms: specify the maximum order p of the autoregressive terms in the model.
ARIMA MA Terms: specify the maximum order q of the moving average terms in the model. You may elect instead to consider only models for which q = p - 1.
ARIMA Differencing: specify the maximum order of differencing d. Select Include constant to consider models that include a constant term when differencing is performed.
Information Criterion: the criterion used to select the best model.
Forecast Limits: percentage used for the forecast probability limits.
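The ARIMA and moving average settings above define a set of candidate models. The sketch below shows one way such a set could be enumerated in Python. It is only an illustration of the idea: the function names are hypothetical, and whether the program includes orders of zero or handles the constant term this way is an assumption.

```python
from itertools import product


def candidate_arima_orders(max_p, max_q, max_d, q_equals_p_minus_1=False):
    """List (p, d, q) triples up to the given maximum orders."""
    orders = []
    for p, d in product(range(max_p + 1), range(max_d + 1)):
        if q_equals_p_minus_1:
            # Restricted search: only q = p - 1, which requires p >= 1.
            qs = [p - 1] if p >= 1 else []
        else:
            qs = range(max_q + 1)
        for q in qs:
            orders.append((p, d, q))
    return orders


def moving_average_spans(max_span):
    """Spans 2 through the maximum span, as described for the moving average model."""
    return list(range(2, max_span + 1))


full_grid = candidate_arima_orders(max_p=2, max_q=1, max_d=1)
restricted = candidate_arima_orders(2, 1, 1, q_equals_p_minus_1=True)
spans = moving_average_spans(5)
```

The restricted search is much smaller than the full grid, which matters because each candidate must be fit before the information criterion can be compared.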
The procedure fits each of the models indicated and selects the model that gives the smallest value of the selected criterion. There are three criteria to choose from:
Akaike Information Criterion
The Akaike Information Criterion (AIC) is calculated from
$$\mathrm{AIC} = 2\ln(\mathrm{RMSE}) + \frac{2c}{n} \tag{1}$$
where RMSE is the root mean squared error during the estimation period, c is the number of estimated coefficients in the fitted model, and n is the sample size used to fit the model. Notice that the AIC is a function of the variance of the model residuals, penalized by the number of estimated parameters. In general, the model will be selected that minimizes the mean squared error without using too many coefficients (relative to the amount of data available).
Hannan-Quinn Criterion
The Hannan-Quinn Criterion (HQC) is calculated from
$$\mathrm{HQC} = 2\ln(\mathrm{RMSE}) + \frac{2p\,\ln(\ln(n))}{n} \tag{2}$$
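Here p denotes the number of estimated coefficients, written c in equation (1). This criterion uses a different penalty for the number of estimated parameters: 2 ln(ln(n))/n per coefficient rather than 2/n.
Schwarz-Bayesian Information Criterion
The Schwarz-Bayesian Information Criterion (SBIC) is calculated from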
$$\mathrm{SBIC} = 2\ln(\mathrm{RMSE}) + \frac{p\,\ln(n)}{n} \tag{3}$$
Again, the penalty for the number of estimated parameters differs from that of the other criteria: each coefficient adds ln(n)/n, which exceeds the AIC penalty of 2/n whenever n is 8 or more.
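The three criteria can be computed side by side from equations (1) through (3). The Python sketch below does so and picks the candidate with the smallest value. The function names, the tuple layout, and the two candidate models are hypothetical; a single argument stands for the number of estimated coefficients (c in equation (1), p in equations (2) and (3)).

```python
import math


def information_criteria(rmse, n_coefficients, n_obs):
    """Evaluate AIC, HQC and SBIC as defined in equations (1) to (3)."""
    base = 2.0 * math.log(rmse)
    return {
        "AIC": base + 2.0 * n_coefficients / n_obs,
        "HQC": base + 2.0 * n_coefficients * math.log(math.log(n_obs)) / n_obs,
        "SBIC": base + n_coefficients * math.log(n_obs) / n_obs,
    }


def select_best(models, criterion="AIC"):
    """Return the name of the model with the smallest criterion value.

    models is a list of (name, rmse, n_coefficients, n_obs) tuples.
    """
    best = min(
        models,
        key=lambda m: information_criteria(m[1], m[2], m[3])[criterion],
    )
    return best[0]


# Two hypothetical fits to 100 observations: the larger model has a
# slightly smaller RMSE but uses three coefficients instead of one.
candidates = [("small", 10.0, 1, 100), ("large", 9.7, 3, 100)]
by_aic = select_best(candidates, "AIC")
by_sbic = select_best(candidates, "SBIC")
```

With these made-up numbers the AIC prefers the larger model while the SBIC, with its heavier penalty per coefficient, prefers the smaller one, which illustrates why the choice of criterion can change the selected model.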