Chapter 12
The FORECAST Procedure

Chapter Table of Contents

OVERVIEW
GETTING STARTED
   Introduction to Forecasting Methods
SYNTAX
   Functional Summary
   PROC FORECAST Statement
   BY Statement
   ID Statement
   VAR Statement
DETAILS
   Missing Values
   Data Periodicity and Time Intervals
   Forecasting Methods
   Specifying Seasonality
   Data Requirements
   OUT= Data Set
   OUTEST= Data Set
EXAMPLES
   Example 12.1 Forecasting Auto Sales
   Example 12.2 Forecasting Retail Sales
   Example 12.3 Forecasting Petroleum Sales
REFERENCES

Part 2. General Information
SAS OnlineDoc: Version 8

Overview

The FORECAST procedure provides a quick and automatic way to generate forecasts for many time series in one step. The procedure can forecast hundreds of series at a time, with the series organized into separate variables or across BY groups. PROC FORECAST uses extrapolative forecasting methods where the forecasts for a series are functions only of time and past values of the series, not of other variables.

You can use the following forecasting methods. For each of these methods, you can specify linear, quadratic, or no trend.

- The stepwise autoregressive method is used by default. This method combines time trend regression with an autoregressive model and uses a stepwise method to select the lags to use for the autoregressive process.

- The exponential smoothing method produces a time trend forecast, but in fitting the trend, the parameters are allowed to change gradually over time, and earlier observations are given exponentially declining weights. Single, double, and triple exponential smoothing are supported, depending on whether no trend, linear trend, or quadratic trend is specified. Holt two-parameter linear exponential smoothing is supported as a special case of the Holt-Winters method without seasons.

- The Winters method (also called Holt-Winters) combines a time trend with multiplicative seasonal factors to account for regular seasonal fluctuations in a series. Like the exponential smoothing method, the Winters method allows the parameters to change gradually over time, with earlier observations given exponentially declining weights. You can also specify the additive version of the Winters method, which uses additive instead of multiplicative seasonal factors. When seasonal factors are omitted, the Winters method reduces to the Holt two-parameter version of double exponential smoothing.

The FORECAST procedure writes the forecasts and confidence limits to an output data set, and can write parameter estimates and fit statistics to an output data set. The FORECAST procedure does not produce printed output.

PROC FORECAST is an extrapolation procedure useful for producing practical results efficiently. However, in the interest of speed, PROC FORECAST uses some shortcuts that cause some statistical results (such as confidence limits) to be only approximate. For many time series, the FORECAST procedure, with appropriately chosen methods and weights, can yield satisfactory results. Other SAS/ETS procedures can produce better forecasts but at greater computational expense.

You can perform the stepwise autoregressive forecasting method with the AUTOREG procedure. You can perform exponential smoothing with statistically optimal weights as an ARIMA model using the ARIMA procedure. Seasonal ARIMA models can be used for forecasting seasonal series for which the Winters and additive Winters methods might be used.

Additionally, the Time Series Forecasting System can be used to develop forecasting models, estimate the model parameters, evaluate the models' ability to forecast, and display the results graphically. See Chapter 23, "Getting Started with Time Series Forecasting," for more details.

Getting Started

To use PROC FORECAST, specify the input and output data sets and the number of periods to forecast in the PROC FORECAST statement, then list the variables to forecast in a VAR statement.

For example, suppose you have monthly data on the sales of some product in a data set named PAST, as shown in Figure 12.1, and you want to forecast sales for the next 10 months.

Figure 12.1. Example Data Set PAST

The following statements forecast 10 observations for the variable SALES using the default STEPAR method and write the results to the output data set PRED:

   proc forecast data=past lead=10 out=pred;
      var sales;
   run;

The following statements use the PRINT procedure to print the data set PRED:

   proc print data=pred;
   run;

The PROC PRINT listing of the forecast data set PRED is shown in Figure 12.2.

Figure 12.2. Forecast Data Set PRED

Giving Dates to Forecast Values

Normally, your input data set has an ID variable that gives dates to the observations, and you want the forecast observations to have dates also. Usually, the ID variable has SAS date values. (See Chapter 2, "Working with Time Series Data," for information on using SAS date values.) The ID statement specifies the identifying variable.

If the ID variable contains SAS date values, the INTERVAL= option should be used on the PROC FORECAST statement to specify the time interval between observations. (See Chapter 3, "Date Intervals, Formats, and Functions," for more information on time intervals.) The FORECAST procedure uses the INTERVAL= option to generate correct dates for forecast observations.

The data set PAST, shown in Figure 12.1, has monthly observations and contains an ID variable DATE with SAS date values identifying each observation. The following statements produce the same forecast as the preceding example and also include the ID variable DATE in the output data set. Monthly SAS date values are extrapolated for the forecast observations.

   proc forecast data=past interval=month lead=10 out=pred;
      var sales;
      id date;
   run;

Computing Confidence Limits

Depending on the output options specified, multiple observations are written to the OUT= data set for each time period. The different parts of the results are contained in the VAR statement variables in observations identified by the character variable _TYPE_ and by the ID variable.

For example, the following statements use the OUTLIMIT option to write forecasts and 95% confidence limits for the variable SALES to the output data set PRED. This data set is printed with the PRINT procedure.

   proc forecast data=past interval=month lead=10
                 out=pred outlimit;
      var sales;
      id date;
   run;

   proc print data=pred;
   run;

The output data set PRED is shown in Figure 12.3.

Figure 12.3. Output Data Set

Form of the OUT= Data Set

The OUT= data set PRED, shown in Figure 12.3, contains three observations for each of the 10 forecast periods. Each of these three observations has the same value of the ID variable DATE, the SAS date value for the month and year of the forecast.

The three observations for each forecast period have different values of the variable _TYPE_. For the _TYPE_=FORECAST observation, the value of the variable SALES is the forecast value for the period indicated by the DATE value. For the _TYPE_=L95 observation, the value of the variable SALES is the lower limit of the 95% confidence interval for the forecast. For the _TYPE_=U95 observation, the value of the variable SALES is the upper limit of the 95% confidence interval.

You can control the types of observations written to the OUT= data set with the PROC FORECAST statement options OUTLIMIT, OUTRESID, OUTACTUAL, OUT1STEP, OUTSTD, OUTFULL, and OUTALL. For example, the OUTFULL option outputs the confidence limit values, the one-step-ahead predictions, and the actual data, in addition to the forecast values. See the sections "Syntax" and "OUT= Data Set" later in this chapter for more information.
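The long layout described above, with one observation per combination of ID value and _TYPE_, can be regrouped so that each forecast period carries its forecast and both limits together. The following Python sketch is only an illustration of that layout, not part of SAS: the rows, the date strings, and the function name `pivot_by_type` are hypothetical, and the numbers are made up.

```python
from collections import defaultdict

# Hypothetical rows mimicking the long layout of the OUT= data set:
# one row per (DATE, _TYPE_) pair, with the value stored in SALES.
# The dates and numbers are invented for illustration only.
rows = [
    {"DATE": "1991-08", "_TYPE_": "FORECAST", "SALES": 9.4},
    {"DATE": "1991-08", "_TYPE_": "L95", "SALES": 9.1},
    {"DATE": "1991-08", "_TYPE_": "U95", "SALES": 9.7},
    {"DATE": "1991-09", "_TYPE_": "FORECAST", "SALES": 9.5},
    {"DATE": "1991-09", "_TYPE_": "L95", "SALES": 9.0},
    {"DATE": "1991-09", "_TYPE_": "U95", "SALES": 10.0},
]


def pivot_by_type(rows, value_var="SALES", id_var="DATE"):
    """Group rows that share an ID value into one record keyed by _TYPE_."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[id_var]][row["_TYPE_"]] = row[value_var]
    return dict(wide)


wide = pivot_by_type(rows)
for date, values in wide.items():
    # Print lower limit, forecast, and upper limit side by side.
    print(date, values["L95"], values["FORECAST"], values["U95"])
```

Keeping the value in the original variable and distinguishing rows by _TYPE_ lets one data set hold any mix of result types; the regrouping shown here is just one convenient view for reading a single period at a time.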

Plotting Forecasts

The forecasts, confidence limits, and actual values can be plotted on the same graph with the GPLOT procedure. Use the appropriate output control options on the PROC FORECAST statement to include in the OUT= data set the series you want to plot. Use the _TYPE_ variable in the GPLOT procedure PLOT statement to separate the observations for the different plots.

In this example, the OUTFULL option is used, and the resulting output data set contains the actual and predicted values, as well as the upper and lower 95% confidence limits.

   proc forecast data=past interval=month lead=10
                 out=pred outfull;
      id date;
      var sales;
   run;

   proc gplot data=pred;
      plot sales * date = _type_ /
           haxis= '1jan90'd to '1jan93'd by qtr
           href='15jul91'd;
      symbol1 i=none v=star;       /* for _type_=ACTUAL   */
      symbol2 i=spline v=circle;   /* for _type_=FORECAST */
      symbol3 i=spline l=3;        /* for _type_=L95      */
      symbol4 i=spline l=3;        /* for _type_=U95      */
      where date >= '1jan90'd;
   run;

The _TYPE_ variable is used in the GPLOT procedure's PLOT statement to make separate plots over time for each type of value. A reference line marks the start of the forecast period. (Refer to SAS/GRAPH Software: Reference, Volume 2, Version 7, First Edition for more information on using PROC GPLOT.) The WHERE statement restricts the range of the actual data shown in the plot. In this example, the variable SALES has monthly data from July 1989 through July 1991, but only the data for 1990 and 1991 are shown in the plot.

The plot is shown in Figure 12.4.

Figure 12.4. Plot of Forecast with Confidence Limits

Plotting Residuals

You can plot the residuals from the forecasting model using PROC GPLOT and a WHERE statement.

1. Use the OUTRESID option or the OUTALL option in the PROC FORECAST statement to include the residuals in the output data set.

2. Use a WHERE statement to specify the observation type of 'RESIDUAL' in the PROC GPLOT code.

The following example adds the OUTRESID option to the preceding example and plots the residuals:

   proc forecast data=past interval=month lead=10
                 out=pred outfull outresid;
      id date;
      var sales;
   run;

   proc gplot data=pred;
      where _type_='RESIDUAL';
      plot sales * date /
           haxis= '1jan89'd to '1oct91'd by qtr;
      symbol1 i=circle;
   run;

The plot of residuals is shown in Figure 12.5.

Figure 12.5. Plot of Residuals

Model Parameters and Goodness-of-Fit Statistics

You can write the parameters of the forecasting models used, as well as statistics measuring how well the forecasting models fit the data, to an output SAS data set using the OUTEST= option. The options OUTFITSTATS, OUTESTTHEIL, and OUTESTALL control what goodness-of-fit statistics are added to the OUTEST= data set.

For example, the following statements add the OUTEST= and OUTFITSTATS options to the previous example to create the output statistics data set EST for the results of the default stepwise autoregressive forecasting method:

   proc forecast data=past interval=month lead=10
                 out=pred outfull outresid
                 outest=est outfitstats;
      id date;
      var sales;
   run;

   proc print data=est;
   run;

The PRINT procedure prints the OUTEST= data set, as shown in Figure 12.6.

Figure 12.6. The OUTEST= Data Set for STEPAR Method

In the OUTEST= data set, the DATE variable contains the ID value of the last observation in the data set used to fit the forecasting model. The variable SALES contains the statistic indicated by the value of the _TYPE_ variable. The _TYPE_=N, NRESID, and DF observations contain, respectively, the number of observations read from the data set, the number of nonmissing residuals used to compute the goodness-of-fit statistics, and the number of nonmissing observations minus the number of parameters used in the forecasting model.

The observation having _TYPE_=SIGMA contains the estimate of the standard deviation of the one-step prediction error computed from the residuals. The _TYPE_=CONSTANT and _TYPE_=LINEAR observations contain the coefficients of the time trend regression. The _TYPE_=AR1, AR2, ..., AR8 observations contain the estimated autoregressive parameters. A missing autoregressive parameter indicates that the autoregressive term at that lag was not included in the model by the stepwise model selection method. (See the section "STEPAR Method" later in this chapter for more information.)

The other observations in the OUTEST= data set contain various goodness-of-fit statistics that measure how well the forecasting model used fits the given data. See "OUTEST= Data Set" later in this chapter for details.

Controlling the Forecasting Method

The METHOD= option controls which forecasting method is used. The TREND= option controls the degree of the time trend model used. For example, the following statements produce forecasts of SALES as in the preceding example but use the double exponential smoothing method instead of the default STEPAR method:

   proc forecast data=past interval=month lead=10
                 method=expo trend=2
                 out=pred outfull outresid
                 outest=est outfitstats;
      var sales;
      id date;
   run;

   proc print data=est;
   run;

The PRINT procedure prints the OUTEST= data set for the EXPO method, as shown in Figure 12.7.

Figure 12.7. The OUTEST= Data Set for METHOD=EXPO

See the "Syntax" section later in this chapter for other options that control the forecasting method. See "Introduction to Forecasting Methods" and "Forecasting Methods" later in this chapter for an explanation of the different forecasting methods.

Introduction to Forecasting Methods

This section briefly introduces the forecasting methods used by the FORECAST procedure. Refer to textbooks on forecasting and see "Forecasting Methods" later in this chapter for more detailed discussions of forecasting methods.

The FORECAST procedure combines three basic models to fit time series:

- time trend models for long-term, deterministic change
- autoregressive models for short-term fluctuations
- seasonal models for regular seasonal fluctuations

Two approaches to time series modeling and forecasting are time trend models and time series methods.

Time Trend Models

Time trend models assume that there is some permanent deterministic pattern across time. These models are best suited to data that are not dominated by random fluctuations.

Examining a graphical plot of the time series you want to forecast is often very useful in choosing an appropriate model. The simplest case of a time trend model is one in which you assume the series is a constant plus purely random fluctuations that are independent from one time period to the next. Figure 12.8 shows how such a time series might look.

Figure 12.8. Time Series without Trend

The $x_t$ values are generated according to the equation

\[ x_t = b_0 + \epsilon_t \]

where $\epsilon_t$ is an independent, zero-mean, random error, and $b_0$ is the true series mean.

Suppose that the series exhibits growth over time, as shown in Figure 12.9.

Figure 12.9. Time Series with Linear Trend

A linear model is appropriate for these data. For the linear model, assume the $x_t$ values are generated according to the equation

\[ x_t = b_0 + b_1 t + \epsilon_t \]

The linear model has two parameters. The predicted values for the future are the points on the estimated line. The extension of the polynomial model to three parameters is the quadratic (which forms a parabola). This allows for a constantly changing slope, where the $x_t$ values are generated according to the equation

\[ x_t = b_0 + b_1 t + b_2 t^2 + \epsilon_t \]

PROC FORECAST can fit three types of time trend models: constant, linear, and quadratic. For other kinds of trend models, other SAS procedures can be used.

Exponential smoothing fits a time trend model using a smoothing scheme in which the weights decline geometrically as you go backward in time. The forecasts from exponential smoothing are a time trend, but the trend is based mostly on the recent observations instead of on all the observations equally. How well exponential smoothing works as a forecasting method depends on choosing a good smoothing weight for the series.

To specify the exponential smoothing method, use the METHOD=EXPO option. Single exponential smoothing produces forecasts with a constant trend (that is, no trend). Double exponential smoothing produces forecasts with a linear trend, and triple exponential smoothing produces a quadratic trend. Use the TREND= option with the METHOD=EXPO option to select single, double, or triple exponential smoothing.

The time trend model can be modified to account for regular seasonal fluctuations of the series about the trend. To capture seasonality, the trend model includes a seasonal parameter for each season. Seasonal models can be additive or multiplicative.

\[ x_t = b_0 + b_1 t + s(t) + \epsilon_t \quad \text{(Additive)} \]

\[ x_t = (b_0 + b_1 t)\, s(t) + \epsilon_t \quad \text{(Multiplicative)} \]

where $s(t)$ is the seasonal parameter for the season corresponding to time $t$.

The Winters method is similar to exponential smoothing, but includes seasonal factors. The Winters method can use either additive or multiplicative seasonal factors. Like exponential smoothing, good results with the Winters method depend on choosing good smoothing weights for the series to be forecast.

To specify the multiplicative or additive versions of the Winters method, use the METHOD=WINTERS or METHOD=ADDWINTERS options, respectively. To specify seasonal factors to include in the model, use the SEASONS= option.

Many observed time series do not behave like constant, linear, or quadratic time trends. However, you can partially compensate for the inadequacies of the trend models by fitting time series models to the departures from the time trend, as described in the following sections.

Time Series Methods

Time series models assume the future value of a variable to be a linear function of past values. If the model is a function of past values for a finite number of periods, it is an autoregressive model and is written as follows:

\[ x_t = a_0 + a_1 x_{t-1} + a_2 x_{t-2} + \ldots + a_p x_{t-p} + \epsilon_t \]

The coefficients $a_i$ are autoregressive parameters. One of the simplest cases of this model is the random walk, where the series dances around in purely random jumps. This is illustrated in Figure 12.10.
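Returning to the exponential smoothing scheme introduced earlier in this section, the geometric decline of the weights can be made concrete with a small Python sketch of single exponential smoothing. This is an illustration under stated assumptions, not PROC FORECAST's implementation: the function names are hypothetical, the recursion is started from the first observation (a simplification; the procedure has its own starting-value rules), and `weight` stands in for a smoothing weight.

```python
def exponential_smoothing(series, weight):
    """Single exponential smoothing: s_t = weight*x_t + (1 - weight)*s_{t-1}.

    The recursion starts at s_1 = x_1, an illustrative choice of starting value.
    The returned value is the flat (no-trend) forecast for every future period.
    """
    if not 0.0 < weight < 1.0:
        raise ValueError("weight must lie strictly between 0 and 1")
    smoothed = series[0]
    for x in series[1:]:
        smoothed = weight * x + (1.0 - weight) * smoothed
    return smoothed


def implied_weights(n, weight):
    """Weights the recursion places on x_n, x_{n-1}, ..., x_1 (most recent first).

    An observation k steps back gets weight*(1 - weight)**k; the oldest
    observation absorbs the remainder, (1 - weight)**(n - 1).
    """
    weights = [weight * (1.0 - weight) ** k for k in range(n - 1)]
    weights.append((1.0 - weight) ** (n - 1))
    return weights


series = [10.0, 12.0, 11.0, 13.0]
forecast = exponential_smoothing(series, weight=0.5)
weights = implied_weights(len(series), weight=0.5)

# The recursion and the explicit weighted sum give the same value.
weighted_sum = sum(w * x for w, x in zip(weights, reversed(series)))
print(forecast, weights, weighted_sum)
```

A larger weight makes the forecast track recent observations more closely, while a smaller weight averages over a longer history, which is why the choice of smoothing weight matters so much for this method.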

Figure 12.10. Random Walk Series

The $x_t$ values are generated by the equation

\[ x_t = x_{t-1} + \epsilon_t \]

In this type of model, the best forecast of a future value is the present value. However, with other autoregressive models, the best forecast is a weighted sum of recent values. Pure autoregressive forecasts always damp down to a constant (assuming the process is stationary).

Autoregressive time series models can also be used to predict seasonal fluctuations.

Combining Time Trend with Autoregressive Models

Trend models are suitable for capturing long-term behavior, whereas autoregressive models are more appropriate for capturing short-term fluctuations. One approach to forecasting is to combine a deterministic time trend model with an autoregressive model.

The stepwise autoregressive method (STEPAR method) combines a time-trend regression with an autoregressive model for departures from trend. The combined time-trend and autoregressive model is written as follows:

\[ x_t = b_0 + b_1 t + b_2 t^2 + u_t \]

\[ u_t = a_1 u_{t-1} + a_2 u_{t-2} + \ldots + a_p u_{t-p} + \epsilon_t \]

The autoregressive parameters included in the model for each series are selected by a stepwise regression procedure, so that autoregressive parameters are only included at those lags at which they are statistically significant.
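The forecasting behavior of pure autoregressive models described above (a random walk forecast stays at the present value, while a stationary autoregressive forecast damps down to a constant) can be checked with a short Python sketch. The function name `ar_forecast` and the coefficient values are hypothetical choices for illustration; the coefficients are taken as given rather than estimated.

```python
def ar_forecast(history, intercept, coefs, lead):
    """Iterate x_t = intercept + sum_i coefs[i] * x_{t-1-i} forward `lead` steps.

    Future errors are replaced by their mean of zero, so each forecast
    feeds into the next one.
    """
    p = len(coefs)
    if len(history) < p:
        raise ValueError("need at least as many observations as AR lags")
    values = list(history)
    forecasts = []
    for _ in range(lead):
        nxt = intercept + sum(a * values[-1 - i] for i, a in enumerate(coefs))
        values.append(nxt)
        forecasts.append(nxt)
    return forecasts


history = [4.0, 5.0, 7.0]

# Random walk: a_0 = 0 and a_1 = 1, so every forecast equals the last value.
random_walk = ar_forecast(history, intercept=0.0, coefs=[1.0], lead=3)

# Stationary AR(1) with a_1 = 0.5: forecasts approach a_0 / (1 - a_1) = 4.0.
damped = ar_forecast(history, intercept=2.0, coefs=[0.5], lead=50)

print(random_walk)
print(damped[:3], damped[-1])
```

The first few damped forecasts move halfway toward the long-run level at each step, which is the weighted-sum-of-recent-values behavior in its simplest form.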

The stepwise autoregressive method is fully automatic and, unlike the exponential smoothing and Winters methods, does not depend on choosing smoothing weights. However, the STEPAR method assumes that the long-term trend is stable; that is, the time trend regression is fit to the whole series with equal weights for the observations.

The stepwise autoregressive model is used when you specify the METHOD=STEPAR option or do not specify any METHOD= option. To select a constant, linear, or quadratic trend for the time-trend part of the model, use the TREND= option.
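The idea of combining a time trend with an autoregressive model for the departures from trend can be sketched in Python as a two-step fit. This is a deliberately simplified illustration and not the STEPAR algorithm: it assumes a linear trend, uses a single autoregressive lag, and performs no stepwise selection of lags. All function names are hypothetical.

```python
def fit_linear_trend(series):
    """Ordinary least squares fit of x_t = b0 + b1*t for t = 1..n."""
    n = len(series)
    times = range(1, n + 1)
    t_mean = sum(times) / n
    x_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in zip(times, series))
    b1 = sxy / sxx
    b0 = x_mean - b1 * t_mean
    return b0, b1


def fit_ar1(departures):
    """Least squares estimate of a1 in u_t = a1*u_{t-1} + e_t."""
    num = sum(u * u_prev for u_prev, u in zip(departures, departures[1:]))
    den = sum(u_prev ** 2 for u_prev in departures[:-1])
    return num / den if den else 0.0


def trend_plus_ar1_forecast(series, lead):
    """Extrapolate the fitted trend and add the decaying AR(1) departure."""
    b0, b1 = fit_linear_trend(series)
    n = len(series)
    departures = [x - (b0 + b1 * t) for t, x in enumerate(series, start=1)]
    a1 = fit_ar1(departures)
    forecasts = []
    u = departures[-1]
    for h in range(1, lead + 1):
        u = a1 * u  # expected departure h steps ahead
        forecasts.append(b0 + b1 * (n + h) + u)
    return forecasts


# An exactly linear series has zero departures, so the forecast is the line.
print(trend_plus_ar1_forecast([1.0, 3.0, 5.0, 7.0, 9.0], lead=2))
```

Because the trend is fit to the whole series with equal weights, a shift in the long-term trend partway through the data would distort both the fitted line and the departures, which is the stability assumption noted above.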

Syntax

The following statements are used with PROC FORECAST:

   PROC FORECAST options;
      BY variables;
      ID variables;
      VAR variables;

Functional Summary

The statements and options controlling the FORECAST procedure are summarized in the following table:

Description                                         Statement       Option

Statements
specify BY-group processing                         BY
identify observations                               ID
specify the variables to forecast                   VAR

Input Data Set Options
specify the input SAS data set                      PROC FORECAST   DATA=
specify frequency of the input time series          PROC FORECAST   INTERVAL=
specify increment between observations              PROC FORECAST   INTPER=
specify seasonality                                 PROC FORECAST   SEASONS=
specify number of periods in a season               PROC FORECAST   SINTPER=
treat zeros at beginning of series as missing       PROC FORECAST   ZEROMISS

Output Data Set Options
specify the number of periods ahead to forecast     PROC FORECAST   LEAD=
name output data set containing the forecasts       PROC FORECAST   OUT=
write actual values to the OUT= data set            PROC FORECAST   OUTACTUAL
write confidence limits to the OUT= data set        PROC FORECAST   OUTLIMIT
write residuals to the OUT= data set                PROC FORECAST   OUTRESID
write standard errors of the forecasts to the       PROC FORECAST   OUTSTD
  OUT= data set
write one-step-ahead predicted values to the        PROC FORECAST   OUT1STEP
  OUT= data set
write predicted, actual, and confidence limit       PROC FORECAST   OUTFULL
  values to the OUT= data set
write all available results to the OUT= data set    PROC FORECAST   OUTALL
specify significance level for confidence limits    PROC FORECAST   ALPHA=
control the alignment of SAS date values            PROC FORECAST   ALIGN=

Parameters and Statistics Output Data Set Options
write parameter estimates and goodness-of-fit       PROC FORECAST   OUTEST=
  statistics to an output data set
write additional statistics to OUTEST= data set     PROC FORECAST   OUTESTALL
write Theil statistics to OUTEST= data set          PROC FORECAST   OUTESTTHEIL
write forecast accuracy statistics to OUTEST=       PROC FORECAST   OUTFITSTATS
  data set

Forecasting Method Options
specify the forecasting method                      PROC FORECAST   METHOD=
specify degree of the time trend model              PROC FORECAST   TREND=
specify smoothing weights                           PROC FORECAST   WEIGHT=
specify order of the autoregressive model           PROC FORECAST   AR=
specify significance level for adding AR lags       PROC FORECAST   SLENTRY=
specify significance level for keeping AR lags      PROC FORECAST   SLSTAY=
start forecasting before the end of data            PROC FORECAST   START=
specify criterion for judging singularity           PROC FORECAST   SINGULAR=

Initializing Smoothed Values
specify number of beginning values to use in        PROC FORECAST   NSTART=
  calculating starting values
specify number of beginning values to use in        PROC FORECAST   NSSTART=
  calculating initial seasonal parameters
specify starting values for constant term           PROC FORECAST   ASTART=
specify starting values for linear trend            PROC FORECAST   BSTART=
specify starting values for the quadratic trend     PROC FORECAST   CSTART=

PROC FORECAST Statement

   PROC FORECAST options;

The following options can be specified in the PROC FORECAST statement:

ALIGN= option
   controls the alignment of SAS dates used to identify output observations. The ALIGN= option allows the following values: BEGINNING | BEG | B, MIDDLE | MID | M, and ENDING | END | E. BEGINNING is the default.

ALPHA= value
   specifies the significance level to use in computing the confidence limits of the forecast. The value of the ALPHA= option must be between .01 and .99. You should use only two digits for the ALPHA= option because PROC FORECAST rounds the value to the nearest percent (ALPHA=.101 is the same as ALPHA=.10). The default is ALPHA=.05, which produces 95% confidence limits.

AR= n
NLAGS= n
   specifies the maximum order of the autoregressive model. The AR= option is only valid for METHOD=STEPAR. The default value of n depends on the INTERVAL= option and on the number of observations in the DATA= data set. See "STEPAR Method" later in this chapter for details.

ASTART= value
ASTART= ( value ... )
   specifies starting values for the constant term for the exponential smoothing, Winters, and additive Winters methods. This option is ignored if METHOD=STEPAR. See "Starting Values for EXPO, WINTERS, and ADDWINTERS Methods" later in this chapter for details.

BSTART= value
BSTART= ( value ... )
   specifies starting values for the linear trend for the exponential smoothing, Winters, and additive Winters methods. This option is ignored if METHOD=STEPAR or TREND=1. See "Starting Values for EXPO, WINTERS, and ADDWINTERS Methods" later in this chapter for details.

CSTART= value
CSTART= ( value ... )
   specifies starting values for the quadratic trend for the exponential smoothing, Winters, and additive Winters methods. This option is ignored if METHOD=STEPAR or TREND=1 or 2. See "Starting Values for EXPO, WINTERS, and ADDWINTERS Methods" later in this chapter for details.

DATA= SAS-data-set
   names the SAS data set containing the input time series for the procedure to forecast. If the DATA= option is not specified, the most recently created SAS data set is used.

INTERVAL= interval
   specifies the frequency of the input time series. For example, if the input data set consists of quarterly observations, then INTERVAL=QTR should be used. See Chapter 3, "Date Intervals, Formats, and Functions," for more details on the intervals available.

INTPER= n
   when the INTERVAL= option is not used, INTPER= specifies an increment (other than 1) to use in generating the values of the ID variable for the forecast observations in the output data set.

LEAD= n
   specifies the number of periods ahead to forecast. The default is LEAD=12.

   The LEAD= value is relative to the last observation in the input data set and not to the end of a particular series. Thus, if a series has missing values at the end, the actual number of forecasts computed for that series will be greater than the LEAD= value.

METHOD= method-name
   specifies the method to use to model the series and generate the forecasts.

   METHOD=STEPAR       specifies the stepwise autoregressive method.
   METHOD=EXPO         specifies the exponential smoothing method.
   METHOD=WINTERS      specifies the Holt-Winters exponentially smoothed trend-seasonal method.
   METHOD=ADDWINTERS   specifies the additive seasonal factors variant of the Winters method.

   For more information, see the section "Forecasting Methods" later in this chapter. The default is METHOD=STEPAR.

NSTART= n
NSTART= MAX
   specifies the number of beginning values of the series to use in calculating starting values for the trend parameters in the exponential smoothing, Winters, and additive Winters methods. This option is ignored if METHOD=STEPAR.

   For METHOD=EXPO, n beginning values of the series are used in forming the
