An Animated Guide : Proc UCM (Unobserved Components Model)


NESUG 17 Analysis

Russ Lavery, Contractor for ASG, Inc.

ABSTRACT

This paper explores the underlying model and several of the features of Proc UCM, new in the Econometrics and Time Series (ETS) module of SAS. This procedure can be used by programmers in many fields, not just Econometrics. Time series data is generated by marketers as they monitor “sales by month” and by medical researchers who collect vital sign information over time. This technique is well suited to modeling the effect of interventions (drug administration or a change in a marketing plan). This new procedure combines the flexibility of Proc ARIMA with the ease of use and interpretability of Smoothing models. UCM does not yet have the capability to easily model transfer functions, a useful ARIMA function that is planned for Proc UCM.

INTRODUCTION

This paper explains the underlying model and several of the features of Proc UCM, new in the Econometrics and Time Series (ETS) module.

This procedure can be used by programmers in many fields, not just Econometrics. Time series data is generated by marketers as they monitor “sales by month” and by medical researchers who collect vital sign information over time. This technique is well suited to modeling the effect of interventions (drug administration or a change in a marketing plan). This new procedure combines the flexibility of Proc ARIMA with the ease of use and interpretability of Smoothing models.

THE MODEL AND NEW DEFINITIONS

One thing that makes UCM useful is its similarity to regression. A useful conceptual framework for UCM is that of a regression model ($Y = B_0 + B_1 X_1 + B_2 X_2 + \epsilon$) where the betas are allowed to be time varying. A major difference between data properly modeled with regression and data typically modeled by time series techniques is the presence of autocorrelation, or serial correlation. In “time series data”, observations close together tend to behave similarly.
If observation number n is above a fitted regression line, it is likely that observations n-1 and n+1 will also be above the regression line. This pattern of correlation between observations (and errors) breaks down as observations get farther apart in time. These characteristics suggest that a model for the data should place more “weight” or “importance” on “recent” observations and not give all observations in the data set equal importance. Proc ARIMA and Proc UCM both create models that are “local”; that is, they attribute more importance to “close” observations.

The model for UCM is:

$$Y_t = \mu_t + \gamma_t + \psi_t + r_t + \sum_i \phi_i Y_{t-i} + \sum_j \beta_j X_{jt} + \epsilon_t$$

where
$\mu_t$ is the trend,
$\gamma_t$ is the season,
$\psi_t$ is the cycle,
$r_t$ is the autoregressive term,
$\sum_i \phi_i Y_{t-i}$ is a set of regression terms involving lagged dependent variables,
$\sum_j \beta_j X_{jt}$ is a set of regression terms on independent variables, and
$\epsilon_t$ is the error term.

The model components $\mu_t$, $\gamma_t$, $\psi_t$ and $r_t$ are assumed to be independent of each other and model underlying “drivers” of the time series.
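To make the regression analogy concrete, here is a sketch of one special case. It assumes the season, cycle, autoregressive and lagged-dependent terms are left out, and that the trend is the fixed straight line $\mu_t = \mu_0 + \beta t$ that the paper reaches later when both trend variances are set to zero. Matching $\mu_0$ to $B_0$ and $\beta$ to $B_1$ follows the paper's own remark that trend corresponds to B0 and B1.

```latex
\begin{align*}
Y_t &= \mu_t + \sum_j \beta_j X_{jt} + \epsilon_t
      && \text{(no season, cycle, autoregressive or lagged terms)} \\
    &= \mu_0 + \beta\, t + \sum_j \beta_j X_{jt} + \epsilon_t
      && \text{(fixed trend } \mu_t = \mu_0 + \beta t\text{)}
\end{align*}
```

This has the same shape as an ordinary regression with time as one of the regressors. The full UCM relaxes it by letting the level and the slope drift from period to period.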

The terms of the model are described below.

$Y_t$ (dependent variable)

$\mu_t$ (trend)
Trend is implemented through the combination of level and slope statements, and their options. A UCM with just a level statement models a time series with 0 slope. A UCM with just a slope statement gives an error.
Trend is the natural tendency of a series in the absence of seasonality, cycles or the effect of any independent variables. In UCM, this is a mean and a slope, so it corresponds to B0 and B1 in regression. Trend is modeled in two ways and its relationship to B0 and B1 can be seen below.
One method is a random walk: $\mu_t = \mu_{t-1} + \eta_t$ (where $\eta_t$ is an IID error term).
The second method is a locally linear trend with a slope that varies, only, with time: $\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t$ (where $\eta_t$ is i.i.d. $N(0, \sigma^2_\eta)$).
As beta goes forward, it can vary with time as $\beta_t = \beta_{t-1} + \zeta_t$ (where $\zeta_t$ is i.i.d. $N(0, \sigma^2_\zeta)$).

$\gamma_t$ (season)
Season is implemented through the season statement and its options.
Season is the effect of seasonal effects and does not imply a yearly period to the season. The main characteristic of seasonality is that its period (the time it takes to get through one full cycle) is known. The effects of seasonality sum to zero over the cycle. Seasonality is modeled in two ways.
One method is a dummy variable method: $\sum_{i=0}^{s-1} \gamma_{t-i} = \omega_t$, where $s$ is the length of the season and $\omega_t$ is i.i.d. $N(0, \sigma^2_\omega)$.
The second method uses a trigonometric form, and seasonality is the sum of different cycles.
Proc UCM allows blocking of cycles, or specifying cycles within cycles. The need for this can occur in many instances. One example is admissions at an Emergency Room. There is a weekly cycle, where Monday admissions are low and Saturday admissions are high. There is also a daily cycle that starts slow in early AM and has early PM and evening peaks. These cycles of admission nest and produce very high admissions on Saturday evening.

$\psi_t$ (cycle)
Cycle is important.
Cycles are like seasons, but with an unknown period. They are not often used in their “pure form”, but are employed as building blocks. Cycle effects are similar to seasonal effects, but the period is not known and is determined from the data. A periodic pattern, no matter how complex, can be expressed as a sum of cycles. UCM has implemented cycles as having fixed periods but time varying amplitude and phase.

$r_t$ (autoregressive term)
UCM considers an autoregressive term as a cycle where frequency is either 0 or $\pi$. The expression for UCM autoregression is: $r_t = \rho\, r_{t-1} + \upsilon_t$ (where $\upsilon_t$ is i.i.d. $N(0, \sigma^2_\upsilon)$).

$\sum_i \phi_i Y_{t-i}$ (regression terms involving lagged dependent variables) and $\sum_j \beta_j X_{jt}$ (regression terms on independent variables)
These two terms allow the programmer/statistician great flexibility in describing the process under study. $\sum_j \beta_j X_{jt}$ allows the determination of effects of outside intervention and supports dummy variable and continuous variable coding. These terms can be used to model the effect of investigator interventions like drug administration or a change in a marketing plan.

$\epsilon_t$ (irregular term or error term)
$\epsilon_t$ is i.i.d. $N(0, \sigma^2_\epsilon)$.

The programmer/statistician can create a great many types of time series by adding and deleting components from the model as well as changing options associated with statements in the model. Some knowledge of this is required because the determination of the best model will involve a process that is similar to the stepwise removal process in regression. While a parsimonious model is the goal of any modeling project, there is no general agreement in the literature on how this is best to be done. This paper, not in conflict with the literature but perhaps foolishly, makes an attempt to simplify a model.

UCM output parameters are different from regression and this has an impact on how UCM is used. Proc UCM can interpolate missing/new values of Y within the time span of the estimating data set. It can also forecast future values of Y. UCM produces two tables that show components of the model and their associated P values. Below, please find some rules on interpreting these P values and how the interpretation can be used to purify the model.

1) Variances of the disturbance terms of the unobserved components: if not significant, the term is not time varying and should be made deterministic.
2) Damping coefficients and frequency of cycles: if not significant, the term is not contributing to the model and should be removed.
3) Damping coefficient of autoregression terms: if not significant, the term is not contributing to the model and should be removed.
4) Regression coefficient of regression terms: if not significant, the term is not contributing to the model and should be removed.

UCM allows the programmer/statistician to set the above parameters to a specific value. This is important because the stepwise process of improving the model involves: 1) removing statements from the model to remove an underlying process from the model and/or 2) setting variance parameters to zero to change the associated underlying process from time varying to fixed.

As an example of changing the form of the model by setting parameters to be fixed at zero, examine the sub models below that are associated with trend. Trend, we should remember, is only one component of the model.

The common condition of the UCM, that of a locally linear trend, is implied in the two equations below.

$\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t$ (where $\eta_t$ is i.i.d. $N(0, \sigma^2_\eta)$)
Interpret this as: the mean of the current period = last period's mean + the effect of 1 period of time + a random term.

$\beta_t = \beta_{t-1} + \zeta_t$ (where $\zeta_t$ is i.i.d. $N(0, \sigma^2_\zeta)$)
Interpret this as: the slope changes randomly. The change is the effect of “time” and not an independent variable.

If $\sigma^2_\zeta = 0$:
$\beta_t = \beta_{t-1}$, or $\beta$ is a constant. This transforms the above equations to just one: $\mu_t = \mu_{t-1} + \beta + \eta_t$. This is called a linear trend with fixed slope model.

If $\sigma^2_\eta = 0$:
This transforms the above equations to $\mu_t = \mu_{t-1} + \beta_{t-1} + 0$ (like the model above but with one less error term) and $\beta_t = \beta_{t-1} + \zeta_t$ (where $\zeta_t$ is i.i.d. $N(0, \sigma^2_\zeta)$). This often produces a smoother trend than the original two equation UCM model.

If both $\sigma^2_\zeta = 0$ and $\sigma^2_\eta = 0$:
$\beta$ is a constant and there is no error term in the trend component. The trend is no longer random and is modeled as: $\mu_t = \mu_0 + \beta t$.

PROJECT 1

To demonstrate Proc UCM, a dataset was created by data step programming. Components of the UCM model were calculated individually in a data step and summed to get the total sales, shown below. The program is attached to the article. The task was to use Proc UCM to model this dataset.

To the right is a plot of total sales for the hypothetical company. The company makes a very high tech kind of eyeglasses. These glasses are appropriate for high altitude hiking, where UV is strong and where there is a likelihood of reflection off the ground. The imaginary glasses are sold to high altitude hikers and to construction workers, and others, who work in areas of high ground/water reflectivity.

There was a yearlong recession from month 12 to 24 (see vertical lines) and there was a different effect on the retail vs commercial sales. Both were affected, but retail sales were affected more due to a large reduction in travel vacations (to high altitude spots). Commercial was affected, but not as much.

During the recession, the company decided to expand internationally and started shipping to the southern hemisphere (the company calls this volume “Sales to Antarctica” though shipments go to New Zealand, Peru and other destinations). Since the winter/summer seasons are reversed, the seasonality of total sales was changed. International sales started the same month that the recession ended.

The figure to the right shows how the three cycles (retail, construction/commercial and Antarctic) are out of phase and how they add to the total cycle component for the data set.

OUR HYPOTHETICAL MANAGEMENT ISSUE IS TO DETERMINE THE EFFECT OF THE RECESSION AND OF THE STRATEGIC DECISION TO “GO INTERNATIONAL”.

As American sales started to tail off, the international sales (New Zealand, Australia, Peru) started up and not only drove the total sales up, but extended the selling “season”.
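A minimal sketch of this kind of construction follows. It is not the author's attached program. Every number in it (trend slope, amplitudes, recession effect, noise scale, the 60-month span, the start date) and the component variable names are invented for illustration. Only the recession window of months 12 to 24, the start of international sales in the month the recession ends, and the names idmonth, tot_sales, Rcsn_dv and Int_DV are taken from the paper.

```sas
/* Hypothetical sketch: build a monthly sales series as a sum of parts. */
/* All numeric values below are made up for illustration.               */
data for_ucm_sketch;
   call streaminit(17);                  /* fixed seed so reruns match   */
   pi = constant('pi');
   do t = 1 to 60;
      idmonth = intnx('month', '01JAN2000'd, t - 1);  /* assumed start   */

      Rcsn_dv = (12 <= t <= 24);         /* 1 during the recession       */
      Int_DV  = (t >= 24);               /* 1 once international begins  */

      trend      = 100 + 0.5 * t;                      /* linear trend   */
      retail     = 4 * sin(2 * pi * t / 12);           /* yearly cycle   */
      commercial = 2 * sin(2 * pi * (t - 2) / 12);     /* shifted phase  */
      antarctic  = Int_DV * 3 * sin(2 * pi * (t - 6) / 12); /* reversed  */
      noise      = rand('normal', 0, 1);               /* irregular term */

      /* total sales is the sum of the separately computed components   */
      tot_sales = trend + retail + commercial + antarctic
                  - 5 * Rcsn_dv + noise;
      output;
   end;
   format idmonth monyy7.;
   drop pi;
run;
```

Keeping each component as its own variable makes it possible to compare what Proc UCM recovers against what was put into the series.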

This chart shows the trends of the components of the model.

The blue line shows the calculated trend in the international sales. There were no international sales until the recession forced the company to a strategic change.

The red line shows that the recession made retail sales flat during the recession.

The green line, the commercial trend, shows that the people who used the glasses at work continued to buy through the recession, though at a lower rate.

To the right is a plot of the summed cycle components that were calculated by data step programming.

The business challenge was to remove this noise and to recover the effect of the recession and of going international.

The code submitted, matched by some explanation of the commands, is shown below:

   /* Printall turns on all printing options for the procedure. */
   Proc UCM data=for_ucm
      PRINTALL ;
      /* ID specifies a variable to be used as an identifier. */
      id idmonth interval=month;
      /* The model statement says Y is total sales and is to be
         explained by a time series and two independent variables. */
      model tot_sales = Rcsn_dv Int_DV;
      /* Irregular instructs SAS to include the error term
         (irregular term) epsilon_t in the model. */
      irregular ;
      /* Level and slope, with no options, combine to tell Proc UCM
         to model with time varying slope and mean. */
      level ;
      slope;
      /* Cycle says include a cyclical component. This behavior is complex. */
      cycle;
      /* The season statement and options instruct SAS to look for a
         12-month cycle and not to use the dummy variable coding. The
         observed cyclical behavior seems to be too irregular for dummy
         variable coding. */
      SEASON LENGTH=12
             TYPE=TRIG ;
      /* Deplag lags=1 instructs SAS to include phi_1 * Y(t-1) in the
         model. phi_1 will be estimated from the model. */
      deplag lags=1;
      /* The estimate command tells SAS to estimate parameters and put
         them in a file called UCM_ESTIMATES. */
      estimate
         OUTEST=UCM_ESTIMATES;
      /* The forecast command tells SAS to forecast values for 6 periods,
         to calculate the components of the model and put them in a file
         called UCM_FORECASTS. */
      forecast lead=6
         print=decomp
         OUTFOR=UCM_FORECASTS ;
   run;
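The trend sub-models described earlier can be requested by fixing variances on the LEVEL and SLOPE statements. The sketch below shows the fully deterministic trend, $\mu_t = \mu_0 + \beta t$. It assumes that VARIANCE=0 with NOEST behaves on the SLOPE statement as the paper describes for the LEVEL statement. The cycle, lag, ESTIMATE and FORECAST statements are left out to keep the example short. The data set and variable names are the ones used in the paper's program.

```sas
/* Sketch only: a model with a fully deterministic trend.             */
proc ucm data=for_ucm;
   id idmonth interval=month;
   model tot_sales = Rcsn_dv Int_DV;
   irregular;
   /* level disturbance variance fixed at 0: no random level shifts   */
   level variance=0 noest;
   /* slope disturbance variance fixed at 0: the slope is constant    */
   slope variance=0 noest;
   season length=12 type=trig;
run;

/* Variations (change only the two trend statements):                 */
/*   level;                  slope variance=0 noest;  fixed slope     */
/*   level variance=0 noest; slope;                   smoother trend  */
/*   level;                  slope;                   locally linear  */
```

Changing one statement at a time keeps each run comparable with the one before it, which suits the stepwise simplification the paper goes on to describe.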

The procedure outputs several sub-tables that will be described and not included in this paper.

First, UCM prints some summary statistics on the data that was used for creation of the model. This includes min, max, mean, date of first obs. and date of last obs. This information is useful in data checking and is a valuable QC feature.

Second, UCM prints some summary statistics on the data that was used for estimation. This includes min, max, mean, date of first obs. and date of last obs. One method for forecasting future observations is to put them in the input data set with no Y values. Additionally, programmers/statisticians can check to see how well the model is performing by using options to tell SAS to not use the last n observations in the data set (which do have Y values) in creating the model. This allows the model to forecast these time periods and the programmer/statistician to check forecasted vs actual values. This information is useful in model checking.

Included tables are:

The two tables that must be examined as part of the model selection (stepwise process) are shown below. They are the “Final Estimates of the Free Parameters” table and the “Significance Analysis of Components (Based on the Final State)” table.

The parameters of the models, as reported above, are a mixture of variance components and regression-like parameter estimates (Rcsn_dv, Int_DV and DepLag). Variance parameters show up in both tables. Regression-like parameters only show up in one. These different types of parameters have different uses. While the literature has not shown agreement on a procedure for creating a parsimonious model, stepwise logic has not been judged incorrect and can produce a model that predicts and is interesting.

For regression-like parameters: If there are insignificant p-values, the variables should be eliminated, one variable at a time, in a stepwise fashion. The worst performing variable should be eliminated first.

For variance type parameters: Creating a parsimonious model involves two steps. The first step is to decide if the component of the model is time varying. The second step is to determine if it is contributing. The starting assumption is that the components (irregular, slope, cycle and season) are both time varying and significant. This model shows indications that these assumptions are not true (see bold below).

A component can be significant but not time varying. This means that the non-stochastic part of the component could be left in the model, but as a deterministic contributor to Y (like parameters in a regression). A component of the model cannot be both time varying and “not significant”. A rough outline of the process for making the time varying parts of the model parsimonious is:

1) Is the variance estimate a significant effect?
   No: This is not a random effect, but this parameter might have a deterministic effect. Go to 2).
   Yes: Looks like a random effect might exist for this component. Go to 3).
2) Is the component significant?
   No: Remove from model. TWO strikes and you're OUT!
   Yes: Keep in the model as deterministic! Set Variance=0 and NoEst options.
3) Is the component estimate a significant effect?
   This is a very complex area and beyond the scope of this paper. Theoretical knowledge, from other research, often plays a large part here.

The output from our first model is:

Final Estimates of the Free Parameters

Component   Parameter        Estimate      Approx Std Error   t Value   Approx Pr > |t|
Irregular   Error Variance   0.00001323    8.60115E-6         1.54      0.1241
Level       Error Variance   6.50277E-12   --                 0.00      0.9993
Slope       Error Variance   --            --                 3.42      0.0006
Season      Error Variance   --            --                 4.22      <.0001
Cycle       Damping Factor   --            --                 23.07     <.0001
Cycle       Period           --            --                 7.47      <.0001
Cycle       Error Variance   --            --                 1.15      0.2518
Rcsn_dv     Coefficient      3.55172       0.04018            88.40     <.0001
Int_DV      Coefficient      5.67481       0.06373            89.04     <.0001
DEPLAG      PHI_1            0.34053       0.02848            11.96     <.0001

Significance Analysis of Components (Based on the Final State)
Full Model

Component   DF   Chi-Square   Pr > ChiSq
Irregular   --   --           0.9868
Level       --   37311.7      <.0001
Slope       --   4073.99      <.0001
Cycle       --   --           0.6241
Season      --   --           <.0001

Step two would be to remove level (modify the model for the highest P value in the Free Parameters table) as a time varying component of the model by adding the options Variance=0 and NoEst to the level statement. These options tell Proc UCM to start the model with a variance estimate equal to zero, and not to attempt to estimate a better value (fix the value at zero). The code for step two is below.

   /* Printall turns on all printing options for the procedure. */
   Proc UCM data=for_ucm
      PRINTALL ;
      /* ID specifies a variable to be used as an identifier. */
      id idmonth
         interval=month;
      /* The model statement says Y is total sales and is to be
         explained by a time series and two independent variables. */
      model tot_sales = Rcsn_dv Int_DV;
      /* Irregular instructs SAS to include the error term
         (irregular term) epsilon_t in the model. */
      irregular ;
      /* The variance of this time varying component, level, has been
         assigned a starting estimate of 0 with the NOEST option, to
         make level NOT time variant. NoEst tells SAS not to try to
         estimate the variance from the data. */
      level variance=0 Noest ;
      /* Slope can still be time varying. */
      slope;
      /* Cycle says include a cyclical component. */
      cycle;
      /* The season statement and options instruct SAS to look for a
         12-month cycle and not to use the dummy variable coding. The
         observed cyclical behavior seems to be too irregular for dummy
         variable coding. */
      SEASON LENGTH=12
             TYPE=TRIG ;
      /* Deplag lags=1 instructs SAS to include phi_1 * Y(t-1) in the
         model. phi_1 will be estimated from the model. */
      Deplag lags=1;
      /* The estimate command tells SAS to estimate parameters and put
         them in a file called UCM_ESTIMATES. */
      estimate OUTEST=UCM_ESTIMATES;
      /* The forecast command tells SAS to forecast values for 6 periods,
         to calculate the components of the model and put them in a file
         called UCM_FORECASTS. */
      forecast lead=6
         print=decomp
         OUTFOR=UCM_FORECASTS ;
   run;

The SAS output is below:

Final Estimates of the Free Parameters

Component   Parameter        Estimate      Approx Std Error   t Value   Approx Pr > |t|
Irregular   Error Variance   0.00001323    --                 --        --
Slope       Error Variance   0.00000483    --                 --        --
Season      Error Variance   0.00001169    --                 --        <.0001
Cycle       Damping Factor   0.93881       0.04070            23.07     <.0001
Cycle       Period           16.60643      2.22861            7.45      <.0001
Cycle       Error Variance   7.006937E-7   6.11598E-7         1.15      0.2519
Rcsn_dv     Coefficient      3.55172       0.04018            88.40     <.0001
Int_DV      Coefficient      5.67481       0.06373            89.04     <.0001
DEPLAG      PHI_1            0.34053       0.02848            11.96     <.0001

Significance Analysis of Components (Based on the Final State)
Model With Level variance=0

Component   DF   Chi-Square   Pr > ChiSq
Irregular   1    0.00         0.9868
Level       1    37311.7      <.0001
Slope       1    4073.99      <.0001
Cycle       2    0.94         0.6241
Season      11   113948       <.0001

Step three would be to remove irregular (modify the model for the highest P value in the Free Parameters table) as a time varying component of the model by adding the options Variance=0 and NoEst (variance) to the cycle statement. It does not seem to be significant but, alternatively, it might be mis-specified.

The logic is that a mis-specified variable ca
