DIAGNOSTIC TESTING AND EVALUATION OF MAXIMUM LIKELIHOOD MODELS


Journal of Econometrics 30 (1985) 415-443. North-Holland

DIAGNOSTIC TESTING AND EVALUATION OF MAXIMUM LIKELIHOOD MODELS*

George TAUCHEN
Duke University, Durham, NC 27706, USA

The paper develops a unified theory of maximum likelihood specification testing based on M-estimators of auxiliary parameters. The theory is sufficiently general to encompass a wide class of specification tests including moment-based tests, Pearson-type goodness of fit tests, the information matrix test, and the Cox test. The paper also presents a framework based on Frechet differentiation for determining the effects of misspecification on the almost sure limits of parameter estimates and specification test statistics.

1. Introduction

This paper develops the asymptotic distribution theory for a class of specification tests for the non-linear maximum likelihood model. The ideas that motivate consideration of this class of specification tests have their origins in Hausman's (1978) paper. Hausman suggested that in general, i.e., not only for the ML model, a useful specification test can be based upon the difference between two estimates of the vector of parameters of interest. This idea, however, is somewhat difficult to apply in a multivariate context when the likelihood function depends upon the parameters in a highly non-linear fashion. The difficulty lies in finding a computationally tractable form for the second 'specification-robust' estimate of the parameter vector that is required to implement Hausman's test. White (1982) suggests a different but related approach. Specifically, White derives a test that is based not upon the difference between two estimates of the parameters of direct interest, but instead upon the difference between the two natural estimates of the expected information matrix.
This paper extends White's work further by deriving the asymptotic properties of an entire class of specification tests that includes as a special case the information matrix test, and other specification tests, e.g., the Cox test [Aguirre-Torres and Gallant (1983)] and the Lagrange multiplier test [Engle (1982)]. The asymptotic theory developed here is sufficiently general to include in the class of allowable tests those that are based upon non-differentiable and even discontinuous functions of the data and the parameter vector. In particular, the class of tests includes Pearson-type goodness of fit tests with random cell boundaries [Moore and Spruill (1975)].

*Helpful comments were obtained from Ronald Gallant, James Heckman, James MacKinnon, Richard Robb, Robin Sickles, Donald Waldman, Dudley Wallace, Adonis Yatchew, and two referees. Earlier versions of this paper were presented at seminars at the University of Chicago, the University of Pennsylvania, Queen's University, the University of Toronto, the Triangle Area Econometrics Seminar, and the 1984 Austin Conference on Model Selection.

0304-4076/85/$3.30 © 1985, Elsevier Science Publishers B.V. (North-Holland)

This paper also develops a framework based on Frechet differentiation for characterizing the non-null behavior of these various specification tests. Within this framework, 'directions' of misspecification are identified against which the various specification tests can be expected to have maximum or minimum power.

Before describing the class of specification tests in more detail, it is helpful to review briefly the asymptotic distribution theory of the quasi-maximum likelihood estimator (the ML estimator with an incorrect likelihood function). Assume the observed data $Y_1, Y_2, \ldots, Y_n$ are mutually independent and identically distributed $m \times 1$ random vectors with common unknown distribution function $G$ and density function $g$, both defined on $R^m$. Let $\{F(y, \theta): y \in R^m, \theta \in \Theta \subset R^p\}$ be a family of distribution functions on $R^m$ that is the basis for the estimation. For each fixed parameter vector $\theta$ the function $F(y, \theta)$ is a probability distribution on $R^m$ with density function denoted by $f(y, \theta)$. Together the elements of the family of distribution functions $\{F(y, \theta)\}$, or equivalently the family of density functions $\{f(y, \theta)\}$, comprise a probability model for the observed data. The quasi-maximum likelihood estimator $\hat\theta_n$ is the value of the parameter that maximizes the sample quasi-loglikelihood function

$$L_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} l(Y_i, \theta), \tag{1}$$

where $l(y, \theta) = \log(f(y, \theta))$ is the log-density function.
Burguete, Gallant and Souza (1982), Huber (1967) and White (1982) have shown that under a variety of regularity conditions the QML estimator $\hat\theta_n$ converges almost surely to the value $\bar\theta$ at which the expected log-density function,

$$L(\theta) = E[l(Y, \theta)] = \int l(y, \theta)\,dG(y), \tag{2}$$

achieves its maximum. Now, if the underlying model is correctly specified, then there exists a $\theta_0$ such that the density $f(y, \theta_0)$ is a version of the true density $g(y)$. In this case the maximizing $\bar\theta$ for $L$ in (2) equals $\theta_0$ and $\sqrt{n}(\hat\theta_n - \theta_0)$ is asymptotically normally distributed with mean zero and variance-covariance matrix equal to the inverse of the information matrix. On the other hand, if the model is misspecified, then of course no such $\theta_0$ exists; but the maximizing $\bar\theta$ for the expected quasi-loglikelihood function still exists and $\sqrt{n}(\hat\theta_n - \bar\theta)$ has a well-defined asymptotic distribution. One interpretation for $\bar\theta$ is that it is the 'true' parameter value that is induced directly by the estimation procedure itself.

The class of specification tests considered in this paper consists of those tests based on the magnitude of the statistic

$$\hat\tau_n = \frac{1}{n}\sum_{i=1}^{n} c(Y_i, \hat\theta_n), \tag{3}$$

where $\hat\theta_n$ is the QML estimator and where the vector-valued function $c$ satisfies

$$\int c(y, \theta)\,dF(y, \theta) = 0 \tag{4}$$

for all $\theta$. The condition (4) says that the function $c(y, \theta)$ has mean zero with respect to each distribution function in the probability model. A function that satisfies this condition will be called an auxiliary criterion function. As will become clearer below, for any given family of distribution functions $\{F(y, \theta)\}$ there are many auxiliary criterion functions. In practice, the better auxiliary criterion functions will be those for which the magnitudes of the elements of the vector $\hat\tau_n$ in (3) provide useful diagnostic information about the specification of the model. A strategy for getting an informative $\hat\tau_n$ is to construct the auxiliary criterion function in such a way that the components of $\hat\tau_n$ equal the differences between two estimates of some statistical quantities of interest.

The statistic $\hat\tau_n$ is useful for specification testing because it converges almost surely to zero when the model is correctly specified and it converges to a non-zero quantity when the model is incorrectly specified. This result is proved in section 2, but it is intuitively clear from inspection of the expressions (3) and (4). In the former case, when $\theta_0$ exists, $\hat\tau_n$ converges almost surely to

$$\int c(y, \theta_0)\,dF(y, \theta_0),$$

which is zero by construction of the auxiliary criterion function. In the latter case, $\hat\tau_n$ converges almost surely to

$$\int c(y, \bar\theta)\,dG(y),$$

which in general is non-zero. As shown in section 3, the statistic $\hat\tau_n$ also has a well defined asymptotic distribution in either case. Its asymptotic variance-covariance matrix can be expressed as the sum of two parts, one of which corresponds to the variability in $(1/n)\sum_i c(Y_i, \bar\theta)$ about $\bar\tau$ and the other to the variability in $\hat\theta_n$ about $\bar\theta$.
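The two limits of the statistic can be seen in a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes an exponential probability model $f(y, \theta) = \theta e^{-\theta y}$, the criterion $c(y, \theta) = y^2 - 2/\theta^2$ (mean zero under every member of that model), and hypothetical helper names, sample sizes and seeds. With exponential data the statistic should be near zero; with Gamma(2, 1) data the pseudo-true value is $1/E[y] = 0.5$ and the limit is $E[y^2] - 2(E[y])^2 = 6 - 8 = -2$.

```python
import random


def second_moment_statistic(sample):
    """Return (theta_hat, tau_hat) for an assumed exponential model.

    Model (illustrative assumption): f(y, theta) = theta * exp(-theta * y).
    Criterion: c(y, theta) = y**2 - 2 / theta**2, which has mean zero under
    every member of the model because E[y**2] = 2 / theta**2.
    """
    n = len(sample)
    theta_hat = n / sum(sample)  # closed-form (quasi-)ML estimator
    tau_hat = sum(y * y - 2.0 / theta_hat ** 2 for y in sample) / n
    return theta_hat, tau_hat


def simulate(draw, n, seed):
    """Draw n observations using a seeded generator (hypothetical helper)."""
    rng = random.Random(seed)
    return [draw(rng) for _ in range(n)]


if __name__ == "__main__":
    n = 50_000
    correct = simulate(lambda rng: rng.expovariate(1.0), n, seed=1)
    wrong = simulate(lambda rng: rng.gammavariate(2.0, 1.0), n, seed=2)
    print("correctly specified:", second_moment_statistic(correct))
    print("misspecified:       ", second_moment_statistic(wrong))
```

Whether a given non-zero value is statistically large is a separate question, which requires the asymptotic variance discussed in the text.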

The following three examples help to illustrate the practical applications of the general results in this paper:

Example 1 (low-order moments)

For simplicity in exposition take $y$ as scalar though $\theta$ may be multi-dimensional. Define

$$\mu_j(\theta) = \int y^j\,dF(y, \theta)$$

for integer $j$. Thus $\mu_j(\hat\theta_n)$ is the predicted $j$th non-central moment from the estimated probability model. Let $j$ be fixed at some integer and define

$$c(y, \theta) = y^j - \mu_j(\theta).$$

This function is a legitimate auxiliary criterion function since it satisfies the condition (4). Moreover, the statistic

$$\hat\tau_n = \frac{1}{n}\sum_{i=1}^{n} Y_i^j - \mu_j(\hat\theta_n)$$

is simply the difference between the sample $j$th non-central moment and the predicted moment from the probability model. A large value for $|\hat\tau_n|$ would tend to indicate that the probability model does a poor job of 'matching' the $j$th moment of the distribution of the data. As shown in section 5 of this paper, there is a regression-based procedure for testing whether the magnitude $|\hat\tau_n|$ is too large to be accounted for by sampling fluctuations: One regresses the values $\hat c_i = c(Y_i, \hat\theta_n)$ on the scores $\hat h_i = \partial l(Y_i, \hat\theta_n)/\partial\theta$ and performs a $t$ test for a non-zero intercept. If the $t$ statistic is large from a statistical point of view the model may need to be reformulated or else an explanation given as to why the difference in moments is too small to be of practical importance.

Of course in some cases the estimation procedure may force some of the sample and predicted moments to be equal and no such test is possible. For instance, if the underlying model is the univariate normal distribution, then the first two sample and predicted moments must be equal. Diagnostic tests in this case would then have to be based on moments higher than the second.
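The regression-based procedure of Example 1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's section 5 algorithm in full: the probability model is taken to be exponential, $f(y, \theta) = \theta e^{-\theta y}$, so that the score is $1/\theta - y$ and the second-moment criterion is $c(y, \theta) = y^2 - 2/\theta^2$; the function names, sample sizes and seeds are hypothetical. Because $\theta$ is scalar here, the regression of $\hat c_i$ on an intercept and $\hat h_i$ has a closed form.

```python
import math
import random


def intercept_t_stat(c, h):
    """OLS of c on (1, h); return (intercept, t statistic for intercept = 0)."""
    n = len(c)
    h_bar = sum(h) / n
    c_bar = sum(c) / n
    s_hh = sum((x - h_bar) ** 2 for x in h)
    s_hc = sum((x - h_bar) * (v - c_bar) for x, v in zip(h, c))
    slope = s_hc / s_hh
    intercept = c_bar - slope * h_bar
    ssr = sum((v - intercept - slope * x) ** 2 for x, v in zip(h, c))
    resid_var = ssr / (n - 2)
    # Standard error of the intercept in a one-regressor OLS fit.
    se = math.sqrt(resid_var * (1.0 / n + h_bar ** 2 / s_hh))
    return intercept, intercept / se


def moment_test(sample):
    """Second-moment diagnostic for an assumed exponential model."""
    n = len(sample)
    theta_hat = n / sum(sample)                            # ML estimate
    c = [y * y - 2.0 / theta_hat ** 2 for y in sample]     # criterion values
    h = [1.0 / theta_hat - y for y in sample]              # scores at theta_hat
    return intercept_t_stat(c, h)


if __name__ == "__main__":
    rng = random.Random(11)
    print("exponential data:", moment_test([rng.expovariate(1.0) for _ in range(20_000)]))
    rng = random.Random(12)
    print("gamma(2, 1) data:", moment_test([rng.gammavariate(2.0, 1.0) for _ in range(20_000)]))
```

Since the scores sum to zero at the ML estimate, the fitted intercept coincides with the statistic itself, and the $t$ ratio scales it by a standard error that allows for the estimation of $\theta$.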
For the asymptotic theory to provide a good approximation to the distribution of $\hat\tau_n$, the order of the moments above two should be kept reasonably small.

The extension of this to other unconditional moments is straightforward. For central moments in the scalar case let the auxiliary criterion function be $[y - \mu_1(\theta)]^j$ minus the expected value of this quantity with respect to $F(y, \theta)$. For central moments in the multivariate case, the auxiliary criterion function would be the distinct elements of the $j$-fold Kronecker product of the vector $y - \mu_1(\theta)$ with itself minus the expectation with respect to $F(y, \theta)$.

As noted by Newey (1984) in an independent paper, moment conditions can be used to form useful auxiliary criterion functions when the data vector is partitioned as $y' = (w', x')$ and the probability model is $f_1(w|x, \theta)$. Here $w$ is a vector of jointly dependent variables and $x$ is a vector of exogenous variables. The marginal density $f_2(x)$ for $x$ is not specified by the model. A function of the form

$$c(w, x, \theta) = \frac{\partial}{\partial\theta}\log f_1(w|x, \theta)\; a(x, \theta),$$

where $a(x, \theta)$ depends only on $x$ and $\theta$, satisfies (4). A test based on this auxiliary criterion function is an 'instrumented score test'. Newey examines in detail the statistical properties of such tests and presents useful applications for regression models and limited dependent variable models.

Example 2 (tail areas)

In some applied work it is important to have information on how well the probability model predicts tail areas. An auxiliary criterion function that provides such information can be constructed along the following lines. Assume for simplicity in exposition that $y$ is scalar though $\theta$ may be multidimensional. Let $\mu(\theta)$ and $\sigma(\theta)$ denote the mean and standard deviation of the distribution $F(y, \theta)$. Fix $\alpha$ as a small probability and let $z_\alpha$ satisfy

$$\mathrm{prob}_{\theta,F}[\,y - \mu(\theta) \ge \sigma(\theta) z_\alpha\,] = \alpha,$$

where the subscript $\theta, F$ is self-explanatory. Now put

$$c(y, \theta) = I[\,y - \mu(\theta) \ge \sigma(\theta) z_\alpha\,] - \alpha,$$

where $I[\cdot]$ is the 0-1 indicator function. Then the statistic

$$\hat\tau_n = \frac{1}{n}\sum_{i=1}^{n} I[\,Y_i - \mu(\hat\theta_n) \ge \sigma(\hat\theta_n) z_\alpha\,] - \alpha$$

is the difference between the observed and the predicted frequency with which right-hand extreme values occur.

As illustrated in section 5, an asymptotically valid test for no difference in the frequencies can be computed by regressing the values $\hat c_i = c(Y_i, \hat\theta_n)$ on the 'scores', i.e., the gradients $\partial l(Y_i, \hat\theta_n)/\partial\theta$ of the log-density function, and then performing the usual $t$ test for no intercept. Interestingly, the square of this $t$ statistic is asymptotically a chi-square variate with one degree of freedom, but the $t^2$ does not equal the classical Pearson statistic. The reason is that this $t^2$ statistic properly accounts for the randomness in $\hat\theta_n$, where the classical statistic does not. The classical Pearson procedure implicitly assumes that the asymptotic variance is $\alpha(1 - \alpha)$, which exceeds the true variance. [When the model is $\phi(y - \mu)$, where $\phi$ is the standard normal pdf, then the variance is $\alpha(1 - \alpha) - \phi(z_\alpha)^2$.] Put another way, the classical procedure ignores the randomness in $\hat\theta_n$ and treats $(1/n)\sum_i c(Y_i, \hat\theta_n)$ as if it has the same asymptotic distribution as $(1/n)\sum_i c(Y_i, \theta_0)$, which is a 'Durbin' problem that leads to the incorrect expression for the asymptotic variance.

A more general chi-square goodness of fit test is as follows. Suppose the data vector is of the form $y' = (w', x')$ where, in a notation consistent with that used at the end of Example 1, the vector $w$ contains the jointly dependent variables and $x$ the exogenous variables. The probability model is the conditional density $f_1(w|x, \theta)$ of the dependent variables given $x$, with the marginal density $f_2(x)$ not specified. Let the components of the $K \times 1$ auxiliary criterion function be

$$c_k(y, \theta) = c_k(w, x, \theta) = I[\,w \in R_k(x, \theta)\,] - \pi_{0k}, \qquad k = 1, 2, \ldots, K,$$

where $I[\cdot]$ is the 0-1 indicator function, the $\pi_{0k}$ are fixed probabilities such that $\sum_k \pi_{0k} = 1$, and the regions $R_k(x, \theta)$ are chosen so that

$$\int I[\,w \in R_k(x, \theta)\,]\, f_1(w|x, \theta)\,dw = \pi_{0k}$$

for each $k = 1, 2, \ldots, K$. Then the $K \times 1$ vector

$$\hat\tau_n = \frac{1}{n}\sum_{i=1}^{n} c(w_i, x_i, \hat\theta_n)$$

contains the differences between the observed and expected frequencies. The regression-based method described in section 5 can be used to construct an asymptotically valid chi-square statistic based on $\hat\tau_n$. This test is based on random cell boundaries [Moore and Spruill (1975)] and it accounts for covariates $x$. It differs from Heckman's (1984) test because here the regions $R(x, \theta)$ depend not only on $x$ but also on $\theta$.
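Returning briefly to the scalar tail-area statistic of Example 2, the variance comparison with the classical Pearson procedure can be checked numerically. The sketch below assumes the location model mentioned in the bracketed remark, i.e. $y \sim N(\mu, 1)$ with only $\mu$ unknown, so the score is $y - \mu$; the function names, the choice $\alpha = 0.05$, the sample size and the seed are illustrative assumptions. The residual variance from the regression of $\hat c_i$ on an intercept and the score should be close to $\alpha(1 - \alpha) - \phi(z_\alpha)^2$ and below the Pearson value $\alpha(1 - \alpha)$.

```python
import math
import random
from statistics import NormalDist


def tail_area_test(sample, alpha=0.05):
    """Regression-based tail-area test for an assumed N(mu, 1) model.

    Returns (tau_hat, t_stat, resid_var); resid_var estimates the
    asymptotic variance of sqrt(n) * tau_hat.
    """
    n = len(sample)
    mu_hat = sum(sample) / n                      # ML estimate of mu
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # upper-alpha normal quantile
    # Criterion c(y, mu) = I[y - mu >= z_alpha] - alpha, evaluated at mu_hat.
    c = [(1.0 if y - mu_hat >= z_alpha else 0.0) - alpha for y in sample]
    # Score of the log-density with respect to mu, evaluated at mu_hat.
    h = [y - mu_hat for y in sample]

    # OLS of c on an intercept and h (single regressor, closed form).
    h_bar, c_bar = sum(h) / n, sum(c) / n
    s_hh = sum((x - h_bar) ** 2 for x in h)
    s_hc = sum((x - h_bar) * (v - c_bar) for x, v in zip(h, c))
    slope = s_hc / s_hh
    intercept = c_bar - slope * h_bar
    ssr = sum((v - intercept - slope * x) ** 2 for x, v in zip(h, c))
    resid_var = ssr / (n - 2)
    se = math.sqrt(resid_var * (1.0 / n + h_bar ** 2 / s_hh))
    return intercept, intercept / se, resid_var


def variances(alpha=0.05):
    """Pearson's implicit variance and the corrected one for N(mu, 1)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    pearson = alpha * (1.0 - alpha)
    corrected = pearson - NormalDist().pdf(z_alpha) ** 2
    return pearson, corrected


if __name__ == "__main__":
    rng = random.Random(3)
    data = [rng.gauss(1.0, 1.0) for _ in range(20_000)]
    print("tau_hat, t, residual variance:", tail_area_test(data))
    print("Pearson vs corrected variance:", variances())
```

Using the Pearson variance in the denominator would make the test statistic too small in absolute value, which is the 'Durbin' problem described in the text.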
More specifically, here the probabilities are viewed as fixed and the regions then determined, whereas Heckman views the regions $R(x)$ as given independently of $\theta$, and then the probabilities $\pi(x, \theta) = \int I[\,w \in R(x)\,]\, f_1(w|x, \theta)\,dw$ are determined. The asymptotic theory of this paper is general enough to cover the case when the test is set up in Heckman's manner, but there may be advantages to setting it up the other way. First, with a priori fixed probabilities the test outcome could be easier to interpret and provide better diagnostic information. Second, with this setup the user can choose the probabilities so that $\pi_{0k} = 1/K$, i.e., so that the regions are equiprobable, which is a method that has been shown to have optimum properties [Kendall and Stuart (1973, ch. 30)] in the case with no covariates.

Example 3 (White's information matrix test)

To include White's test in this setup, take as the auxiliary criterion function $c(y, \theta)$ the vector function comprised of the distinct elements of the symmetric matrix function

$$\frac{\partial^2 l(y, \theta)}{\partial\theta\,\partial\theta'} + h(y, \theta)\,h(y, \theta)',$$

where $h(y, \theta) = \partial l(y, \theta)/\partial\theta$ is the gradient of the log-density function. With the function $c$ defined in this fashion the vector

$$\hat\tau_n = \frac{1}{n}\sum_{i=1}^{n} c(Y_i, \hat\theta_n)$$

contains all of the differences between the distinct elements of the two natural estimates of the information matrix. White derives an estimator for the asymptotic variance-covariance matrix of this $\hat\tau_n$ that requires the user to calculate analytical third-order partial derivatives of the log-density function. In section 3 it is shown that there is an extension of the classical information equality which, as also noted by Chesher (1983) and Lancaster (1984), eliminates the need for third partials and leads to regression-based procedures for conducting White's test.

The remainder of this paper is organized as follows. Sections 2 and 3 present the consistency and asymptotic normality results. Section 4 examines some measures of the performance of the specification tests. Section 5 presents the regression-based procedure for conducting the specification tests discussed here. Section 6 contains some concluding remarks.

For the sake of completeness, the various assumptions which were either implicit or explicit in this introduction are now listed in one place.

Assumption 1

(i) The observed data $Y_1, Y_2, \ldots, Y_n$ are iid $m \times 1$ random vectors with distribution function $G$ on $R^m$.

(ii) The probability model is the family of distribution functions $\{F(y, \theta): y \in R^m, \theta \in \Theta \subset R^p\}$, where the parameter space $\Theta$ is a compact convex subset of $R^p$ with a non-empty interior.

(iii) Both $G(y)$ and $F(y, \theta)$ are absolutely continuous with respect to some measure $\mu(y)$ on $R^m$ with generalized (Radon-Nikodym) densities denoted by $g(y) = dG(y)/d\mu(y)$ and $f(y, \theta) = dF(y, \theta)/d\mu(y)$.

(iv) The auxiliary criterion function satisfies $\int c(y, \theta)\,dF(y, \theta) = 0$ for each $\theta \in \Theta$.

2. Consistency

As in the introduction let $l: R^m \times \Theta \to R^1$ be the log-density function and let $c: R^m \times \Theta \to R^q$ be the auxiliary criterion function. The QML estimator $\hat\theta_n$ and the statistic $\hat\tau_n$ are defined by

$$\hat\theta_n = \arg\max_{\theta \in \Theta} L_n(\theta), \tag{5}$$

$$\hat\tau_n = \psi_n(\hat\theta_n), \tag{6}$$

where $L_n$ and $\psi_n$ are the functions

$$L_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} l(Y_i, \theta), \qquad \psi_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} c(Y_i, \theta).$$

The key step in proving the consistency results is to establish the almost sure convergence of $L_n(\theta)$ and $\psi_n(\theta)$ to their expectations uniformly in the parameter $\theta$. The almost sure convergence of $\hat\theta_n$ and $\hat\tau_n$ to well defined limits will then follow from assumptions guaranteeing that the almost sure limit of the function $L_n$ has a unique maximum.

It proves useful to identify a large class of vector-valued functions on $R^m \times \Theta$ for which uniform almost sure convergence will hold.

Definition 1. A function $\phi: R^m \times \Theta \to R^k$ is said to be regular if

(i) $\phi(y, \theta)$ is measurable in $y$ for each $\theta \in \Theta$,

(ii) $\phi$ is separable [see Huber (1967, p. 222)],

(iii) $\phi$ is dominated, $|\phi(y, \theta)| \le b(y)$, where the function $b$ is integrable with respect to $G$,

(iv) $\phi$ is almost surely continuous in the sense that for each fixed $\theta$ the set $\{y: \lim_{\gamma \to \theta} \phi(y, \gamma) = \phi(y, \theta)\}$ has probability 1 ($dG$). The null set may depend on $\theta$.

The measurability and separability conditions (i) and (ii) are weak. The domination assumption (iii) ensures that the expectation

$$\lambda(\theta) = \int \phi(y, \theta)\,dG(y) \tag{7}$$

exists, while the almost sure continuity condition (iv) implies by dominated convergence that $\lambda$ is a continuous function of $\theta$. As the following lemma indicates, sample averages of $\phi(Y_i, \theta)$ have the requisite convergence properties if $\phi$ is regular.

Lemma 1. If $\phi$ is regular, then the function $(1/n)\sum_{i=1}^{n} \phi(Y_i, \theta)$ converges uniformly almost surely to the function $\lambda$ in (7). (Proof: Appendix.)

The next two assumptions contain the conditions for consistency of $\hat\theta_n$ and $\hat\tau_n$.

Assumption 2. The auxiliary criterion function $c$ is regular and the log-density function $l$ satisfies (i)-(iii) of Definition 1 and a stronger version of (iv), namely, $l(y, \theta)$ is continuous in $\theta$ for all $y$.

The stronger continuity assumption is needed for the log-density function in order to ensure that the maximizing $\hat\theta_n$ for $L_n$ in (5) exists for all $n$. The weaker continuity condition for the auxiliary criterion function $c$ suffices because the existence of $\hat\tau_n$ in (6) is guaranteed once the existence of $\hat\theta_n$ is established.

Define the functions

$$L(\theta) = E[l(Y_i, \theta)] = \int l(y, \theta)\,dG(y), \tag{8a}$$

$$\psi(\theta) = E[c(Y_i, \theta)] = \int c(y, \theta)\,dG(y), \tag{8b}$$

both of which exist and are continuous by Assumption 2. From Lemma 1 it is known that $L_n(\theta) \to L(\theta)$ and $\psi_n(\theta) = (1/n)\sum_i c(Y_i, \theta) \to \psi(\theta)$ almost surely, uniformly in $\theta$.

By continuity and compactness the limiting function $L$ achieves its maximum at least once in the parameter space $\Theta$. For the limit of the estimator $\hat\theta_n$ to be well defined, it is necessary to assume that there is only one such maximum.

Assumption 3. The limiting quasi-loglikelihood function $L$ achieves its maximum uniquely at $\bar\theta$ in the interior of the parameter space.
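The roles of Lemma 1 and Assumption 3 can be illustrated numerically. The sketch below is a hedged example rather than anything from the paper: it assumes a misspecified exponential model fitted to Gamma(2, 1) data, for which $L(\theta) = \log\theta - 2\theta$ has the unique maximizer $\bar\theta = 0.5$, and it takes the compact parameter space to be $[0.1, 5]$, approximated by a grid. The function names, grid, sample size and seed are all hypothetical choices.

```python
import math
import random


def expected_loglik(theta, mean_y=2.0):
    """L(theta) = E[log(theta) - theta * y] when E[y] = mean_y (Gamma(2, 1))."""
    return math.log(theta) - theta * mean_y


def sample_loglik(theta, y_bar):
    """L_n(theta) for the exponential model; depends on data only via y_bar."""
    return math.log(theta) - theta * y_bar


def uniform_gap_and_argmax(n, seed, grid):
    """Return (sup over grid of |L_n - L|, grid maximizer of L_n)."""
    rng = random.Random(seed)
    y_bar = sum(rng.gammavariate(2.0, 1.0) for _ in range(n)) / n
    sup_gap = max(abs(sample_loglik(t, y_bar) - expected_loglik(t)) for t in grid)
    theta_hat = max(grid, key=lambda t: sample_loglik(t, y_bar))
    return sup_gap, theta_hat


# Grid approximation to the assumed compact parameter space [0.1, 5].
GRID = [0.1 + 0.001 * k for k in range(4901)]

if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        gap, theta_hat = uniform_gap_and_argmax(n, seed=7, grid=GRID)
        print(f"n={n:>6}  sup|L_n - L|={gap:.4f}  theta_hat={theta_hat:.3f}")
```

As the sample size grows the sup-norm gap shrinks and the maximizer of the sample function settles near 0.5, even though no member of the exponential family matches the data-generating distribution.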

The basic consistency result is:

Theorem 1. $\hat\theta_n \to \bar\theta$ a.s. and $\hat\tau_n \to \bar\tau$ a.s., where

$$\bar\tau = \psi(\bar\theta) = \int c(y, \bar\theta)\,dG(y). \tag{9}$$

Proof. The almost sure convergence of $\hat\theta_n$ to $\bar\theta$ follows from arguments similar to those in Burguete, Gallant and Souza (1982). Since $\psi_n(\theta) \to \psi(\theta)$ almost surely, uniformly in $\theta$, and since $\psi$ is continuous, then by standard arguments $\psi_n(\hat\theta_n) \to \psi(\bar\theta) = \bar\tau$ almost surely, as given in (9).

Note that this theorem covers both the null case in which the model is correctly specified and there exists $\theta_0 \in \Theta$ such that $f(y, \theta_0)$ is a version of $g(y)$, and it covers the non-null case in which no such $\theta_0$ exists. In the null case, $\bar\theta = \theta_0$ and the almost sure limit of $\hat\tau_n$ is

$$\bar\tau = \int c(y, \theta_0)\,dF(y, \theta_0) = 0$$

by the construction of the auxiliary criterion function. In the non-null case the almost sure limit of $\hat\tau_n$ is $\psi(\bar\theta)$, which is in general non-zero.

3. Asymptotic normality

3.1. The joint asymptotic distribution of $\hat\theta_n$ and $\hat\tau_n$

In order to allow for a large class of auxiliary criterion functions, in particular those based on frequency counts or absolute moments, the conditions for asymptotic normality that are placed on the auxiliary criterion function $c(y, \theta)$ do not require differentiability with respect to $\theta$. Instead, the conditions only require $c$ to satisfy certain Huber-type Lipschitz conditions and $\psi(\theta) = \int c(y, \theta)\,dG(y)$ to be a continuously different

