Generalized Method of Moments - University of Washington


1 Generalized Method of Moments

1.1 Introduction

This chapter describes generalized method of moments (GMM) estimation for linear and nonlinear models with applications in economics and finance. GMM estimation was formalized by Hansen (1982) and has since become one of the most widely used methods of estimation for models in economics and finance. Unlike maximum likelihood estimation (MLE), GMM does not require complete knowledge of the distribution of the data. Only specified moments derived from an underlying model are needed for GMM estimation. In some cases in which the distribution of the data is known, MLE can be computationally very burdensome whereas GMM can be computationally very easy. The log-normal stochastic volatility model is one example. In models for which there are more moment conditions than model parameters, GMM estimation provides a straightforward way to test the specification of the proposed model. This is an important feature that is unique to GMM estimation.

This chapter is organized as follows. GMM estimation for linear models is described in Section 1.2. Section 1.3 describes methods for estimating the efficient weight matrix. Sections 1.4 and 1.5 give examples of estimation and inference using the S+FinMetrics function GMM. Section 1.6 describes GMM estimation and inference for nonlinear models. Section 1.7 provides numerous examples of GMM estimation of nonlinear models in finance, including Euler equation asset pricing models, discrete-time stochastic volatility models, and continuous-time interest rate diffusion models.

The theory and notation for GMM presented herein follow the excellent treatment given in Hayashi (2000). Other good textbook treatments of GMM at an intermediate level are given in Hamilton (1994), Ruud (2000), Davidson and MacKinnon (2004), and Greene (2004). The most comprehensive textbook treatment of GMM is Hall (2005). Excellent surveys of recent developments in GMM are given in the special issues of the Journal of Business and Economic Statistics (1996, 2002). Discussions of GMM applied to problems in finance are given in Ogaki (1992), Ferson (1995), Andersen and Sorensen (1996), Campbell, Lo and MacKinlay (1997), James and Webber (2000), Cochrane (2001), Jagannathan and Skoulakis (2002), and Hall (2005).

1.2 Single Equation Linear GMM

Consider the linear regression model

    y_t = z_t'δ_0 + ε_t,   t = 1, ..., n                                (1.1)

where z_t is an L × 1 vector of explanatory variables, δ_0 is a vector of unknown coefficients, and ε_t is a random error term. The model (1.1) allows for the possibility that some or all of the elements of z_t may be correlated with the error term ε_t, i.e., E[z_{tk} ε_t] ≠ 0 for some k. If E[z_{tk} ε_t] ≠ 0 then z_{tk} is called an endogenous variable. It is well known that if z_t contains endogenous variables then the least squares estimator of δ_0 in (1.1) is biased and inconsistent.

Associated with the model (1.1), it is assumed that there exists a K × 1 vector of instrumental variables x_t which may contain some or all of the elements of z_t. Let w_t represent the vector of unique and non-constant elements of {y_t, z_t, x_t}. It is assumed that {w_t} is a stationary and ergodic stochastic process. The instrumental variables x_t satisfy the set of K orthogonality conditions

    E[g_t(w_t, δ_0)] = E[x_t ε_t] = E[x_t(y_t - z_t'δ_0)] = 0           (1.2)

where g_t(w_t, δ_0) = x_t ε_t = x_t(y_t - z_t'δ_0). Expanding (1.2) gives the relation

    Σ_xy = Σ_xz δ_0

where Σ_xy = E[x_t y_t] and Σ_xz = E[x_t z_t']. For identification of δ_0, it is required that the K × L matrix E[x_t z_t'] = Σ_xz be of full rank L.
This rank condition ensures that δ_0 is the unique solution to (1.2). Note that if K = L, then Σ_xz is invertible and δ_0 may be determined using

    δ_0 = Σ_xz^{-1} Σ_xy

A necessary condition for the identification of δ_0 is the order condition

    K ≥ L                                                               (1.3)

which simply states that the number of instrumental variables must be greater than or equal to the number of explanatory variables in (1.1). If K = L then δ_0 is said to be (apparently) just identified; if K > L then δ_0 is said to be (apparently) over-identified; if K < L then δ_0 is not identified. The word "apparently" in parentheses is used to remind the reader that the rank condition

    rank(Σ_xz) = L                                                      (1.4)

must also be satisfied for identification.

In the regression model (1.1), the error terms are allowed to be conditionally heteroskedastic as well as serially correlated. For the case in which ε_t is conditionally heteroskedastic, it is assumed that {g_t} = {x_t ε_t} is a stationary and ergodic martingale difference sequence (MDS) satisfying

    E[g_t g_t'] = E[x_t x_t' ε_t²] = S

where S is a non-singular K × K matrix. The matrix S is the asymptotic variance-covariance matrix of the sample moments ḡ = n^{-1} Σ_{t=1}^n g_t(w_t, δ_0). This follows from the central limit theorem for ergodic stationary martingale difference sequences (see Hayashi, page 106)

    √n ḡ = n^{-1/2} Σ_{t=1}^n x_t ε_t →_d N(0, S)

where S = avar(ḡ) denotes the variance-covariance matrix of the limiting distribution of √n ḡ.

For the case in which ε_t is serially correlated, and possibly conditionally heteroskedastic as well, it is assumed that {g_t} = {x_t ε_t} is a stationary and ergodic stochastic process that satisfies

    √n ḡ = n^{-1/2} Σ_{t=1}^n x_t ε_t →_d N(0, S)

    S = Γ_0 + Σ_{j=1}^∞ (Γ_j + Γ_j')

where Γ_j = E[g_t g_{t-j}'] = E[x_t x_{t-j}' ε_t ε_{t-j}]. In the above, S = avar(ḡ) is also referred to as the long-run variance of ḡ.

1.2.1 Definition of the GMM Estimator

The generalized method of moments (GMM) estimator of δ in (1.1) is constructed by exploiting the orthogonality conditions (1.2). The idea is to create a set of estimating equations for δ by making sample moments match

the population moments defined by (1.2). The sample moments based on (1.2) for an arbitrary value δ are

    g_n(δ) = (1/n) Σ_{t=1}^n g(w_t, δ) = (1/n) Σ_{t=1}^n x_t(y_t - z_t'δ)
           = [ (1/n) Σ_{t=1}^n x_{1t}(y_t - z_t'δ), ..., (1/n) Σ_{t=1}^n x_{Kt}(y_t - z_t'δ) ]'

These moment conditions are a set of K linear equations in L unknowns. Equating these sample moments to the population moment E[x_t ε_t] = 0 gives the estimating equations

    S_xy - S_xz δ = 0                                                   (1.5)

where S_xy = n^{-1} Σ_{t=1}^n x_t y_t and S_xz = n^{-1} Σ_{t=1}^n x_t z_t' are the sample moments. If K = L (δ_0 is just identified) and S_xz is invertible, then the GMM estimator of δ is

    δ̂ = S_xz^{-1} S_xy

which is also known as the indirect least squares estimator. If K > L then there may not be a solution to the estimating equations (1.5). In this case, the idea is to try to find δ that makes S_xy - S_xz δ as close to zero as possible. To do this, let Ŵ denote a K × K symmetric and positive definite (p.d.) weight matrix, possibly dependent on the data, such that Ŵ →_p W as n → ∞ with W symmetric and p.d. Then the GMM estimator of δ, denoted δ̂(Ŵ), is defined as

    δ̂(Ŵ) = arg min_δ J(δ, Ŵ)

where

    J(δ, Ŵ) = n g_n(δ)' Ŵ g_n(δ) = n (S_xy - S_xz δ)' Ŵ (S_xy - S_xz δ)   (1.6)

Since J(δ, Ŵ) is a simple quadratic form in δ, straightforward calculus may be used to determine the analytic solution for δ̂(Ŵ):

    δ̂(Ŵ) = (S_xz' Ŵ S_xz)^{-1} S_xz' Ŵ S_xy                            (1.7)

Asymptotic Properties

Under standard regularity conditions (see Hayashi, Chapter 3), it can be shown that

    δ̂(Ŵ) →_p δ_0

    √n (δ̂(Ŵ) - δ_0) →_d N(0, avar(δ̂(Ŵ)))
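The closed-form solution (1.7) is easy to check numerically. The chapter's own examples use the S+FinMetrics function GMM; as a language-neutral illustration, here is a minimal Python/numpy sketch on a hypothetical just-identified data-generating process (all numbers below are made up for illustration):

```python
import numpy as np

def linear_gmm(y, Z, X, W):
    """GMM estimator (1.7): delta_hat(W) = (Sxz' W Sxz)^{-1} Sxz' W Sxy."""
    n = len(y)
    Sxy = X.T @ y / n              # K-vector of sample moments n^{-1} sum x_t y_t
    Sxz = X.T @ Z / n              # K x L matrix n^{-1} sum x_t z_t'
    return np.linalg.solve(Sxz.T @ W @ Sxz, Sxz.T @ W @ Sxy)

# hypothetical DGP: one endogenous regressor, one valid instrument (K = L = 1)
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                         # instrument, E[x_t eps_t] = 0
u = rng.normal(size=n)                         # structural error
z = 0.8 * x + 0.5 * u + rng.normal(size=n)     # regressor correlated with u
y = 2.0 * z + u                                # true delta_0 = 2
delta_hat = linear_gmm(y, z.reshape(-1, 1), x.reshape(-1, 1), np.eye(1))
```

In the just-identified case the weight matrix cancels algebraically, so δ̂(Ŵ) = S_xz^{-1} S_xy (the indirect least squares estimator) for any p.d. Ŵ.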

where

    avar(δ̂(Ŵ)) = (Σ_xz' W Σ_xz)^{-1} Σ_xz' W S W Σ_xz (Σ_xz' W Σ_xz)^{-1}   (1.8)

A consistent estimate of avar(δ̂(Ŵ)), denoted âvar(δ̂(Ŵ)), may be computed using

    âvar(δ̂(Ŵ)) = (S_xz' Ŵ S_xz)^{-1} S_xz' Ŵ Ŝ Ŵ S_xz (S_xz' Ŵ S_xz)^{-1}   (1.9)

where Ŝ is a consistent estimate for S = avar(ḡ).

The Efficient GMM Estimator

For a given set of instruments x_t, the GMM estimator δ̂(Ŵ) is defined for an arbitrary positive definite and symmetric weight matrix Ŵ. The asymptotic variance of δ̂(Ŵ) in (1.8) depends on the chosen weight matrix Ŵ. A natural question to ask is: what weight matrix W produces the smallest value of avar(δ̂(Ŵ))? The GMM estimator constructed with this weight matrix is called the efficient GMM estimator. Hansen (1982) showed that the efficient GMM estimator results from setting Ŵ = Ŝ^{-1} such that Ŝ →_p S. For this choice of Ŵ, the asymptotic variance formula (1.8) reduces to

    avar(δ̂(Ŝ^{-1})) = (Σ_xz' S^{-1} Σ_xz)^{-1}                          (1.10)

of which a consistent estimate is

    âvar(δ̂(Ŝ^{-1})) = (S_xz' Ŝ^{-1} S_xz)^{-1}                          (1.11)

The efficient GMM estimator is defined as

    δ̂(Ŝ^{-1}) = arg min_δ n g_n(δ)' Ŝ^{-1} g_n(δ)

which requires a consistent estimate of S. However, consistent estimation of S, in turn, requires a consistent estimate of δ. To see this, consider the case in which ε_t in (1.1) is conditionally heteroskedastic so that S = E[g_t g_t'] = E[x_t x_t' ε_t²]. A consistent estimate of S has the form¹

    Ŝ = (1/n) Σ_{t=1}^n x_t x_t' ε̂_t² = (1/n) Σ_{t=1}^n x_t x_t' (y_t - z_t'δ̂)²

such that δ̂ →_p δ. Similar arguments hold for the case in which g_t = x_t ε_t is a serially correlated and heteroskedastic process.

¹ Davidson and MacKinnon (1993, Section 16.3) suggest using a simple degrees-of-freedom corrected estimate of S that replaces n^{-1} in (1.17) with (n - k)^{-1} to improve the finite sample performance of tests based on (1.11).

Two-Step Efficient GMM

The two-step efficient GMM estimator utilizes the result that a consistent estimate of δ may be computed by GMM with an arbitrary positive definite and symmetric weight matrix Ŵ such that Ŵ →_p W. Let δ̂(Ŵ) denote such an estimate. Common choices for Ŵ are Ŵ = I_K and Ŵ = S_xx^{-1} = (n^{-1} X'X)^{-1}, where X is an n × K matrix with tth row equal to x_t'.² Then a first-step consistent estimate of S is given by

    Ŝ(Ŵ) = (1/n) Σ_{t=1}^n x_t x_t' (y_t - z_t'δ̂(Ŵ))²                  (1.12)

The two-step efficient GMM estimator is then defined as

    δ̂(Ŝ^{-1}(Ŵ)) = arg min_δ n g_n(δ)' Ŝ^{-1}(Ŵ) g_n(δ)                (1.13)

Iterated Efficient GMM

The iterated efficient GMM estimator uses the two-step efficient GMM estimator δ̂(Ŝ^{-1}(Ŵ)) to update the estimation of S in (1.12) and then recomputes the estimator in (1.13). The process is repeated (iterated) until the estimates of δ do not change significantly from one iteration to the next. Typically, only a few iterations are required. The resulting estimator is denoted δ̂(Ŝ_iter^{-1}). The iterated efficient GMM estimator has the same asymptotic distribution as the two-step efficient estimator. However, in finite samples the two estimators may differ. As Hamilton (1994, page 413) points out, the iterated GMM estimator has the practical advantage over the two-step estimator in that the resulting estimates are invariant with respect to the scale of the data and to the initial weight matrix Ŵ.

Continuous Updating Efficient GMM

This estimator simultaneously estimates S, as a function of δ, and δ. It is defined as

    δ̂(Ŝ_CU^{-1}) = arg min_δ n g_n(δ)' Ŝ^{-1}(δ) g_n(δ)                (1.14)

where the expression for Ŝ(δ) depends on the estimator used for S. For example, with conditionally heteroskedastic errors Ŝ(δ) takes the form

    Ŝ(δ) = (1/n) Σ_{t=1}^n x_t x_t' (y_t - z_t'δ)²

² In the function GMM, the default initial weight matrix is the identity matrix. This can be changed by supplying a weight matrix using the optional argument w=my.weight.matrix. Using Ŵ = S_xx^{-1} is often more numerically stable than using Ŵ = I_K.
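The two-step procedure can be sketched in a few lines. This is a hedged Python/numpy illustration, not the S+FinMetrics implementation: step one estimates δ with Ŵ = S_xx^{-1}, step two plugs the first-step residuals into (1.12) and re-minimizes with Ŵ = Ŝ^{-1}. The over-identified, heteroskedastic data-generating process below is hypothetical:

```python
import numpy as np

def gmm_step(y, Z, X, W):
    """One GMM step using the closed form (1.7)."""
    n = len(y)
    Sxy = X.T @ y / n
    Sxz = X.T @ Z / n
    return np.linalg.solve(Sxz.T @ W @ Sxz, Sxz.T @ W @ Sxy)

def two_step_gmm(y, Z, X):
    """Two-step efficient GMM under conditional heteroskedasticity."""
    n = len(y)
    Sxx = X.T @ X / n
    d1 = gmm_step(y, Z, X, np.linalg.inv(Sxx))   # step 1: W = Sxx^{-1}
    e = y - Z @ d1                               # first-step residuals
    S_hat = (X * e[:, None] ** 2).T @ X / n      # S_hat(W) as in (1.12)
    return gmm_step(y, Z, X, np.linalg.inv(S_hat))  # step 2: W = S_hat^{-1}

# hypothetical DGP: K = 2 instruments, L = 1 endogenous regressor,
# heteroskedastic structural error
rng = np.random.default_rng(1)
n = 5000
X = rng.normal(size=(n, 2))
u = rng.normal(size=n) * (1 + 0.5 * np.abs(X[:, 0]))
z = X @ np.array([0.7, 0.7]) + 0.5 * u + rng.normal(size=n)
y = 1.5 * z + u                                  # true delta_0 = 1.5
delta2 = two_step_gmm(y, z.reshape(-1, 1), X)
```

Iterating the residual/weight update inside `two_step_gmm` until δ̂ stabilizes would give the iterated efficient GMM estimator.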

Hansen, Heaton and Yaron (1996) call δ̂(Ŝ_CU^{-1}) the continuous updating (CU) efficient GMM estimator. This estimator is asymptotically equivalent to the two-step and iterated estimators, but may differ in finite samples. The CU efficient GMM estimator does not depend on an initial weight matrix W, and, like the iterated efficient GMM estimator, the numerical value of the CU estimator is invariant to the scale of the data. It is computationally more burdensome than the iterated estimator, especially for large nonlinear models, and is more prone to numerical instability. However, Hansen, Heaton and Yaron find that the finite sample performance of the CU estimator, and test statistics based on it, is often superior to the other estimators. The good finite sample performance of the CU estimator relative to the iterated GMM estimator may be explained by the connection between the CU estimator and empirical likelihood estimators. See Imbens (2002) and Newey and Smith (2004) for further discussion of the relationship between GMM estimators and empirical likelihood estimators.

The J-Statistic

The J-statistic, introduced in Hansen (1982), refers to the value of the GMM objective function evaluated using an efficient GMM estimator:

    J = J(δ̂(Ŝ^{-1}), Ŝ^{-1}) = n g_n(δ̂(Ŝ^{-1}))' Ŝ^{-1} g_n(δ̂(Ŝ^{-1}))   (1.15)

where δ̂(Ŝ^{-1}) denotes any efficient GMM estimator of δ and Ŝ is a consistent estimate of S. If K = L then J = 0, and if K > L then J > 0. Under regularity conditions (see Hayashi, Chapter 3) and if the moment conditions (1.2) are valid, then as n → ∞

    J →_d χ²(K - L)

Hence, in a well specified over-identified model with valid moment conditions the J-statistic behaves like a chi-square random variable with degrees of freedom equal to the number of over-identifying restrictions.
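A sketch of the over-identification test in Python/numpy (the simulated data are hypothetical; with K = 3 valid instruments and L = 1 parameter, J should behave like a single χ²(2) draw):

```python
import numpy as np
from scipy import stats

def gmm_j_stat(y, Z, X, delta_hat, S_hat):
    """J-statistic (1.15): n * g_n(delta)' S_hat^{-1} g_n(delta)."""
    n = len(y)
    g = X.T @ (y - Z @ delta_hat) / n            # K-vector of sample moments
    return n * g @ np.linalg.solve(S_hat, g)

# hypothetical valid over-identified model: K = 3, L = 1
rng = np.random.default_rng(2)
n = 4000
X = rng.normal(size=(n, 3))
u = rng.normal(size=n)
z = X.sum(axis=1) + 0.5 * u + rng.normal(size=n)
y = 1.0 * z + u
Z = z.reshape(-1, 1)

# two-step efficient GMM: identity weight first, then S_hat^{-1}
Sxz, Sxy = X.T @ Z / n, X.T @ y / n
d1 = np.linalg.solve(Sxz.T @ Sxz, Sxz.T @ Sxy)
e = y - Z @ d1
S_hat = (X * e[:, None] ** 2).T @ X / n
d2 = np.linalg.solve(Sxz.T @ np.linalg.solve(S_hat, Sxz),
                     Sxz.T @ np.linalg.solve(S_hat, Sxy))

J = gmm_j_stat(y, Z, X, d2, S_hat)
p_value = 1 - stats.chi2.cdf(J, df=3 - 1)        # K - L degrees of freedom
```

A small p-value would signal that at least one moment condition is inconsistent with the data; here the instruments are valid by construction, so rejection should occur only at the nominal rate.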
If the model is mis-specified and/or some of the moment conditions (1.2) do not hold (e.g., E[x_{it} ε_t] = E[x_{it}(y_t - z_t'δ_0)] ≠ 0 for some i), then the J-statistic will be large relative to a chi-square random variable with K - L degrees of freedom. The J-statistic acts as an omnibus test statistic for model mis-specification. A large J-statistic indicates a mis-specified model. Unfortunately, the J-statistic does not, by itself, give any information about how the model is mis-specified.

Normalized Moments

If the model is rejected by the J-statistic, it is of interest to know why the model is rejected. To aid in the diagnosis of model failure, the magnitudes of the individual elements of the normalized moments √n g_n(δ̂(Ŝ^{-1}))

may point to the reason why the model is rejected by the J-statistic. Under the null hypothesis that the model is correct and the orthogonality conditions are valid, the normalized moments satisfy

    √n g_n(δ̂(Ŝ^{-1})) →_d N(0, S - Σ_xz[Σ_xz' S^{-1} Σ_xz]^{-1} Σ_xz')

As a result, for a well specified model the individual moment t-ratios

    t_i = g_n(δ̂(Ŝ^{-1}))_i / SE(g_n(δ̂(Ŝ^{-1}))_i),   i = 1, ..., K    (1.16)

where

    SE(g_n(δ̂(Ŝ^{-1}))_i) = ( [Ŝ - Σ̂_xz(Σ̂_xz' Ŝ^{-1} Σ̂_xz)^{-1} Σ̂_xz']_ii / n )^{1/2}

are asymptotically standard normal. When the model is rejected using the J-statistic, a large value of t_i indicates mis-specification with respect to the ith moment condition. Since the rank of S - Σ_xz[Σ_xz' S^{-1} Σ_xz]^{-1} Σ_xz' is K - L, the interpretation of the moment t-ratios (1.16) may be difficult in models for which the degree of over-identification is small. In particular, if K - L = 1 then t_1 = · · · = t_K.

Two Stage Least Squares as Efficient GMM

If, in the linear GMM regression model (1.1), the errors are conditionally homoskedastic, then

    E[x_t x_t' ε_t²] = σ² Σ_xx = S

A consistent estimate of S has the form Ŝ = σ̂² S_xx, where σ̂² →_p σ². Typically,

    σ̂² = n^{-1} Σ_{t=1}^n (y_t - z_t'δ̂)²

where δ̂ →_p δ_0. The efficient GMM estimator becomes

    δ̂(σ̂^{-2} S_xx^{-1}) = (S_xz' σ̂^{-2} S_xx^{-1} S_xz)^{-1} S_xz' σ̂^{-2} S_xx^{-1} S_xy
                        = (S_xz' S_xx^{-1} S_xz)^{-1} S_xz' S_xx^{-1} S_xy = δ̂(S_xx^{-1})

which does not depend on σ̂². The estimator δ̂(S_xx^{-1}) is, in fact, identical to the two stage least squares (TSLS) estimator of δ:

    δ̂(S_xx^{-1}) = (S_xz' S_xx^{-1} S_xz)^{-1} S_xz' S_xx^{-1} S_xy
                 = (Z' P_X Z)^{-1} Z' P_X y = δ̂_TSLS

where Z denotes the n × L matrix of observations with tth row z_t', X denotes the n × K matrix of observations with tth row x_t', and P_X = X(X'X)^{-1}X' is the idempotent matrix that projects onto the columns of X.
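The algebraic identity δ̂(S_xx^{-1}) = δ̂_TSLS can be verified numerically. A small Python/numpy sketch on hypothetical simulated data, computing both sides of the identity:

```python
import numpy as np

# hypothetical IV data set: L = 1 endogenous regressor, K = 2 instruments
rng = np.random.default_rng(4)
n = 500
X = rng.normal(size=(n, 2))                    # instruments
u = rng.normal(size=n)
z = X @ np.array([1.0, 0.5]) + 0.5 * u + rng.normal(size=n)
y = 0.7 * z + u
Z = z.reshape(-1, 1)

# TSLS: delta = (Z' P_X Z)^{-1} Z' P_X y, with P_X Z the first-stage fitted values
PZ = X @ np.linalg.solve(X.T @ X, X.T @ Z)     # P_X Z
d_tsls = np.linalg.solve(PZ.T @ Z, PZ.T @ y)

# GMM with W = Sxx^{-1}
Sxx = X.T @ X / n
Sxz = X.T @ Z / n
Sxy = X.T @ y / n
d_gmm = np.linalg.solve(Sxz.T @ np.linalg.solve(Sxx, Sxz),
                        Sxz.T @ np.linalg.solve(Sxx, Sxy))
```

The two estimates agree up to floating-point error, since the sample-size factors in S_xx, S_xz, and S_xy cancel.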

Using (1.10), the asymptotic variance of δ̂(S_xx^{-1}) = δ̂_TSLS is

    avar(δ̂_TSLS) = (Σ_xz' S^{-1} Σ_xz)^{-1} = σ² (Σ_xz' Σ_xx^{-1} Σ_xz)^{-1}

Although δ̂(S_xx^{-1}) does not depend on σ̂², a consistent estimate of the asymptotic variance does:

    âvar(δ̂_TSLS) = σ̂² (S_xz' S_xx^{-1} S_xz)^{-1}

Similarly, the J-statistic also depends on σ̂² and takes the form

    J(δ̂_TSLS, σ̂^{-2} S_xx^{-1}) = n (S_xy - S_xz δ̂_TSLS)' S_xx^{-1} (S_xy - S_xz δ̂_TSLS) / σ̂²

The TSLS J-statistic is also known as Sargan's statistic (see Sargan, 1958).

1.3 Estimation of S

To compute any of the efficient GMM estimators, a consistent estimate of S = avar(ḡ) is required. The method used to estimate S depends on the time series properties of the population moment conditions g_t. Two cases are generally considered. In the first case, g_t is assumed to be serially uncorrelated but may be conditionally heteroskedastic. In the second case, g_t is assumed to be serially correlated as well as potentially conditionally heteroskedastic. The following sections discuss estimation of S in these two cases. Similar estimators were discussed in the context of linear regression in Chapter 6, Section 5. In what follows, the assumption of a linear model (1.1) is dropped and the K moment conditions embodied in the vector g_t are assumed to be nonlinear functions of q ≤ K model parameters θ and are denoted g_t(θ). The moment conditions satisfy E[g_t(θ_0)] = 0 and S = avar(ḡ), where ḡ = n^{-1} Σ_{t=1}^n g_t(θ_0).

1.3.1 Serially Uncorrelated Moments

In many situations the population moment conditions g_t(θ_0) form an ergodic-stationary MDS with respect to an appropriate information set I_t. In this case,

    S = avar(ḡ) = E[g_t(θ_0) g_t(θ_0)']

Following White (1982), a heteroskedasticity consistent (HC) estimate of S has the form

    Ŝ_HC = (1/n) Σ_{t=1}^n g_t(θ̂) g_t(θ̂)'                              (1.17)

where θ̂ is a consistent estimate of θ_0.³ Davidson and MacKinnon (1993, Section 16.3) suggest using a simple degrees-of-freedom corrected estimate of S that replaces n^{-1} in (1.17) with (n - k)^{-1} to improve the finite sample performance of tests based on (1.11).

1.3.2 Serially Correlated Moments

If the population moment conditions g_t(θ_0) are an ergodic-stationary but serially correlated process, then

    S = avar(ḡ) = Γ_0 + Σ_{j=1}^∞ (Γ_j + Γ_j')

where Γ_j = E[g_t(θ_0) g_{t-j}(θ_0)']. In this case a heteroskedasticity and autocorrelation consistent (HAC) estimate of S has the form

    Ŝ_HAC = Γ̂_0(θ̂) + Σ_{j=1}^{n-1} w_{j,n} (Γ̂_j(θ̂) + Γ̂_j'(θ̂))

where w_{j,n} (j = 1, ..., b_n) are kernel function weights, b_n is a non-negative bandwidth parameter that may depend on the sample size, Γ̂_j(θ̂) = n^{-1} Σ_{t=j+1}^n g_t(θ̂) g_{t-j}(θ̂)', and θ̂ is a consistent estimate of θ_0. Different HAC estimates of S are distinguished by their kernel weights and bandwidth parameter. The most common kernel functions are listed in Table 1.1. For all kernels except the quadratic spectral, the integer bandwidth parameter b_n acts as a lag truncation parameter and determines how many autocovariance matrices to include when forming Ŝ_HAC. Figure 1.1 illustrates the first ten kernel weights for the kernels listed in Table 1.1 evaluated using the default values of b_n for n = 100.

The choice of kernel and bandwidth determines the statistical properties of Ŝ_HAC. The truncated kernel is often used if the moment conditions follow a finite order moving average process. However, the resulting estimate of S is not guaranteed to be positive definite. Use of the Bartlett, Parzen or quadratic spectral kernels ensures that Ŝ_HAC will be positive semi-definite. For these kernels, Andrews (1991) studied the asymptotic properties of Ŝ_HAC. He showed that Ŝ_HAC is consistent for S provided that b_n → ∞ as n → ∞. Furthermore, for each kernel, Andrews determined the rate at which b_n → ∞ to asymptotically minimize MSE(Ŝ_HAC, S).
For the Bartlett, Parzen and quadratic spectral kernels the rates are n^{1/3}, n^{1/5}, and n^{1/5}, respectively. Using the optimal bandwidths, Andrews found that the Ŝ_HAC based on the quadratic spectral kernel has the smallest asymptotic MSE, followed closely by Ŝ_HAC based on the Parzen kernel.

³ For example, θ̂ may be an inefficient GMM estimate based on an arbitrary p.d. weight matrix.
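The Bartlett-kernel (Newey-West) HAC estimator described above can be sketched in Python/numpy. This is an illustration, not the S+FinMetrics implementation; it is applied to a hypothetical AR(1) moment series whose long-run variance is known in closed form:

```python
import numpy as np

def s_hac_bartlett(g, bn):
    """HAC estimate of S with Bartlett weights w_{j,n} = 1 - j/(bn + 1).
    g is an (n, K) array whose rows are g_t(theta_hat)."""
    n = g.shape[0]
    S = g.T @ g / n                       # Gamma_0_hat
    for j in range(1, bn + 1):
        w = 1.0 - j / (bn + 1.0)          # Bartlett weight, zero beyond bn
        Gj = g[j:].T @ g[:-j] / n         # Gamma_j_hat
        S = S + w * (Gj + Gj.T)
    return S

# sanity check on a scalar AR(1) moment series g_t = 0.5 g_{t-1} + e_t,
# var(e) = 1; its long-run variance is 1 / (1 - 0.5)^2 = 4
rng = np.random.default_rng(3)
n = 20000
e = rng.normal(size=n)
g = np.empty(n)
g[0] = e[0]
for t in range(1, n):
    g[t] = 0.5 * g[t - 1] + e[t]
S_hat = s_hac_bartlett(g.reshape(-1, 1), bn=50)
```

With a finite bandwidth the Bartlett weights shrink the higher-order autocovariances toward zero, so the estimate is slightly biased downward relative to the true long-run variance, but it is guaranteed positive semi-definite.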

Kernel               w_{j,n}                                                  Default b_n
Truncated            1 for |a_j| <= 1; 0 for |a_j| > 1                        int[4(n/100)^{1/5}]
Bartlett             1 - |a_j| for |a_j| <= 1; 0 for |a_j| > 1                int[4(n/100)^{1/4}]
Parzen               1 - 6a_j² + 6|a_j|³ for 0 <= |a_j| <= 1/2;               int[4(n/100)^{4/25}]
                     2(1 - |a_j|)³ for 1/2 <= |a_j| <= 1; 0 for |a_j| > 1
Quadratic spectral   (25/(12π²d_j²)) [sin(m_j)/m_j - cos(m_j)]                int[4(n/100)^{4/25}]

Note: a_j = j/(b_n + 1), d_j = j/b_n, m_j = 6πd_j/5.

TABLE 1.1. Common kernel weights and default bandwidths

Automatic Bandwidth Selection

Based on extensive Monte Carlo experiments, Newey and West (1994) conclude that the choice of bandwidth

