GEE: Generalized Estimating Equations (Liang & Zeger, 1986)


GEE for Longitudinal Data - Chapter 8

GEE: generalized estimating equations (Liang & Zeger, 1986; Zeger & Liang, 1986)
- extension of GLM to longitudinal data analysis using quasi-likelihood estimation
- method is semi-parametric
  – estimating equations are derived without full specification of the joint distribution of a subject’s observations (i.e., $y_i$)
- instead, specification of
  – likelihood for the (univariate) marginal distributions of $y_{ij}$
  – “working” correlation matrix for the vector of repeated observations from each subject

Ballinger G.A. (2004). Using generalized estimating equations for longitudinal data analysis, Organizational Research Methods, 7:127-150.
Diggle P.J., Heagerty P., Liang K.-Y., Zeger S.L. (2002). Analysis of Longitudinal Data, 2nd edition, New York: Oxford University Press.
Dunlop D.D. (1994). Regression for longitudinal data: a bridge from least squares regression, The American Statistician, 48:299-303.
Hardin J.W., Hilbe J.M. (2003). Generalized Estimating Equations, New York: Chapman and Hall.
Hu F.B., Goldberg J., Hedeker D., Flay B.R., Pentz M.A. (1998). A comparison of generalized estimating equation and random-effects approaches to analyzing binary outcomes from longitudinal studies: illustrations from a smoking prevention study, American Journal of Epidemiology, 147:694-703. Available at: http://www.uic.edu/classes/bstt/bstt513/pubs.html
Norton E.C., Bieler G.S., Ennett S.T., Zarkin G.A. (1996). Analysis of prevention program effectiveness with clustered data using generalized estimating equations, Journal of Consulting and Clinical Psychology, 64:919-926.
Sheu C.-F. (2000). Regression analysis of correlated binary outcomes, Behavior Research Methods, Instruments, and Computers, 32:269-273.
Zorn C.J.W. (2001). Generalized estimating equation models for correlated data: a review with applications, American Journal of Political Science, 45:470-490.

GEE Overview
- GEEs have consistent and asymptotically normal solutions, even with mis-specification of the correlation structure
- Avoids need for multivariate distributions by only assuming a functional form for the marginal distribution at each timepoint (i.e., $y_{ij}$)
- The covariance structure is treated as a nuisance
- Relies on the independence across subjects to estimate consistently the variance of the regression coefficients (even when the assumed correlation structure is incorrect)

GEE Method outline
1. Relate the marginal response $\mu_{ij} = E(y_{ij})$ to a linear combination of the covariates
     $g(\mu_{ij}) = x_{ij}'\beta$
   - $y_{ij}$ is the response for subject $i$ at time $j$
   - $x_{ij}$ is a $p \times 1$ vector of covariates
   - $\beta$ is a $p \times 1$ vector of unknown regression coefficients
   - $g(\cdot)$ is the link function
2. Describe the variance of $y_{ij}$ as a function of the mean
     $V(y_{ij}) = v(\mu_{ij})\,\phi$
   - $\phi$ is a possibly unknown scale parameter
   - $v(\cdot)$ is a known variance function

Link and Variance Functions
- Normally-distributed response
     $g(\mu_{ij}) = \mu_{ij}$   “Identity link”
     $v(\mu_{ij}) = 1$
     $V(y_{ij}) = \phi$
- Binary response (Bernoulli)
     $g(\mu_{ij}) = \log[\mu_{ij}/(1-\mu_{ij})]$   “Logit link”
     $v(\mu_{ij}) = \mu_{ij}(1-\mu_{ij})$
     $\phi = 1$
- Poisson response
     $g(\mu_{ij}) = \log(\mu_{ij})$   “Log link”
     $v(\mu_{ij}) = \mu_{ij}$
     $\phi = 1$
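The three link and variance pairs listed on this slide can be written down directly. The sketch below is only illustrative: the table name `FAMILIES` and the helper `link_and_variance` are assumptions made for this example and do not come from any GEE package.

```python
import math

# Hypothetical lookup table: response type -> (link g, variance function v).
FAMILIES = {
    "normal": (lambda mu: mu, lambda mu: 1.0),
    "binary": (lambda mu: math.log(mu / (1.0 - mu)), lambda mu: mu * (1.0 - mu)),
    "poisson": (lambda mu: math.log(mu), lambda mu: mu),
}

def link_and_variance(family, mu, phi=1.0):
    """Return (g(mu), V(y) = v(mu) * phi) for a single marginal mean mu."""
    g, v = FAMILIES[family]
    return g(mu), v(mu) * phi
```

For example, `link_and_variance("binary", 0.5)` gives a logit of 0 and a variance of 0.25, the largest variance a Bernoulli response can have.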

GEE Method outline
3. Choose the form of an $n \times n$ “working” correlation matrix $R_i$ for each $y_i$
   - the $(j, j')$ element of $R_i$ is the known, hypothesized, or estimated correlation between $y_{ij}$ and $y_{ij'}$
   - This working correlation matrix $R_i$ may depend on a vector of unknown parameters $\alpha$, which is assumed to be the same for all subjects
   - Although this correlation matrix can differ from subject to subject, we usually use a working correlation matrix $R_i$ that approximates the average dependence among the repeated observations over subjects
aside: not well-suited to irregular measurements across time because time is treated categorically

Comments on “working” correlation matrix
- should choose form of $R$ to be consistent with empirical correlations
- GEE method yields consistent estimates of regression coefficients $\beta$ and their variances (thus, standard errors), even with mis-specification of the structure of the covariance matrix
- Loss of efficiency from an incorrect choice of $R$ is lessened as the number of subjects gets large

From O’Muircheartaigh & Francis (1981) Statistics: A Dictionary of Terms and Ideas
- “an estimator (of some population parameter) based on a sample of size N will be consistent if its value gets closer and closer to the true value of the parameter as N increases”
- “... the best test procedure (i.e., the efficient test) will be that with the smallest type II error (or largest power)”

Working Correlation Structures
- Exchangeable: $R_{jj'} = \rho$, all of the correlations are equal
- AR(1): $R_{jj'} = \rho^{|j-j'|}$
- Stationary m-dependent (Toeplitz):
     $R_{jj'} = \rho_{|j-j'|}$ if $|j-j'| \le m$
     $R_{jj'} = 0$ if $|j-j'| > m$
- Unspecified (or unstructured): $R_{jj'} = \rho_{jj'}$
  – estimate all $n(n-1)/2$ correlations of $R$
  – most efficient, but most useful when there are relatively few timepoints (with many timepoints, estimation of the $n(n-1)/2$ correlations is not parsimonious)
  – missing data complicates estimation of $R$
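To make the structures concrete, the following sketch builds each working correlation matrix as a nested list. The function names are hypothetical, and the m-dependent version assumes the lag correlations are supplied as a list `rhos` holding the values for lags 1 to m.

```python
def exchangeable(n, rho):
    """R[j][k] = rho for j != k, with 1 on the diagonal."""
    return [[1.0 if j == k else rho for k in range(n)] for j in range(n)]

def ar1(n, rho):
    """R[j][k] = rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]

def m_dependent(n, rhos):
    """R[j][k] = rhos[|j - k| - 1] for lags 1..m (m = len(rhos)), 0 beyond lag m."""
    m = len(rhos)

    def entry(j, k):
        lag = abs(j - k)
        if lag == 0:
            return 1.0
        return rhos[lag - 1] if lag <= m else 0.0

    return [[entry(j, k) for k in range(n)] for j in range(n)]

def n_unstructured(n):
    """Number of distinct correlations in an unstructured n x n matrix."""
    return n * (n - 1) // 2
```

With four timepoints, `n_unstructured(4)` is 6, compared with one parameter for the exchangeable and AR(1) forms.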

GEE Estimation
- Define $A_i$ = $n \times n$ diagonal matrix with the variance function $v(\mu_{ij})$ as the $j$th diagonal element
- Define $R_i(\alpha)$ = $n \times n$ “working” correlation matrix (of the $n$ repeated measures)

Working variance–covariance matrix for $y_i$ equals
     $V(\alpha) = \phi\, A_i^{1/2} R_i(\alpha) A_i^{1/2}$
For normally distributed outcomes, $V(\alpha) = \phi R_i(\alpha)$

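GEE estimator of $\beta$ is the solution of
     $\sum_{i=1}^{N} D_i' [V(\hat\alpha)]^{-1} (y_i - \mu_i) = 0,$
where $\hat\alpha$ is a consistent estimate of $\alpha$ and $D_i = \partial\mu_i/\partial\beta$

e.g., normal case, $\mu_i = X_i\beta$, $D_i = X_i$, and $V(\hat\alpha) = \hat\phi R_i(\hat\alpha)$
     $\sum_{i=1}^{N} X_i' [R_i(\hat\alpha)]^{-1} (y_i - X_i\beta) = 0,$
     $\hat\beta = \left[\sum_{i=1}^{N} X_i' [R_i(\hat\alpha)]^{-1} X_i\right]^{-1} \left[\sum_{i=1}^{N} X_i' [R_i(\hat\alpha)]^{-1} y_i\right]$
- akin to weighted least-squares (WLS) estimator
- more generally, because solution only depends on the mean and variance of $y$, these are quasi-likelihood estimates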

GEE solution
Iterate between the quasi-likelihood solution for $\beta$ and a robust method for estimating $\alpha$ as a function of $\beta$
1. Given estimates of $R_i(\alpha)$ and $\phi$, calculate estimates of $\beta$ using iteratively reweighted LS
2. Given estimates of $\beta$, obtain estimates of $\alpha$ and $\phi$. For this, calculate Pearson (or standardized) residuals
     $r_{ij} = (y_{ij} - \hat\mu_{ij}) / \sqrt{[V(\hat\alpha)]_{jj}}$
   and use these residuals to consistently estimate $\alpha$ and $\phi$ (Liang & Zeger, 1986, present estimators for several different working correlation structures)
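Step 2 can be sketched for the exchangeable case. The code below is a minimal illustration under stated assumptions, not the estimator of any particular package: residuals are scaled by the variance function only, $\phi$ is estimated from their mean square, and $\alpha$ is a simple moment estimate that averages products of residual pairs within subjects. The function names and the degrees-of-freedom correction `p` are choices made for this example.

```python
import math

def pearson_residuals(y, mu, v):
    """r_ij = (y_ij - mu_ij) / sqrt(v(mu_ij)) for one subject's observations."""
    return [(yj - mj) / math.sqrt(v(mj)) for yj, mj in zip(y, mu)]

def moment_estimates(residuals_by_subject, p):
    """Moment estimates of phi and of an exchangeable correlation alpha.

    residuals_by_subject: one list of Pearson residuals per subject.
    p: number of regression coefficients, used as a degrees-of-freedom correction.
    """
    n_obs = sum(len(r) for r in residuals_by_subject)
    phi = sum(rj * rj for r in residuals_by_subject for rj in r) / (n_obs - p)

    cross = 0.0   # sum over subjects of r_ij * r_ik for j < k
    n_pairs = 0   # number of within-subject pairs
    for r in residuals_by_subject:
        for j in range(len(r)):
            for k in range(j + 1, len(r)):
                cross += r[j] * r[k]
                n_pairs += 1
    alpha = cross / (phi * (n_pairs - p))
    return phi, alpha
```

In a full fit these estimates would be fed back into step 1 and the two steps repeated until the coefficients stop changing.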

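Inference
$V(\hat\beta)$: square roots of the diagonal elements yield std errors for $\hat\beta$
GEE provides two versions of these (with $\hat V_i$ denoting $V_i(\hat\alpha)$)
1. Naive or “model-based”
     $V(\hat\beta) = \left[\sum_{i=1}^{N} D_i' \hat V_i^{-1} D_i\right]^{-1}$
2. Robust or “empirical”
     $V(\hat\beta) = M_0^{-1} M_1 M_0^{-1},$
     $M_0 = \sum_{i=1}^{N} D_i' \hat V_i^{-1} D_i$
     $M_1 = \sum_{i=1}^{N} D_i' \hat V_i^{-1} (y_i - \hat\mu_i)(y_i - \hat\mu_i)' \hat V_i^{-1} D_i$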

- notice, if $\hat V_i = (y_i - \hat\mu_i)(y_i - \hat\mu_i)'$ then the two are equal (this occurs only if the true correlation structure is correctly modeled)
- In the more general case, the robust or “sandwich” estimator provides a consistent estimator of $V(\hat\beta)$ even if the working correlation structure $R_i(\alpha)$ is not the true correlation of $y_i$
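The difference between the two variance estimators is easiest to see in the smallest possible model. The sketch below assumes an identity link, an intercept-only mean, and a working independence structure, so that $D_i$ is a column of ones and $\hat V_i = \hat\phi I$; every quantity is then a scalar. The function name and the toy data are hypothetical.

```python
def intercept_only_gee(clusters):
    """Identity link, intercept-only mean, working independence.

    clusters: one list of responses per subject.
    Returns (beta_hat, naive_var, robust_var).
    """
    n_obs = sum(len(y) for y in clusters)
    beta = sum(sum(y) for y in clusters) / n_obs            # solves sum_i 1'(y_i - beta) = 0

    resid = [[yj - beta for yj in y] for y in clusters]
    phi = sum(r * r for ri in resid for r in ri) / (n_obs - 1)   # scale estimate with p = 1

    m0 = n_obs / phi                                         # sum_i D_i' V_i^{-1} D_i
    m1 = sum((sum(ri) / phi) ** 2 for ri in resid)           # sum_i [D_i' V_i^{-1} (y_i - mu_i)]^2
    naive = 1.0 / m0
    robust = m1 / (m0 * m0)
    return beta, naive, robust

# Toy data: three subjects, two observations each, strongly clustered.
beta_hat, naive_var, robust_var = intercept_only_gee([[1, 2], [3, 4], [5, 6]])
```

Because observations within a subject move together in the toy data, the robust variance exceeds the naive one, which treats all six observations as independent.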

GEE vs MRM
- GEE not concerned with $V(y_i)$
- GEE yields both robust and model-based std errors for $\hat\beta$; MRM, in common use, only provides model-based
- GEE solution for all kinds of outcomes; MRM needs to be derived for each
- For non-normal outcomes, GEE provides population-averaged (or marginal) estimates of $\beta$, whereas MRM yields subject-specific (or conditional) estimates
- GEE assumption regarding missing data is more stringent (MCAR) than MRM (which assumes MAR)

Example 8.1: Using the NIMH Schizophrenia dataset, this handout has PROC GENMOD code and output from several GEE analyses varying the working correlation structure. (SAS code and output)
http://tigger.uic.edu/~hedeker/schizgee.txt

GEE Example: Smoking Cessation across Time
Gruder, Mermelstein et al. (1993) JCCP
- 489 subjects measured across 4 timepoints following an intervention designed to help them quit smoking
- Subjects were randomized to one of three conditions
  – control, self-help manuals
  – tx1, manuals plus group meetings (i.e., discussion)
  – tx2, manuals plus enhanced group meetings (i.e., social support)
- Some subjects randomized to tx1 or tx2 never showed up to any meetings following the phone call informing them of where the meetings would take place
- dependent variable: smoking status at particular timepoint was assessed via phone interviews

In Gruder et al., four groups were formed for the analysis:
1. Control: randomized to the control condition
2. No-show: randomized to receive a group treatment, but never showed up to the group meetings
3. tx1: randomized to and received group meetings
4. tx2: randomized to and received enhanced group meetings
and these four groups were compared using Helmert contrasts:

Group        H1      H2      H3
Control      -1       0       0
No-show      1/3     -1       0
tx1          1/3     1/2     -1
tx2          1/3     1/2      1
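The contrast codes can be checked mechanically. This sketch assumes the codes are H1 = (-1, 1/3, 1/3, 1/3), H2 = (0, -1, 1/2, 1/2) and H3 = (0, 0, -1, 1) across Control, No-show, tx1 and tx2, which is the reading of the slide consistent with the 4/3, 3/2 and 2 multipliers used later for the odds ratios. The names `HELMERT`, `column` and `dot` are hypothetical.

```python
from fractions import Fraction as F

# Contrast codes by group, in the order (H1, H2, H3).
HELMERT = {
    "control": (F(-1), F(0), F(0)),
    "no-show": (F(1, 3), F(-1), F(0)),
    "tx1": (F(1, 3), F(1, 2), F(-1)),
    "tx2": (F(1, 3), F(1, 2), F(1)),
}

def column(k):
    """Codes of contrast k (0 = H1, 1 = H2, 2 = H3) across the four groups."""
    return [codes[k] for codes in HELMERT.values()]

def dot(a, b):
    """Inner product of two code vectors."""
    return sum(x * y for x, y in zip(a, b))

# Difference in codes between the two sets of groups each contrast compares.
h1_gap = HELMERT["tx1"][0] - HELMERT["control"][0]   # any group condition vs control
h2_gap = HELMERT["tx1"][1] - HELMERT["no-show"][1]   # attended vs no-show
h3_gap = HELMERT["tx2"][2] - HELMERT["tx1"][2]       # enhanced vs regular meetings
```

Each column sums to zero and the three columns are mutually orthogonal, so the contrasts partition the three degrees of freedom among the four groups.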

Interpretation of Helmert Contrasts
H1: test of whether randomization to group versus control influenced subsequent cessation.
H2: test of whether showing up to the group meetings influenced subsequent cessation.
H3: test of whether the type of meeting influenced cessation.
note: H1 is an experimental comparison, but H2 and H3 are quasi-experimental
Examination of possible confounders: baseline analysis revealed that groups differed in terms of race (w vs nw), so race was included in subsequent analyses involving group

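Table 8.1 Point Prevalence Rates (N) of Abstinence over Time by Group

Group                 End-of-Program (T1)   6 months (T2)   12 months (T3)   24 months (T4)
No Contact Control        17.4 (109)          7.2 ( 97)       18.5 ( 92)       18.2 ( 77)
No-show                   ...                 ...             ...              ...
Discussion                33.7 ( 86)         14.6 ( 82)       16.3 ( 80)       22.9 ( 70)
Social Support            49.0 (104)         20.0 (100)       24.0 ( 96)       25.6 ( 86)

(The No-show row is missing from the source, where its label is fused with the Discussion label as “No n”.)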

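Table 8.2 Correlation of Smoking Abstinence (y/n)

        T1     T2     T3     T4
T1     1.00    ...   0.29   0.26
T2      ...   1.00   0.48   0.34
T3     0.29   0.48   1.00   0.49
T4     0.26   0.34   0.49   1.00

(The T1 and T2 rows are truncated in the source; the entries shown for them follow from symmetry, and the T1-T2 correlation is not recoverable.)

Working Correlation choice:
- exchangeable does not appear like a good choice since the correlations are not approximately equal
- neither the AR(1) nor the m-dependent structures appear reasonable because the correlations within a time lag vary
- unspecified appears to be the most reasonable choice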

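GEE models - binary outcome, logit, R = UN, T = 0, 1, 2, 4

Model 1
     $\eta_{ij} = \beta_0 + \beta_1 T_j + \beta_2 T_j^2 + \beta_3 H1_i + \beta_4 H2_i + \beta_5 H3_i + \beta_6 Race_i$
Model 2
     $\eta_{ij} = \beta_0 + \beta_1 T_j + \beta_2 T_j^2 + \beta_3 H1_i + \beta_4 H2_i + \beta_5 H3_i + \beta_6 Race_i$
          $+ \beta_7 (H1_i \times T_j) + \beta_8 (H2_i \times T_j) + \beta_9 (H3_i \times T_j)$
Model 3
     $\eta_{ij} = \beta_0 + \beta_1 T_j + \beta_2 T_j^2 + \beta_3 H1_i + \beta_4 H2_i + \beta_5 H3_i + \beta_6 Race_i$
          $+ \beta_7 (H1_i \times T_j) + \beta_8 (H2_i \times T_j) + \beta_9 (H3_i \times T_j)$
          $+ \beta_{10} (H1_i \times T_j^2) + \beta_{11} (H2_i \times T_j^2) + \beta_{12} (H3_i \times T_j^2)$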

Table 8.3 Smoking Status (0, Smoking; 1, Not Smoking) Across Time (N = 489): GEE Logistic Parameter Estimates (Est.), Standard Errors (SE), and p-Values

                          Model 1                   Model 2                   Model 3
Parameter             Est.    SE      p         Est.    SE      p         Est.    SE      p
Intercept   β0       -.999   .112   <.001     -1.015   .116   <.001     -1.010   .117    ...
T           β1       -.633   .126   <.001      -.619   .127   <.001      -.631   .131    ...
T^2         β2        .132   .029   <.001       .132   .029   <.001       .135   .030    ...
H1          β3        .583   .170   <.001       .765   .207   <.001       .869   .226   <.001
H2          β4        .288   .121    .018       .334   .138    .012       .435   .151    .004
H3          β5        .202   .119    .091       .269   .138    .051       .274   .149    .066
Race        β6        .358   .200    .074       .353   .200    .078       .354   .200    ...
H1 x T      β7                                  -.142   .072    .048      -.509   .236    .031
H2 x T      β8                                  -.035   .051    .495      -.389   .187    .037
H3 x T      β9                                  -.050   .053    .346      -.051   .200    .80
H1 x T^2    β10                                                            .087   .052    .096
H2 x T^2    β11                                                            .086   .043    .044
H3 x T^2    β12                                                            .000   .046    .95

(The Model 3 p-value column is truncated in the source; the values shown are those quoted in the interpretation slides, and “...” marks entries that are not recoverable.)

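Single- and Multi-Parameter Wald Tests
1. Single-parameter test, e.g., $H_0: \beta_1 = 0$
     $z = \hat\beta_1 / se(\hat\beta_1)$   or   $X_1^2 = \hat\beta_1^2 / \hat V(\hat\beta_1)$
2. Linear combination of parameters, e.g., $H_0: \beta_1 + \beta_2 = 0$
   for this, suppose $\hat\beta' = [\hat\beta_0\ \hat\beta_1\ \hat\beta_2]$ and define $c = [0\ 1\ 1]$
     $X_1^2 = (c\hat\beta)' \left[c\, \hat V(\hat\beta)\, c'\right]^{-1} (c\hat\beta)$
   Notice, 1. ($H_0: \beta_1 = 0$) is a special case where $c = [0\ 1\ 0]$
3. Multi-parameter test, e.g., $H_0: \beta_1 = \beta_2 = 0$
     $C = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
     $X_2^2 = (C\hat\beta)' \left[C\, \hat V(\hat\beta)\, C'\right]^{-1} (C\hat\beta)$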
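The single-parameter and linear-combination statistics reduce to scalar arithmetic, which the sketch below spells out. The function names are hypothetical, and the coefficient vector and covariance matrix in the usage lines are made-up numbers for illustration only. The multi-parameter test needs a matrix inverse and is not covered here.

```python
def wald_single(beta_j, var_j):
    """X^2 on 1 df for H0: beta_j = 0; equals z^2 with z = beta_j / se."""
    return beta_j ** 2 / var_j

def wald_linear(c, beta, cov):
    """X^2 on 1 df for H0: c beta = 0, with c a single row of weights."""
    est = sum(cj * bj for cj, bj in zip(c, beta))
    var = sum(c[j] * cov[j][k] * c[k]
              for j in range(len(c)) for k in range(len(c)))
    return est ** 2 / var

# Made-up estimates and covariance matrix, for illustration only.
beta = [0.5, 1.0, -0.4]
cov = [[0.04, 0.00, 0.00],
       [0.00, 0.25, 0.03],
       [0.00, 0.03, 0.09]]

x2_beta1 = wald_linear([0, 1, 0], beta, cov)   # same as wald_single(1.0, 0.25)
x2_sum = wald_linear([0, 1, 1], beta, cov)     # tests beta1 + beta2 = 0
```

Setting `c = [0, 1, 0]` reproduces the single-parameter test, which is the special case noted on the slide.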

Comparing models 1 and 3, models with and without the group by time effects, the null hypothesis is
     $H_0: \beta_7 = \beta_8 = \beta_9 = \beta_{10} = \beta_{11} = \beta_{12} = 0$
     $C = [\, 0_{6 \times 7} \;\; I_6 \,]$   (a $6 \times 13$ matrix with a single 1 in each row, in the columns of $\beta_7, \ldots, \beta_{12}$)
     $X_6^2 = 10.98$, $p = .09$
- Also, several of the individual group by time parameter tests are significant
- observed abstinence rates indicate large post-intervention group differences that are not maintained over time
- model 3 is preferred to model 1

Comparing models 2 and 3, models with and without the group by quadratic time effects, the null hypothesis is
     $H_0: \beta_{10} = \beta_{11} = \beta_{12} = 0$
     $C = \begin{bmatrix} 0&0&0&0&0&0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&0&0&0&0&1&0 \\ 0&0&0&0&0&0&0&0&0&0&0&0&1 \end{bmatrix}$
     $X_3^2 = 5.91$, $p = .12$
- but, individual H1 × T^2 interaction ($\hat\beta_{10} = .0870$, $p < .096$) and individual H2 × T^2 interaction ($\hat\beta_{11} = .0855$, $p < .044$)
- some evidence for model 3, though, strictly speaking, not quite at the .05 level in terms of the multi-parameter Wald test
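The p-values attached to the two multi-parameter Wald statistics can be reproduced from the chi-square upper tail. The helper below is a sketch using the closed-form series for integer degrees of freedom; the name `chi2_sf` is an assumption for this example and is not a reference to any library function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X^2_df > x) for integer df, via closed-form series."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: erfc term plus a finite sum of half-integer powers.
    tail = math.erfc(math.sqrt(half))
    extra = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * extra

p_models_1_vs_3 = chi2_sf(10.98, 6)   # six group by time parameters
p_models_2_vs_3 = chi2_sf(5.91, 3)    # three group by quadratic time parameters
```

Rounded to two decimals these agree with the p = .09 and p = .12 reported for the two comparisons.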

Interpretations based on Model 3
H1
- randomization to group increases abstinence at post-intervention ($\hat\beta_3 = .869$, $p < .001$)
- this benefit goes away across time ($\hat\beta_7 = -.509$, $p < .031$; $\hat\beta_{10} = .087$, $p < .096$)

Estimated odds ratio at post-intervention
     $OR = \exp[4/3\,(.869)] = 3.19$
(multiply by 4/3 because this equals the difference between the control and treatment groups in the coding of the H1 contrast)

Asymptotic 95% confidence interval for this odds ratio
     $\exp[4/3\,(.869) \pm 1.96 \times 4/3\,(.226)] = (1.76, 5.75)$

H2
- going to groups increases abstinence at post-intervention ($\hat\beta_4 = .435$, $p < .004$)
- this benefit goes away across time ($\hat\beta_8 = -.389$, $p < .037$; $\hat\beta_{11} = .086$, $p < .044$)

Estimated odds ratio at post-intervention
     $OR = \exp[3/2\,(.435)] = 1.92$
(multiply by 3/2 because this equals the difference between those not attending and those attending groups in the coding of the H2 contrast)

Asymptotic 95% confidence interval for this odds ratio
     $\exp[3/2\,(.435) \pm 1.96 \times 3/2\,(.151)] = (1.23, 2.99)$

H3
- marginally significant benefit of enhanced groups at post-intervention ($\hat\beta_5 = .274$, $p < .066$)
- this does not significantly vary across time ($\hat\beta_9 = -.051$, $p < .80$; $\hat\beta_{12} = .0003$, $p < .95$)

Estimated odds ratio at post-intervention
     $OR = \exp[2\,(.274)] = 1.73$
(multiply by 2 because this equals the difference between the enhanced and regular groups in the coding of the H3 contrast)

Asymptotic 95% confidence interval for this odds ratio
     $\exp[2\,(.274) \pm 1.96 \times 2\,(.149)] = (.96, 3.10)$
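The three post-intervention odds ratios and their intervals follow one recipe: scale the estimate and its standard error by the difference in contrast codes, then exponentiate. The sketch below reproduces the reported figures from the rounded estimates; the function name `contrast_odds_ratio` is hypothetical.

```python
import math

def contrast_odds_ratio(est, se, scale, z=1.96):
    """Odds ratio and asymptotic CI for a contrast coefficient.

    scale is the difference in contrast codes between the two sets of groups compared.
    Returns (odds ratio, lower limit, upper limit).
    """
    log_or = scale * est
    half_width = z * scale * se
    return (math.exp(log_or),
            math.exp(log_or - half_width),
            math.exp(log_or + half_width))

h1 = contrast_odds_ratio(0.869, 0.226, 4 / 3)   # group conditions vs control
h2 = contrast_odds_ratio(0.435, 0.151, 3 / 2)   # attended vs no-show
h3 = contrast_odds_ratio(0.274, 0.149, 2)       # enhanced vs regular meetings
```

Only the H3 interval includes 1, which matches its description as marginally significant.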

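Determination of group difference at any timepoint
Model 3
     $\eta_{ij} = \beta_0 + \beta_1 T_j + \beta_2 T_j^2 + \beta_3 H1_i + \beta_4 H2_i + \beta_5 H3_i + \beta_6 Race_i$
          $+ \beta_7 (H1_i \times T_j) + \beta_8 (H2_i \times T_j) + \beta_9 (H3_i \times T_j)$
          $+ \beta_{10} (H1_i \times T_j^2) + \beta_{11} (H2_i \times T_j^2) + \beta_{12} (H3_i \times T_j^2)$

     $\widehat{H1} = \hat\beta_3 + (T \times \hat\beta_7) + (T^2 \times \hat\beta_{10})$
e.g., T = 4,
     $\widehat{H1} = .869 + (4 \times -.509) + (16 \times .087) = .227$
is this a significant difference?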

$H_0: \beta_3 + (4 \times \beta_7) + (16 \times \beta_{10}) = 0$
Wald test for linear combination of parameters
     $c = [\,0\ 0\ 0\ 1\ 0\ 0\ 0\ 4\ 0\ 0\ 16\ 0\ 0\,]$
- $X_1^2 = .90$ for this H1 contrast at the final timepoint
- Similarly, $X_1^2 = 1.79$ and $.17$, respectively, for H2 and H3 contrasts at last timepoint
- No significant group differences by the end of the study
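The H1 contrast at a given timepoint, and the weight vector c that tests it, can be generated for any T. The sketch below uses the rounded Model 3 estimates and takes the linear interaction as negative, which is what the stated result of .227 and the fading benefit over time require. With rounded inputs the value comes out as .225 rather than .227. The function names are hypothetical.

```python
def contrast_at_time(main, linear, quadratic, t):
    """Contrast effect on the logit scale at time t: main + t*linear + t^2*quadratic."""
    return main + t * linear + t ** 2 * quadratic

def weights_for_h1(t, n_params=13):
    """Row vector c picking out beta3 + t*beta7 + t^2*beta10 from the 13 coefficients."""
    c = [0] * n_params
    c[3], c[7], c[10] = 1, t, t ** 2
    return c

# Rounded Model 3 estimates for H1 (sign of the linear term assumed negative).
h1_at_4 = contrast_at_time(0.869, -0.509, 0.087, 4)
c_at_4 = weights_for_h1(4)
```

The same two helpers give the contrast at T = 1 or T = 2 by changing the last argument.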

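Model 3 - Estimated Abstinence Rates

Group                 End-of-Program (T1)   6 months (T2)   12 months (T3)   24 months (T4)
No Contact Control          .146                .137             .140             .186
No-show                     ...                 ...              ...              ...
Discussion                  ...                 ...              ...              ...
Social Support              .456                .266             .192             .260

(The No-show and Discussion rows are missing from the source.)

obtained as group by time averages of $\hat p_{ij} = \dfrac{1}{1 + \exp(-\hat\eta_{ij})}$ where
     $\hat\eta_{ij} = \hat\beta_0 + \hat\beta_1 T_j + \hat\beta_2 T_j^2 + \hat\beta_3 H1_i + \hat\beta_4 H2_i + \hat\beta_5 H3_i + \hat\beta_6 Race_i$
          $+ \hat\beta_7 (H1_i \times T_j) + \hat\beta_8 (H2_i \times T_j) + \hat\beta_9 (H3_i \times T_j)$
          $+ \hat\beta_{10} (H1_i \times T_j^2) + \hat\beta_{11} (H2_i \times T_j^2) + \hat\beta_{12} (H3_i \times T_j^2)$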

Figure 8.1 Observed point prevalence abstinence rates and estimated probabilities of abstinence across time

Example 8.2: PROC GENMOD code and output from analysis of Robin Mermelstein’s smoking cessation study dataset. This handout illustrates GEE modeling of a dichotomous outcome. Includes CONTRAST statements to perform linear combination and multi-parameter Wald tests, and OBSTATS to yield estimated probabilities for each observation (SAS code and ... ingeb Ctime.txt

Another GEE example and comparisons with MRM
chapters 8 and 9
Consider the Reisby data and the question of drug plasma levels and clinical response to depression; define
     Response = 0 (HDRS > 15) or 1 (HDRS ≤ 15)
     DMI = 0 (ln dmi below median) or 1 (ln dmi above median)

               Response
DMI          0        1
 0          73       52
 1          43       82

OR = 2.68
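The odds ratio of 2.68 is the cross-product ratio of the 2 × 2 table. A quick check, with a hypothetical helper name:

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d)/(b*c) for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Rows are DMI = 0 and DMI = 1; columns are Response = 0 and Response = 1.
reisby_or = odds_ratio(73, 52, 43, 82)
```

This pools all observations and ignores the repeated measurement of subjects, which is why the later slides label the ordinary logistic model as being for comparison only.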

Reisby data - analysis of dichotomized HDRS
1. Logistic regression (inappropriate model; for comparison)
     $\log\left[\dfrac{P(Resp_{ij} = 1)}{1 - P(Resp_{ij} = 1)}\right] = \beta_0 + \beta_1 DMI_{ij}$
2. GEE logistic regression with exchangeable structure
     $\log\left[\dfrac{P(Resp_{ij} = 1)}{1 - P(Resp_{ij} = 1)}\right] = \beta_0 + \beta_1 DMI_{ij}$
3. Random-intercepts logistic regression
     $\log\left[\dfrac{P(Resp_{ij} = 1)}{1 - P(Resp_{ij} = 1)}\right] = \beta_0 + \beta_1 DMI_{ij} + \sigma_\upsilon \theta_i$
$i = 1, \ldots, 66$ subjects; $j = 1, \ldots, n_i$ observations per subject (max $n_i = 4$)

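Logistic Regression of dichotomized HDRS - ML ests (std errors)

model term            ordinary LR      GEE exchange     Random Int
intercept β0          -.339 (.182)     -.397 (.231)     -.661 (.407)
exp(β0)                .712             .672             .516
DMI β1                 ...              ...              ...
exp(β1)                ...              ...              ...
subject sd συ                                            2.004 (.415)
ICC                                                      .55
-2 log L               330.66                            293.85

(The DMI rows are garbled in the source, where only the fragment “DM 986.31” survives.)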
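Two derived rows of the table can be recomputed from the estimates that survive in the source: exp(β0) for each model, and the intraclass correlation for the random-intercepts model. The sketch assumes the usual latent-variable ICC for a logistic model, $\sigma_\upsilon^2 / (\sigma_\upsilon^2 + \pi^2/3)$, which reproduces the reported .55. The function name is hypothetical.

```python
import math

def latent_icc(sd):
    """ICC for a random-intercept logistic model: sd^2 / (sd^2 + pi^2 / 3)."""
    return sd ** 2 / (sd ** 2 + math.pi ** 2 / 3)

icc = latent_icc(2.004)

# Odds of response at DMI = 0 implied by each model's intercept.
odds_lr = math.exp(-0.339)
odds_gee = math.exp(-0.397)
odds_random_int = math.exp(-0.661)
```

The random-intercepts intercept is furthest from zero, as expected for a subject-specific estimate when the subject standard deviation is large.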

Marginal Models for Longitudinal Data
- Regression of response on $x$ is modeled separately from within-subject correlation
- Model the marginal expectation: $E(y_{ij}) = fn(x)$
- Marginal expectation = average response over the sub-population that shares a common value of $x$
- Marginal expectation is what is modeled in a cross-sectional study

Assumptions of Marginal Model for Longitudinal Data
1. Marginal expectation of the response $E(y_{ij}) = \mu_{ij}$ depends on $x_{ij}$ through link function $g(\mu_{ij})$
   e.g., logit link for binary responses
2. Marginal variance depends on marginal mean: $V(y_{ij}) = V(\mu_{ij})\,\phi$, with $V$ as a known variance function (e.g., $\mu_{ij}(1-\mu_{ij})$ for binary) and $\phi$ is a scale parameter
3. Correlation between $y_{ij}$ and $y_{ij'}$ is a function of the marginal means and/or parameters $\alpha$
- Marginal regression coefficients have the same interpretation as coefficients from a cross-sectional analysis

Logistic GEE as marginal model - Reisby example
1. Marginal expectation specification: logit link
     $\log\left[\dfrac{\mu_{ij}}{1 - \mu_{ij}}\right] = \log\left[\dfrac{P(Resp_{ij} = 1)}{1 - P(Resp_{ij} = 1)}\right] = \beta_0 + \beta_1 DMI_{ij}$
2. Variance specification for binary data: $V(y_{ij}) = \mu_{ij}(1-\mu_{ij})$ and $\phi = 1$ (in usual case)
3. Correlation between $y_{ij}$ and $y_{ij'}$ is exchangeable, AR(1), m-dependent, UN

- $\exp\beta_0$ = ratio of the frequencies of response to non-response (i.e., odds of response) among the sub-population (of observations) with below average DMI
- $\exp\beta_1$ = odds of response among above average DMI observations divided by the odds among below average DMI observations
$\exp\beta_1$ = ratio of population frequencies, hence “population-averaged”

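Random-intercepts logistic regression
     $\log\left[\dfrac{\Pr(Y_{ij} = 1 \mid \theta_i)}{1 - \Pr(Y_{ij} = 1 \mid \theta_i)}\right] = x_{ij}'\beta + \sigma_\upsilon \theta_i$
or
     $g[\Pr(Y_{ij} = 1 \mid \theta_i)] = x_{ij}'\beta + \sigma_\upsilon \theta_i$
which yields
     $\Pr(Y_{ij} = 1 \mid \theta_i) = g^{-1}[x_{ij}'\beta + \sigma_\upsilon \theta_i]$
where $g$ is the logit link function and $g^{-1}$ is its inverse function (i.e., logistic cdf)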

Taking the expectation,
     $E(Y_{ij} \mid \theta_i) = g^{-1}[x_{ij}'\beta + \sigma_\upsilon \theta_i]$
so
     $\mu_{ij} = E(Y_{ij}) = E[E(Y_{ij} \mid \theta_i)] = \int_\theta g^{-1}[x_{ij}'\beta + \sigma_\upsilon \theta_i]\, f(\theta)\, d\theta$
When $g$ is a nonlinear function, like logit, and if we assume that
     $g[E(Y_{ij} \mid \theta_i)] = x_{ij}'\beta + \sigma_\upsilon \theta_i$
it is usually not true that
     $g(\mu_{ij}) = x_{ij}'\beta$
unless $\theta_i = 0$ for all $i$ subjects, or $g$ is the identity link (i.e., the normal regression model for $y$)
- same reason why the log of the mean of a series of values does not, in general, equal the mean of the log of those values (i.e., the log is a nonlinear function)
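The gap between the conditional and marginal logits can be seen numerically. The sketch below assumes the random effect is standard normal, $\theta_i \sim N(0, 1)$, and approximates the integral with a trapezoid rule on a fixed grid; both the distributional assumption and the function names are choices made for this example. It uses the random-intercepts intercept (-.661) and subject standard deviation (2.004) from the Reisby analysis.

```python
import math

def expit(eta):
    """Inverse logit, g^{-1}(eta)."""
    return 1.0 / (1.0 + math.exp(-eta))

def logit(p):
    """Logit link, g(p)."""
    return math.log(p / (1.0 - p))

def marginal_mean(xb, sd, lo=-8.0, hi=8.0, steps=1600):
    """Average expit(xb + sd*theta) over a standard normal theta (trapezoid rule)."""
    h = (hi - lo) / steps
    num = den = 0.0
    for s in range(steps + 1):
        theta = lo + s * h
        w = math.exp(-0.5 * theta * theta) * (0.5 if s in (0, steps) else 1.0)
        num += w * expit(xb + sd * theta)
        den += w   # normalizing by the summed weights keeps the average exact for constants
    return num / den

conditional_logit = -0.661
marginal_logit = logit(marginal_mean(conditional_logit, 2.004))
no_heterogeneity = logit(marginal_mean(conditional_logit, 0.0))
```

With no heterogeneity the two logits coincide; with a subject standard deviation of 2.004 the marginal logit is pulled toward zero, in the same direction as the smaller GEE intercept.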

Random-intercepts Model - Reisby example
- every subject has their own propensity for response ($\theta_i$)
- the effect of DMI is the same for every subject ($\beta_1$)
- covariance among the repeated obs is explicitly modeled
- $\beta_0$ = log odds of response for a typical subject with DMI = 0 and $\theta_i = 0$
- $\beta_1$ = log odds ratio of response when a subject is high on DMI relative to when that same subject is not
  – On average, how a subject’s response probability depends on DMI
  – Strictly speaking, it’s not really the “same subject,” but “subjects with the same value of $\theta_i$”
- $\sigma_\upsilon$ represents the degree of heterogeneity across subjects in the probability of response, not attributable to DMI

- Most useful when the objective is to make inference about subjects rather than the population ...
