Standards For Heterogeneity Of Treatment Effect (HTE)


Standards for Heterogeneity of Treatment Effect (HTE)
Ravi Varadhan, PhD (Biostat), PhD (Chem. Engg)
Chenguang Wang, PhD (Statistics)
Johns Hopkins University
July 21, 2015

Module 1: Introduction

Objectives
- To be able to appreciate the importance of HTE assessment
- To understand the challenges of HTE assessment
- To learn the statistical concepts pertaining to HTE assessment
- To learn about the various types of inferential techniques for HTE assessment
- To learn a software tool for conducting HTE assessment using a case study example
- To understand and apply the PCORI methodology standards for HTE assessment

PCORI Standards for HTE
1. HT-1: State the goals of HTE analyses.
2. HT-2: For all HTE analyses, pre-specify the analysis plan; for hypothesis-driven HTE analyses, pre-specify hypotheses and supporting evidence base.
3. HT-3: All HTE claims must be based on appropriate statistical contrasts among comparison groups, such as interaction tests or estimates of differences in treatment effect.
4. HT-4: For any HTE analysis, report all pre-specified analyses and, at minimum, the number of post-hoc analyses, including all subgroups and outcomes analyzed.

Importance of the Standards
- Understanding HTE is a core aspect of PCOR, and the standards are aimed at enhancing the reliability of HTE findings from PCOR studies
- The standards are broadly applicable
  - All study types: experimental/observational studies, explanatory/pragmatic trials, claims/EHR databases, etc.
  - Should be considered during all 3 stages of a study: design, analysis, and reporting

What is HTE?
- At-risk individuals are heterogeneous (age, sex, disease etiology and severity, comorbidities, co-exposures, genetic variants)
- Often, individuals respond differently to the same treatment
- Treatment response variation = explainable variation + random fluctuation
- HTE is the explainable variation in treatment response that is attributable to individual characteristics

ITE and HTE
- Two treatments: $Z_i = 0$ and $Z_i = 1$, $i = 1, \ldots, N$.
- Two potential responses: $\{Y_i(Z = 0), Y_i(Z = 1)\}$.
- Individual treatment effect (ITE) is a comparison of the two potential responses: e.g., $\theta_i = Y_i(1) - Y_i(0)$.
- A necessary condition for HTE to be present: $\mathrm{var}(\theta) > 0$.
- The condition that $\mathrm{var}(\theta) = 0$ is known as unit-treatment additivity: $Y_i(1) - Y_i(0) = \tau, \ \forall i$ (Cox 1984)

Fundamental Problem of HTE
- However, we can only observe either $Y_i(Z = 0)$ or $Y_i(Z = 1)$, but not both (except in very rare cases)
- Hence, ITE is not identifiable
- We can consider groups of similar individuals and estimate a group-specific treatment effect (GTE) for each group
- This is subgroup analysis (to be discussed in another module)
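The identifiability problem can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the binary characteristic `x`, the effect sizes 2.0 and 0.5, and the sample size are all hypothetical. It generates both potential responses for every individual, keeps only the one matching the randomized assignment, and then recovers the average and group-specific effects from the observed data alone.

```python
import random
import statistics

def simulate_trial(n=20000, seed=1):
    """Simulate potential outcomes with heterogeneous individual effects."""
    rng = random.Random(seed)
    observed = []       # (z, y, x) triples: what a trial actually records
    true_effects = []   # theta_i, never observable in practice
    for _ in range(n):
        x = int(rng.random() < 0.5)      # hypothetical binary characteristic
        y0 = rng.gauss(10.0, 1.0)        # potential response under Z = 0
        theta = 2.0 if x else 0.5        # individual effect depends on x (assumed)
        y1 = y0 + theta                  # potential response under Z = 1
        z = int(rng.random() < 0.5)      # randomized assignment
        observed.append((z, y1 if z else y0, x))
        true_effects.append(theta)
    return observed, true_effects

def mean_diff(rows):
    """Difference in mean observed response, treated minus control."""
    treated = [y for z, y, _ in rows if z == 1]
    control = [y for z, y, _ in rows if z == 0]
    return statistics.fmean(treated) - statistics.fmean(control)

observed, true_effects = simulate_trial()
var_theta = statistics.pvariance(true_effects)   # > 0, so HTE is present
ate_hat = mean_diff(observed)                    # estimates the average effect
gte_hat = {x: mean_diff([r for r in observed if r[2] == x]) for x in (0, 1)}
```

The individual effects in `true_effects` are available here only because the data are simulated; a real trial yields `observed` and nothing more, which is why the analysis falls back on group-specific effects such as `gte_hat`.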

Reading List
1. Berry DA (1990). Subgroup analyses. Biometrics. 46: 1227-1230.
2. Cox DR (1984). Interaction. Intl. Statist. Rev. 52: 1-24.
3. Gabriel SE and Normand SLT (2012). Getting the methods right - the foundation of patient-centered outcomes research. NEJM. 367: 787-790.

Module 2: Why is HTE important?

Individual Variability
"If it were not for the great variability between the individuals, medicine might as well be a science, not an art." (William Osler, 1892)
One man’s meat is another man’s poison

Average Treatment Effect (ATE)
"The paradox of the clinical trial is that it is the best way to assess whether an intervention works, but is arguably the worst way to assess who benefits from it." (Mant, 1999)
What’s good for the goose is good for the gander

Can We Do Better Than ATE?
"Answers from the clinical trial represent a group reaction. They show that one group fared better than another, that given certain treatment patients, on average, get better more frequently or more rapidly, or both. We cannot necessarily, perhaps very rarely, pass from that to stating exactly what effect the treatment will have on a particular patient. But there is surely, no way and no method of deciding that. Also, it is well to remember that observation of the group does not inhibit the most scrupulous and careful observation of the individual at the same time - if it is believed that more can be learnt that way."
Sir Austin Bradford Hill, 1952

HTE Assessment Is Perilous
- HTE assessment is important and dangerous
- Numerous examples of spurious or unreplicated findings (e.g., astrological signs in ISIS-2)

Percent reduction in 5-week vascular mortality:
  Gemini or Libra    -9% (NS)
  Other signs        28% ($p < 10^{-5}$)
  Overall            23% ($p < 10^{-5}$)

- Peter Rothwell’s monograph on “Treating Individuals” contains more examples

Why is HTE Assessment Perilous?
- ATE is a first-order contrast, i.e. a comparison between two treatment groups
- HTE analysis involves second-order or higher-order contrasts, i.e. difference of differences (e.g., difference in treatment effect between men and women)
- Higher-order effects have larger variance due to reduced information content - or, equivalently, due to reduced sample size in subgroups
- Increased false-positive (type I) and false-negative (type II) errors for identifying HTE
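The variance penalty for a second-order contrast can be made concrete with a short calculation. The sketch below assumes a deliberately simple design that is not taken from the slides: a continuous outcome with common standard deviation `sigma`, 1:1 randomization, and two equally sized subgroups. Under those assumptions the standard error of the difference of differences is exactly twice that of the overall treatment effect.

```python
import math

def se_ate(n_total, sigma):
    """SE of a difference in means with n_total / 2 patients per arm."""
    return math.sqrt(2 * sigma**2 / (n_total / 2))

def se_interaction(n_total, sigma):
    """SE of a difference of differences with n_total / 4 patients per cell."""
    se_subgroup_effect = math.sqrt(2 * sigma**2 / (n_total / 4))
    # Two independent subgroup effects are subtracted, so variances add.
    return math.sqrt(2) * se_subgroup_effect

n_total, sigma = 400, 1.0          # hypothetical trial size and outcome SD
ratio = se_interaction(n_total, sigma) / se_ate(n_total, sigma)
```

Because the ratio is 2, matching the precision of the ATE would take roughly four times as many patients under this design, which is one way to see why trials sized for the ATE have limited power for HTE.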

The Central Challenge of HTE Analysis
To identify which types of individuals are likely to benefit from or be harmed by the treatment, while minimizing the possibility of spurious inferences due to biases and chance associations.

Reading List
1. Berry DA (1990). Subgroup analyses. Biometrics. 46: 1227-1230.
2. Fleming TR (2010). Clinical trials: discerning hype from substance. Ann Intern Med. 153: 400-406.
3. Jones HE et al. (2011). Bayesian models for subgroup analysis in clinical trials. Clin Trials. 8: 129-143.
4. Rothwell PM (2007). The Lancet: Treating Individuals: from randomised trials to personalised medicine. Elsevier.
5. Varadhan R et al. (2013). A framework for the analysis of heterogeneity of treatment effect in patient-centered outcomes research. J Clin. Epidem. 66: 818-825.

Module 3: Subgroup Analysis

Subgroup Analysis
- We focus on the setting of a randomized clinical trial of treatment $Z$ for response $Y$ with $K$ categorical baseline variables or factors, $X_1, \ldots, X_K$
- Also applies to non-experimental settings, but possibly with some additional considerations such as confounded treatment assignment
- It is supposed a priori that any of these factors could be a source of HTE (selection of these factors is a critical issue, which will be discussed in another module)
- Two types of subgroups: univariate and multivariate subgroups

Univariate Subgroup Analysis
- Each subgroup is defined on the basis of a single baseline variable
- For example: men and women; smoker/non-smoker; normal/overweight/obese
- HTE is examined separately for each subgrouping variable
- Note that the subgroups are not mutually exclusive

Subgroup-Specific Treatment Effects

Limitations
- Comparisons are indirect, and stratification reduces power

One-by-one interaction testing (OBO)
$g(E[Y \mid X]) = \alpha_{01} + \alpha_{11} Z + \beta_1 X_1 + \gamma_1 X_1 Z$
$\vdots$
$g(E[Y \mid X]) = \alpha_{0K} + \alpha_{1K} Z + \beta_K X_K + \gamma_K X_K Z$
Remarks
- A proper way to test for HTE
- $X_k$ is assumed to be binary
- Bonferroni correction for type I error inflation
- Increases type I error without adjustment for multiplicity
- Adjustment for multiplicity decreases power
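A minimal sketch of one-by-one interaction testing follows, under several assumptions that are not from the slides: an identity link, binary $X_k$, a large-sample normal approximation for the p-value, and a simulated data set in which only the first factor modifies the treatment effect. With a binary factor and identity link, the estimate of $\gamma_k$ in the saturated model equals the difference of differences of the four cell means, so no regression library is needed.

```python
import math
import random
import statistics

def interaction_test(rows):
    """rows: (z, x, y) triples with binary z and x.
    Returns (gamma_hat, se, two_sided_p) for the Z*X interaction, identity link."""
    cells = {(z, x): [] for z in (0, 1) for x in (0, 1)}
    for z, x, y in rows:
        cells[(z, x)].append(y)
    mean = {k: statistics.fmean(v) for k, v in cells.items()}
    var_of_mean = {k: statistics.variance(v) / len(v) for k, v in cells.items()}
    # Difference of differences: effect when x = 1 minus effect when x = 0.
    gamma_hat = (mean[1, 1] - mean[0, 1]) - (mean[1, 0] - mean[0, 0])
    se = math.sqrt(sum(var_of_mean.values()))
    p = math.erfc(abs(gamma_hat / se) / math.sqrt(2))   # normal approximation
    return gamma_hat, se, p

def bonferroni_reject(p_values, alpha=0.05):
    """Flag tests that stay significant after dividing alpha by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Hypothetical simulated trial: K = 3 binary factors, only the first modifies the effect.
rng = random.Random(7)
data = []
for _ in range(4000):
    z = rng.randint(0, 1)
    x = [rng.randint(0, 1) for _ in range(3)]
    y = 1.0 + 0.5 * z + 0.3 * x[0] + 1.0 * x[0] * z + rng.gauss(0.0, 1.0)
    data.append((z, x, y))

results = [interaction_test([(z, x[k], y) for z, x, y in data]) for k in range(3)]
reject = bonferroni_reject([p for _, _, p in results])
```

Each factor gets its own model and its own test, which is why the per-test threshold shrinks to `alpha / K` and power falls as more factors are screened.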

Univariate Subgroup Analysis
- Most popular approach - reported in most Phase 3 clinical trials
- Evaluate whether the treatment effect is consistent across univariate subgroups
- The subgroups are not mutually exclusive; therefore, subgroup-specific treatment effects are correlated
- Subgroups are also more similar to each other since they differ in only one characteristic, so HTE is less likely to be detected

Multivariate Subgroup Analysis
- Each subgroup is defined on the basis of multiple baseline variables
- For example: men-smoker, women-smoker, men-non-smoker, and women-non-smoker
- Mutually exclusive subgroups; therefore, the raw subgroup effects are independent

Multivariate Subgroup Analysis
Our interest lies in understanding how the effect of treatment $Z$ on $Y$ varies jointly according to baseline characteristics.
We can define the treatment effect within any given stratum $x = \{x_1, \cdots, x_K\}$ as follows:
$\theta(x) = g(E[Y \mid Z = 1, X = x]) - g(E[Y \mid Z = 0, X = x])$,
where $E[\cdot]$ denotes the expectation (mean) of $Y$, and $g(\cdot)$ is a link function denoting the scale in which the treatment effect is quantified.

Multivariate Subgroup Analysis
- Commonly used treatment effect scales are:
  - identity link: $g(x) = x$
  - log link: $g(x) = \log(x)$
  - logit link: $g(x) = \log\left(\frac{x}{1 - x}\right)$
- Curse of dimensionality severely limits the number of characteristics
- Some type of modeling becomes necessary to share information across subgroups (we will discuss Bayesian approaches in another module)
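To make the role of the link function concrete, the sketch below evaluates the stratum-specific effect $\theta(x)$ on each of the three scales for a binary outcome. The event risks 0.30 and 0.15 are hypothetical values chosen for illustration, and the names `LINKS` and `theta` are not from the slides.

```python
import math

# Link functions g(.) for the three common treatment effect scales.
LINKS = {
    "identity": lambda p: p,
    "log": lambda p: math.log(p),
    "logit": lambda p: math.log(p / (1.0 - p)),
}

def theta(p_treated, p_control, link):
    """g(E[Y | Z=1, X=x]) - g(E[Y | Z=0, X=x]) within a single stratum x."""
    g = LINKS[link]
    return g(p_treated) - g(p_control)

# Hypothetical event risks in one stratum.
p_control, p_treated = 0.30, 0.15
effects = {name: theta(p_treated, p_control, name) for name in LINKS}
# identity -> risk difference, log -> log relative risk, logit -> log odds ratio
```

The same pair of risks gives a different number on each scale, so a statement that the effect is "the same" across strata is only meaningful once the scale is fixed.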

Reading List
1. Berry DA (1990). Subgroup analyses. Biometrics. 46: 1227-1230.
2. Fleming TR (2010). Clinical trials: discerning hype from substance. Ann Intern Med. 153: 400-406.
3. Jones HE et al. (2011). Bayesian models for subgroup analysis in clinical trials. Clin Trials. 8: 129-143.
4. Rothwell PM (2007). The Lancet: Treating Individuals: from randomised trials to personalised medicine. Elsevier.
5. Varadhan R et al. (2013). A framework for the analysis of heterogeneity of treatment effect in patient-centered outcomes research. J Clin. Epidem. 66: 818-825.

Module 4: Regression Models for HTE Analysis

Interaction and Effect-Modification
- Two types of baseline factors $X$ (DR Cox, ISR 1984)
- No difference statistically, but important interpretationally
- When $X$ is another treatment or a modifiable factor (e.g., smoking), we say that $X$ interacts with treatment $Z$
- When $X$ is an intrinsic variable or a fixed attribute (e.g., gender), we say that $X$ modifies the effect of treatment $Z$

Risk-Based HTE Analysis
- In conventional analysis, each covariate might have a small effect (weak analyses with multiplicity concerns)
- Covariates might combine in some fashion to reveal HTE (joint effects)
- Primary outcome risk is a potentially strong HTE predictor (Kent 2010)
- Outcome risk, $P(Y = 1 \mid Z = 0)$, is mathematically related to the treatment effect, for example:
  $\mathrm{RRR} = 1 - \mathrm{RR} = \frac{P(Y = 1 \mid Z = 0) - P(Y = 1 \mid Z = 1)}{P(Y = 1 \mid Z = 0)}$

Risk-Based HTE Analysis
- Outcome risk is computed for each individual
- Outcome risk distribution is reported separately for the two treatment arms
- Treatment effect (relative risk or absolute risk reduction) is estimated and reported within strata of outcome risk
- Formal interaction test of outcome risk with treatment is not emphasized

Risk-Based HTE Analysis
- How to estimate the outcome risk for each individual?
- Pre-existing, validated prognostic scores (e.g., Framingham risk score, CHADS2, Karnofsky performance scale, FRAX, frailty index)
- What if validated prognostic models are not available (or if the model does not calibrate well to the present study)?
- We may use the control arm to develop a prognostic index, but this underestimates uncertainty and HTE estimation could be biased

Unstructured interaction model (UIM)
$g(E[Y]) = \alpha_0 + \alpha_1 Z + \beta_1 X_1 + \cdots + \beta_K X_K + \gamma_1 X_1 Z + \cdots + \gamma_K X_K Z$
Comments
- All possible two-way Z*X interactions
- Model has $2K + 2$ parameters
- Single test of any possible interactions
- Power decreases with increasing $K$

Proportional interactions model (PIM)
$g(E[Y]) = \alpha_0 (1 - Z) + \alpha_1 Z + \beta' X (1 - Z) + \theta (\beta' X) Z$
- A single parameter represents HTE
- In UIM, the $\gamma$ interaction coefficients $(\gamma' X) Z$ can take any value
- PIM is a special case of UIM: $\gamma = \theta \beta$, $\theta$ a scalar (here $(\gamma' X) Z$ takes the place of the treated-arm term $\theta (\beta' X) Z$)
- Main assumption: prognostic effects for treated subjects are proportional to those of control subjects
- Similar in spirit to risk-based HTE approach, but more powerful and broadly applicable

Proportional interactions model (PIM)
$g(E[Y]) = \alpha_0 (1 - Z) + \alpha_1 Z + \beta' X (1 - Z) + \theta (\beta' X) Z$
- Under PIM: $H_0: \theta = 1$ versus $H_1: \theta \neq 1$
- Rejecting $H_0$ is evidence that proportional interaction applies
- Not rejecting $H_0$ does not necessarily imply absence of HTE
- More challenging to interpret HTE findings

Reading List
1. Cox DR (1984). Interaction. Intl. Statist. Rev. 52: 1-24.
2. Gail M and Simon R (1985). Testing for Qualitative Interactions Between Treatment Effects and Patient Subsets. Biometrics. 41: 361-372.
3. Kent DM, et al. (2010). Assessing and reporting heterogeneity in treatment effects in clinical trials: a proposal. Trials. 11: 85.
4. Kovalchik S et al. (2013). Assessing the heterogeneity of treatment effect in a clinical trial with the proportional interactions model. Stat. Med. 32: 4906-4923.
5. Rothwell PM (2007). The Lancet: Treating Individuals: from randomised trials to personalised medicine. Elsevier.
6. VanderWeele TJ and Knol MJ (2014). A tutorial on interactions. Epidemiol. Methods. 3: 33-72.

Module 5: Bayesian Models for HTE Analysis

What is a Bayesian HTE Analysis?
- In Bayesian analysis, all unknown parameters (e.g., main effects, interaction effects) are treated as random variables
- Observed data $\{X, Z, Y\}$ are considered to be fixed
- In the more popular frequentist approaches, by contrast, the unknown parameters are fixed, and the observed data are a particular instance of a random process
- Bayes methods require specification of prior beliefs for the unknown parameters, in the form of prior probability distributions

What is a Bayesian HTE Analysis?
- Like frequentist methods, they require a statistical model for the data generating process
- Using Bayes' rule, the prior distributions and data generating model are combined to produce updated, posterior distributions for the unknown parameters
- Powerful computational tools are often used (e.g., Markov Chain Monte Carlo techniques) to generate samples of unknown parameters from the posterior distribution
- Summaries of the posterior distribution comprise the main results of a Bayesian analysis

Why Consider Bayesian HTE Analysis?
- The phrase ‘heterogeneity of treatment effects’ implies an underlying distribution of treatment effects
- Thus a Bayesian framework is natural
- A Bayesian analysis does not emphasize whether a statistical procedure for detecting HTE is significant or not - an arbitrarily dichotomous decision
- A Bayesian approach emphasizes estimation of the magnitude of HTE

Why Consider Bayesian HTE Analysis?
- Bayesian approach can exploit prior knowledge to increase the precision of subgroup-specific effects
- Typically, model-based Bayesian estimates will have less uncertainty than estimates from separate analyses of subgroups (e.g., raw subgroup-specific effects)
- By sharing information across subgroups, according to the model, the Bayesian approach will stabilize the raw estimate by pulling it back (shrinkage) towards the overall treatment effect
- This produces estimates with lower mean-squared error.

Why Consider Bayesian HTE Analysis?
- Bayes approach supports simple and direct probability statements about subgroup-specific effects
- For example, we can ask: ‘What is the probability that treatment A is better than treatment B for women?’ or ‘Do men benefit more from treatment than women?’
- Such summaries can be readily understood by patients and other stakeholders
- Frequentist approach, however, only permits statements about the likelihood of observed data, under the hypothesized value of the effects

Simple Bayesian Models

Simple Shrinkage

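Simple Shrinkage
Prior: $[\theta_1, \ldots, \theta_K \mid \mu, \omega] \sim N(\mu, \omega^2)$
Data: $[Y_k \mid \theta_k, \sigma_k^2] \sim N(\theta_k, \sigma_k^2)$
Posterior: $[\theta_k \mid \mu, \omega, Y_k] \sim N(\cdot, \cdot)$
Posterior mean of $\theta_k$: $\mu + \frac{\omega^2}{\sigma_k^2 + \omega^2} (Y_k - \mu)$
Hyper-priors for $\mu$, $\omega$, and $\sigma_k^2$ ($= \sigma^2 / n_k$):
- $\mu \sim N(0, \mathrm{BIG})$
- $\omega \sim$ half-normal ($N_{1/2}$), Gamma, or IG
- $\log(\sigma) \sim \mathrm{Unif}(-a, a)$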
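The posterior mean formula can be tried out numerically. The sketch below is a simplification, not the full model: it treats $\mu$ and $\omega^2$ as known constants instead of giving them hyper-priors, so no MCMC is involved. The function name `shrink` and all input values are hypothetical.

```python
def shrink(y_k, sigma2_k, mu, omega2):
    """Posterior mean and variance of theta_k in the normal-normal model,
    conditional on mu and omega^2 (both treated as known here)."""
    weight = omega2 / (sigma2_k + omega2)    # weight given to the raw estimate Y_k
    post_mean = mu + weight * (y_k - mu)     # raw estimate pulled towards mu
    post_var = weight * sigma2_k             # equals 1 / (1/omega2 + 1/sigma2_k)
    return post_mean, post_var

# Hypothetical raw subgroup effects Y_k and their sampling variances sigma_k^2.
raw_effects = [0.40, 0.10, -0.20]
raw_variances = [0.04, 0.01, 0.09]
mu, omega2 = 0.10, 0.01                      # assumed overall effect and between-subgroup variance

shrunk = [shrink(y, s2, mu, omega2) for y, s2 in zip(raw_effects, raw_variances)]
```

Subgroups with noisier raw estimates (larger $\sigma_k^2$) get a smaller weight and are pulled harder towards $\mu$, and every posterior variance is smaller than the corresponding raw variance.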

Simple Shrinkage.

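Regression
Prior: $[\theta_k \mid X, \beta, \omega] \sim N(\beta_0 + X_k \beta, \omega^2)$
Data: $[Y_k \mid \theta_k, \sigma_k^2] \sim N(\theta_k, \sigma_k^2)$
Posterior: $[\theta_k \mid \beta_0, \beta, \omega, Y_k] \sim N(\cdot, \cdot)$
$X_k$ are binary/categorical
Hyper-priors for $\beta$, $\omega$, and $\sigma_k^2$ ($= \sigma^2 / n_k$):
- $\beta_0, \beta \sim N(0, \mathrm{BIG})$
- $\omega \sim$ half-normal ($N_{1/2}$), Gamma, or IG
- $\log(\sigma) \sim \mathrm{Unif}(-a, a)$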

These models and a number of other Bayesian regression models can be fit using BEANZ software

Reading List
1. Berry DA (1990). Subgroup analyses. Biometrics. 46: 1227-1230.
2. Dixon DO and Simon R (1991). Bayesian subset analysis. Biometrics. 47: 871-881.
3. Jones HE et al. (2011). Bayesian models for subgroup analysis in clinical trials. Clin Trials. 8: 129-143.
4. Spiegelhalter DJ et al. (2004). Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Wiley.

Module 6: Treatment Effect Scale and Qualitative Interactions

HTE Depends on Effect Scale
Table. Probability of an adverse event

           X = 0    X = 1
  Z = 0    0.10     0.20
  Z = 1    0.04     0.08

- $Z$ is the treatment and $X = 0$ and $X = 1$ are the subgroups.
- Relative risk reduction is the same, 60%, for both subgroups - no HTE
- But absolute risk reduction is greater for $X = 1$ (0.12 versus 0.06) - substantial HTE
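The two conclusions drawn from the table can be checked directly. The short sketch below uses the four probabilities from the table; the helper names `rrr` and `arr` are only illustrative.

```python
# P(adverse event) keyed by (Z, X), taken from the table above.
risk = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.04, (1, 1): 0.08,
}

def rrr(x):
    """Relative risk reduction in subgroup x: 1 - P(event | Z=1) / P(event | Z=0)."""
    return 1 - risk[1, x] / risk[0, x]

def arr(x):
    """Absolute risk reduction in subgroup x: P(event | Z=0) - P(event | Z=1)."""
    return risk[0, x] - risk[1, x]

relative = {x: rrr(x) for x in (0, 1)}   # about 0.60 in both subgroups
absolute = {x: arr(x) for x in (0, 1)}   # about 0.06 for X = 0 and 0.12 for X = 1
```

The relative measure is constant because the treated risk is the same fraction of the control risk in both subgroups, while the absolute measure scales with the control risk, which differs between them.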

HTE Depends on Effect Scale
- Potential responses: $\{Y_i(Z = 0), Y_i(Z = 1)\}$.
- Individual treatment effect is a contrast between potential responses: e.g., $\theta_i = Y_i(1) - Y_i(0)$, or $\theta_i = Y_i(1) / Y_i(0)$.
- HTE present iff $\theta_i \neq \theta_j$ for some $i, j$.
- Homogeneity on the relative scale does not imply homogeneity on the difference scale - this would happen when $Y_i(0)$ varies

Removable Interaction
- Interaction is removable if a transformation of $Y$ induces additivity
- For continuous $Y$, a nonlinear variance-stabilizing transformation, $h(Y)$, can be found (Cox, ISR 1984) (interpretation is important!)
- Removable interactions are commonly known as quantitative interactions
- More difficult for binary $Y$
- Qualitative interactions are scale invariant - “essential interactions”

On which scale should HTE be evaluated?
- The scale in which the model best fits the data?!
- The scale that provides a parsimonious model, i.e. without interaction terms?!
- The traditional scale (e.g., logit)?!
- In any case, the results should also be reported in terms of absolute difference on the basic scale, i.e. difference in means or risk difference
- The absolute difference scale is easy to interpret and is also more pertinent for policy or treatment decision purposes

Quantitative and Qualitative Interactions
- Quantitative HTE = presence of any HTE
- Qualitative HTE = benefit in some subgroups and harm in others

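Gail-Simon Test for Qualitative Interaction
- True treatment effects in $K$ mutually exclusive subgroups: $\delta = \{\delta_1, \cdots, \delta_K\}$
- Positive orthant, $O^+ = \{\delta : \delta_i \geq 0, \ \forall i\}$; negative orthant, $O^- = \{\delta : \delta_i \leq 0, \ \forall i\}$
- Null $H_0: \delta \in O^+ \cup O^-$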

Qualitative Interaction: Gail-Simon Test

Gail-Simon Test for Qualitative Interaction
- $Q^+ = \sum_k I(D_k < 0)\,(D_k^2 / s_k^2) > c$; $\quad Q^- = \sum_k I(D_k > 0)\,(D_k^2 / s_k^2) > c$,
  where $D_k$ is the treatment effect in subgroup $k$ and $s_k^2$ is its variance.
- Likelihood ratio test: $\min(Q^+, Q^-) > c$
- The critical value $c$ is determined so as to get the desired probability of rejection under $H_0$

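Gail-Simon Test: An Illustration
NSABP data: treatment effect measured as difference in disease-free proportion at 3 years (PF minus PFT), in four subgroups defined by age (relative to 50) and PR (relative to 10). Entries that did not survive in this transcription are shown as "…".

  Subgroup   PF (SE)          PFT (SE)         Effect, D_i   se, s_i   D_i^2 / s_i^2
  1          0.599 (0.054)    0.436 (0.057)     0.163        0.0788    4.28
  2          0.526 (0.051)    …                -0.114        0.0689    2.72
  3          …                …                -0.047        0.0614    0.59
  4          … (0.039)        0.790 (0.039)    -0.151        0.0547    7.58

Gail-Simon test statistics: $Q^+ = 10.89$, $Q^- = 4.28$
Critical value (at $\alpha = .05$) = 5.43
Since $\min(Q^+, Q^-) = 4.28 < 5.43$, the null hypothesis of no qualitative interaction is not rejected on this scale.
M Gail and R Simon (Biometrics, 1985)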
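The test statistics in this illustration can be reproduced from the four effects and standard errors. The sketch below is an illustration, not the authors' software: the function names are hypothetical, and the p-value uses the binomial mixture of chi-square tail probabilities as I understand it from Gail and Simon (1985), which should be checked against the paper before any real use. Small differences from the slide values are expected because the inputs are rounded.

```python
import math

def chi2_sf(x, dof):
    """P(chi-square with integer dof > x), built up by the standard recursion on dof."""
    if x <= 0:
        return 1.0
    if dof % 2 == 1:
        sf, k = math.erfc(math.sqrt(x / 2)), 1    # dof = 1
    else:
        sf, k = math.exp(-x / 2), 2               # dof = 2
    while k < dof:
        sf += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return sf

def gail_simon_p(q, n_subgroups):
    """Assumed p-value formula: binomial(K-1, 1/2) mixture of chi-square tails."""
    k = n_subgroups
    return sum(math.comb(k - 1, i) * 0.5 ** (k - 1) * chi2_sf(q, i) for i in range(1, k))

def gail_simon(effects, std_errors):
    """Return (Q_plus, Q_minus, p_value) for the qualitative interaction test."""
    terms = [(d, (d / s) ** 2) for d, s in zip(effects, std_errors)]
    q_plus = sum(t for d, t in terms if d < 0)    # evidence against the positive orthant
    q_minus = sum(t for d, t in terms if d > 0)   # evidence against the negative orthant
    return q_plus, q_minus, gail_simon_p(min(q_plus, q_minus), len(effects))

# Effects D_i and standard errors s_i from the NSABP illustration.
D = [0.163, -0.114, -0.047, -0.151]
s = [0.0788, 0.0689, 0.0614, 0.0547]
q_plus, q_minus, p_value = gail_simon(D, s)
```

Under the assumed formula, `gail_simon_p(5.43, 4)` comes out close to 0.05, which agrees with the critical value quoted for four subgroups, and the p-value for these data is above 0.05.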

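Gail-Simon Test: An Illustration
Re-analysis with treatment effect measured as log hazard ratio
(Table of effects $D_i$, standard errors $s_i$, and $D_i^2 / s_i^2$ for the four age-by-PR subgroups; the entries did not survive in this transcription.)
Gail-Simon test statistics: $Q^+ = 14.38$, $Q^- = 6.54$
Critical value (at $\alpha = .05$) = 5.43
Since $\min(Q^+, Q^-) = 6.54 > 5.43$, the null hypothesis of no qualitative interaction is rejected on this scale.
M Gail and R Simon (Biometrics, 1985)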

Reading List
1. Cox DR (1984). Interaction. Intl. Statist. Rev. 52: 1-24.
2. Gail M and Simon R (1985). Testing for Qualitative Interactions Between Treatment Effects and Patient Subsets. Biometrics. 41: 361-372.
3. VanderWeele TJ and Knol MJ (2014). A tutorial on interactions. Epidemiol. Methods. 3: 33-72.

Module 8: HTE Analysis Plan and Results Reporting

Main Issues to Think About
- What is the goal of the HTE analyses?
- How will the HTE findings inform patient-centered healthcare decisions?
- How should I conduct HTE analyses?
- How should I report HTE results?

PCORI Standards for HTE
1. HT-1: State the goals of HTE analyses.
2. HT-2: For all HTE analyses, pre-specify the analysis plan; for hypothesis-driven HTE analyses, pre-specify hypotheses and supporting evidence base.
3. HT-3: All HTE claims must be based on appropriate statistical
