

Reproducibility — Replicability: P-values and the Larger Questions

Andreas Buja
Department of Statistics, The Wharton School
University of Pennsylvania, Philadelphia, USA

With the PoSI group: Richard Berk, Larry Brown, Linda Zhao, Kai Zhang, Ed George

NRC, 2015/02/26-27

P-Values

Boos & Stefanski's contribution:
- Raising awareness of sampling variability in p-values.
- Showing that it can be quantified.

Fundamental question: Seeing a p-value, do we believe that under replication something close to it would appear again and again? (See Steve Goodman citing Fisher.)
Use more stringent cut-offs than 0.05 to achieve replicable 0.05.

Basic pedagogical problems:
- P-values are random variables! They smell like probabilities but are transformed/inverted test statistics.
- The sense of "random variable": "sampling variability" is a tragically belittling term for a deep concept!
- "Sampling variability" = dataset-to-dataset variability = possible-worlds variability

Andreas Buja (Wharton, UPenn)   Reproducibility — Replicability: P-values and the Larger Questions   2015/02/26-27   2 / 1
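The claim that p-values are random variables can be made concrete with a small simulation. The sketch below is not from the talk: it assumes a one-sample two-sided z-test with known unit variance, and the effect size, sample size, and function names are hypothetical choices for illustration. It repeats the same study many times and summarizes how widely the resulting p-values spread.

```python
import math
import random
import statistics

def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0, assuming a known standard deviation."""
    z = statistics.fmean(sample) / (sigma / math.sqrt(len(sample)))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

def replicate_p_values(effect=0.4, n=50, n_replications=2000, seed=0):
    """Run the same hypothetical study repeatedly and collect its p-values."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(effect, 1.0) for _ in range(n)])
        for _ in range(n_replications)
    ]

p_values = replicate_p_values()
deciles = statistics.quantiles(p_values, n=10)
share_below_005 = sum(p < 0.05 for p in p_values) / len(p_values)

print(f"10th percentile of p: {deciles[0]:.5f}")
print(f"median of p:          {statistics.median(p_values):.5f}")
print(f"90th percentile of p: {deciles[-1]:.5f}")
print(f"share with p < 0.05:  {share_below_005:.2f}")
```

Even with a real effect and identical design, the p-values from one replication to the next typically span orders of magnitude, which is the dataset-to-dataset variability the slide refers to.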

P-Values: From Variability to Bias

Source of bias: standard error SE (tie to Benjamini and G L)

Generic $SE^2$, with $X_i$ standardized:
- Assumption: $X_i$ uncorrelated $\Rightarrow$ $V[\bar{X}] = 1/n$
- Consider exchangeable dependence: $\mathrm{Corr}[X_i, X_j] = \rho > 0$ for $i \neq j$. Example: $\rho = 0.01$
  $V[\bar{X}] = (1-\rho)/n + \rho \ge \rho > 0$ $\Rightarrow$ $SE \ge \sqrt{\rho} = 0.1$, never mind $n$.

Random effects model for research studies:
  $X_{\mathrm{study},i} = \alpha_{\mathrm{study}} + \epsilon_{\mathrm{study},i}$
  $\mathrm{Corr}[X_{\mathrm{study},i}, X_{\mathrm{study},j}] = \sigma_\alpha^2 / (\sigma_\alpha^2 + \sigma_\epsilon^2) = \rho$
  (exchangeable intra-study correlation)

Message: Don't ask for larger studies; ask for multiple studies.
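The variance formula under exchangeable correlation can be checked numerically. The sketch below evaluates $V[\bar{X}] = (1-\rho)/n + \rho$, confirms it against a direct sum over the covariance matrix, and shows the SE floor at $\sqrt{\rho}$. The last function is an added assumption, not from the slides: it treats $k$ studies as mutually independent, each with intra-study correlation $\rho$, to illustrate why more studies help where larger studies do not. All function names are hypothetical.

```python
import math

def var_of_mean(n, rho):
    """V[X-bar] for n standardized variables with pairwise correlation rho."""
    return (1.0 - rho) / n + rho

def var_of_mean_from_covariance(n, rho):
    """Same quantity computed directly: sum of all covariance entries over n^2."""
    total = sum(1.0 if i == j else rho for i in range(n) for j in range(n))
    return total / n**2

def var_of_grand_mean(k, n, rho):
    """Variance of the average of k study means, ASSUMING independent studies."""
    return var_of_mean(n, rho) / k

rho = 0.01
for n in (10, 100, 10_000, 1_000_000):
    naive_se = math.sqrt(1.0 / n)
    actual_se = math.sqrt(var_of_mean(n, rho))
    print(f"n={n:>9,}  naive SE={naive_se:.4f}  SE with rho=0.01: {actual_se:.4f}")

# Same total sample size (1000), split as one study versus ten studies.
print("one study of 1000:", round(math.sqrt(var_of_mean(1000, rho)), 4))
print("ten studies of 100:", round(math.sqrt(var_of_grand_mean(10, 100, rho)), 4))
```

The naive SE keeps shrinking with $n$, while the SE under correlation never drops below 0.1; splitting the same observations across independent studies divides the $\rho$ term by the number of studies.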

Statistical and Economic Thinking for Replicability

Statistical thinking: Statistics = "quantitative epistemology"
Statistics = the science that creates protocols for the acquisition of qualified knowledge.
- Absence of protocols is damaging.
- Important distinctions made today: replicability vs. reproducibility; empirical, computational, statistical

Economic thinking: Research = "economic system"
To solve the replicability problem, we must set incentives right.

Points of attack:
- Economic incentives: journals and their policies
- Statistical protocols: researchers and their protocols

Two Types of Reform: (1) Economics → Journals

Journals: Stop the chase of "breakthrough science". Publish, solicit, and treat favorably:
- replicated results,
- negative outcomes.

Insidious:
- Researchers will self-censor if journals treat replicated results and negative outcomes even slightly less favorably.
- Researchers lose interest as soon as negative outcomes are apparent.

Ideal protocol: Journals should accept/reject NOT knowing outcomes (Young & Karr, Significance Mag. 2011). Accept/reject based on:
- merit and interest of the research problem,
- study design,
- quality of researchers.

Goal: No outcome-based deselection and a share of replication.

Two Types of Reform: (2) Statistics → Researchers

Researchers: Account for all data-analytic activity.
- Reveal all exploratory data analysis, in particular visualizations.
- Reveal all model searching (lasso, forward/backward/all-subsets, Bayesian, ...; CV, AIC, BIC, RIC, ...).
- Reveal all model diagnostics and actions resulting from them.
- Attempt inference that accounts for all of the above.

Principle: Any data-analytic action that could result in a different outcome in another dataset needs to be accounted for.

Goal: "Whole-Data-Analysis inference"
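Why unreported searching matters can be seen in a deliberately simplified simulation. It is an illustration of the principle only, not the PoSI method and not any procedure from the talk: it assumes $k$ independent null z-statistics (as with orthogonal, truly irrelevant predictors), selects the most extreme one, and reports its p-value as if no search had happened. Function names and settings are hypothetical.

```python
import math
import random

def naive_p_after_selection(k, rng):
    """Two-sided p-value of the most extreme of k null z-statistics,
    computed as if only that single test had been run."""
    best = max(abs(rng.gauss(0.0, 1.0)) for _ in range(k))
    return math.erfc(best / math.sqrt(2.0))

def false_positive_rate(k, n_sim=20000, alpha=0.05, seed=1):
    """Share of simulated analyses that declare significance when nothing is there."""
    rng = random.Random(seed)
    hits = sum(naive_p_after_selection(k, rng) < alpha for _ in range(n_sim))
    return hits / n_sim

for k in (1, 5, 10, 20):
    simulated = false_positive_rate(k)
    theoretical = 1.0 - 0.95**k  # valid under the independence assumption
    print(f"k={k:>2}  simulated={simulated:.3f}  1 - 0.95^k = {theoretical:.3f}")
```

With a single pre-specified test the error rate stays near 5%, but it grows quickly with the number of candidates searched, which is why the search itself has to enter the inference.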

Some Attempts

Post-selection inference:
- Did you ever write a contract with yourself to try just one selection method?
- PoSI: Inference that is inferentially insured against all attempts at model selection, including significance hunting (a form of p-hacking).
  Berk et al., "Valid Post-Selection Inference," AoS, 2013

Inference for data visualization: a beginning
- Principle: Plot synthetic data and compare with the actual data.
- Sources of synthetic data: permutations for independence tests, parametric bootstrap for model diagnostics, sampling conditional on sufficient statistics, ...
- Line-up protocol: insert the actual plot among 19 synthetic plots ⇒ 5% significance
  Buja et al., "Statistical Inference for Exploratory Data Analysis and Model Diagnostics," Philosophical Transactions of the Royal Society A, 2009
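The line-up protocol can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses permutation of one variable as the source of synthetic data (the independence-test case named above), works on made-up data, and returns the panel data rather than drawing plots. The names make_lineup and lineup_p_value are hypothetical.

```python
import random

def make_lineup(x, y, n_null=19, seed=None):
    """Return (panels, true_position): n_null permuted datasets plus the actual one,
    with the actual data hidden at a random position."""
    rng = random.Random(seed)
    panels = []
    for _ in range(n_null):
        y_perm = list(y)
        rng.shuffle(y_perm)  # permuting y breaks any association with x
        panels.append((list(x), y_perm))
    true_position = rng.randrange(n_null + 1)
    panels.insert(true_position, (list(x), list(y)))
    return panels, true_position

def lineup_p_value(n_null=19):
    """Chance of picking the actual plot when it is indistinguishable from the nulls."""
    return 1.0 / (n_null + 1)

# Made-up data with a linear association, for illustration only.
data_rng = random.Random(42)
x = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
y = [0.5 * xi + data_rng.gauss(0.0, 1.0) for xi in x]

panels, true_position = make_lineup(x, y, seed=7)
print("number of panels:", len(panels))
print("significance level if the viewer finds the actual data:", lineup_p_value())
```

In use, each panel would be drawn as an unlabeled plot and shown to a viewer who has not seen the data; true_position is revealed only after the viewer has made a pick.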

Line-Up: 5% significance if you find the actual data

[Figure: line-up of plots; axis labels: Housing Cost, Climate Rating]

Summary

- P-values: variability and bias
- Institutional Reforms (1): Outcome-blind policies for journals
- Institutional Reforms (2): Whole-data-analysis protocols for researchers
- To achieve replicability, replicate.

THANKS!
