Research Reproducibility


Research Reproducibility in Computational Social Science
Aek Palakorn Achananuparp, SMU
Research Integrity Conference 2018, Singapore

INTRODUCTION & DEFINITIONS

COMPUTATIONAL SOCIAL SCIENCE (CSS)
- First coined by Lazer et al. (2009) in their Science article
- Modeling human activity, behavior, and relationships through the use of computational methods and large-scale data (thousands to billions of data points)
Image source: Designed by Itakod / Freepik

DATA SOURCES
“DIGITAL TRACES”

COMMON STUDY TOPICS
- Predicting friendships in social networks
- Modeling information diffusion processes
- Predicting electoral outcomes
- Modeling human activity in offline settings
- Recommending books, papers, articles, movies, songs, etc.

WHAT DOES REPRODUCIBILITY MEAN?
CONCEPT            TEAM         EXPERIMENT
Reproducibility    Different    Different
Source: ACM

NON-COMPUTATIONAL VS. COMPUTATIONAL RESEARCH
In non-computational research:
- Replicability / reproducibility: different groups can obtain the same result independently by following the original study’s methodology.
In computational research:
- Replicability: different groups can obtain the same result using the original study's artifacts (datasets, code, and workflows).
- Reproducibility: different groups can obtain the same result using independently developed artifacts.
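As a minimal sketch of what a replicability check on original artifacts can look like, the Python snippet below reruns a toy analysis on the same data with the same random seed and confirms that the result is identical. The dataset, the `run_analysis` function, and the seed value are hypothetical stand-ins, not artifacts from any study cited in this talk.

```python
import random
import statistics


def run_analysis(data, seed):
    """Toy analysis: bootstrap estimate of the mean, driven by a fixed seed."""
    rng = random.Random(seed)  # local generator, so global random state cannot interfere
    estimates = []
    for _ in range(200):
        # Resample the data with replacement and record the resample mean.
        resample = [rng.choice(data) for _ in data]
        estimates.append(statistics.fmean(resample))
    return statistics.fmean(estimates)


# Hypothetical stand-in for a shared dataset.
data = [3, 5, 8, 13, 21, 34]

original = run_analysis(data, seed=2018)
rerun = run_analysis(data, seed=2018)

# Same data, same code, same seed: the rerun should match exactly.
print(original == rerun)
```

Publishing the seed alongside the code and data is one way to let another group obtain the same number; a reproducibility check in the sense above would instead reimplement the analysis independently and compare conclusions.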

COMPUTATIONAL REPRODUCIBILITY
We’ll mostly focus on replication and reproduction of computational research, i.e., computational reproducibility, in CSS.

REPRODUCIBILITY CRISIS IN CSS?

REPRODUCIBILITY CRISIS IN CSS
- For electoral prediction studies using Twitter data, an independent group was not able to reproduce their positive results (Gayo-Avello et al. 2011).
- 61% of 21 social science studies published in Nature and Science can be reproduced (Camerer et al. 2018).
- For 54% of 601 studies published at major computational research conferences, an independent group was able to build the code or the authors stated the code would build with some effort (Collberg et al. 2014).
- Out of 400 artificial intelligence papers, 6% provide code for the papers’ algorithm, 30% provide test data, and 54% provide pseudocode (Hutson, 2018).

REPRODUCIBILITY CHALLENGES IN CSS

TECHNOLOGICAL IRREPRODUCIBILITY
- Some code and datasets require high-performance or esoteric systems to run.
- Different tools, platforms, and versions may produce different results.
- Some software dependencies are no longer available.
- Is it still possible to run the original artifacts a few years later?
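One modest way to document the tools, platforms, and versions behind a result is to record them next to the output. The sketch below uses only the Python standard library; the `environment_snapshot` helper and the package names passed to it are illustrative assumptions, and a snapshot like this describes the environment without guaranteeing the dependencies stay available.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Collect interpreter, OS, and package versions for a reproducibility log."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "packages": versions,
    }


# The package names here are examples only; list whatever the analysis imports.
snapshot = environment_snapshot(["numpy", "pandas"])
print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Saving this JSON with each set of results gives a later reader a starting point for rebuilding a comparable environment.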

DATA PRIVACY & LEGAL LIMITATIONS
- Data privacy is going to be more critical than before after the Cambridge Analytica fiasco.
- Collecting and sharing online social media data is becoming more difficult.
- Data ownership is not always clear-cut.
- Intellectual property restrictions can prevent code sharing.

EXPERIMENTAL IRREPRODUCIBILITY
- Complex social systems are extremely difficult to study.
- The state of the world today is irreversibly different from what it was when the original experiments were conducted.
- Some external influences, e.g., media exposure, are almost impossible to control.

ENABLING REPRODUCIBLE RESEARCH

ENABLING REPRODUCIBLE RESEARCH
Open Research/Data Platforms
- Open Science Framework
- CodaLab
- ReScience
- Jupyter Notebooks

ENABLING REPRODUCIBLE RESEARCH
Open Data Repositories
- Microsoft Research Open Data
- Stanford Network Analysis Project (SNAP)
- UCI Machine Learning Repository
- GroupLens
- LARC Data Repository
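When a dataset is shared through a repository, a checksum lets others confirm that the copy they downloaded is byte-for-byte the one used in the study. The sketch below is an illustration only: the file name and contents are made up, and it does not describe a feature of any repository listed above.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a temporary file standing in for a shared dataset (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "edges.csv")
    with open(path, "wb") as handle:
        handle.write(b"source,target\n1,2\n2,3\n")

    published = sha256_of_file(path)             # digest the authors would publish
    matches = sha256_of_file(path) == published  # check a replicator would run

print(matches)
```

Publishing the digest next to the download link is a small step, but it removes one source of silent divergence between the original and a replication.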


SAGAN STANDARD, UPDATED
“Extraordinary claims require extraordinary evidence and extraordinary transparency.”

Aek Palakorn Achananuparp
palakorna@smu.edu.sg
@aekpalakorn

REFERENCES
Artifact Review and Badging, ACM. -review-badging.
Butler, D. (2013) When Google got flu wrong. Nature.
Camerer et al. (2018) Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour 2.
Collberg et al. (2014) Measuring Reproducibility in Computer Systems Research. University of Arizona Technical Report 14-04.
Gayo-Avello et al. (2011) Limits of Electoral Predictions Using Twitter. In Proc. of ICWSM ‘11.
Goodman et al. (2016) What does research reproducibility mean? Science Translational Medicine.
Hutson, M. (2018) Missing data hinder replication of artificial intelligence studies. ce-studies
Lazer et al. (2014) The Parable of Google Flu: Traps in Big Data Analysis. Science.
Pentland, A. (2012) Big Data’s Biggest Obstacles. Harvard Business Review.
Reproducibility in Machine Learning Workshop, ICML bility-workshop/home
Stodden, V. (2013) Resolving Irreproducibility in Empirical and Computational Research. IMS Bulletin Online.
Stodden et al. (2016) Enhancing reproducibility for computational methods. Science, 354(6317).
