If You Had 100,000 People To Help You With Your Research


If you had 100,000 people to help you with your research, what would you do?

What is Open Science?
- Open Access to articles and lab notebooks
- Open Data
- Open Source Code
- Open Collaboration (e.g., citizen science)
- Open Technology (e.g., Makers)
- Open Funding
Source: OECD (2015), “Making Open Science a Reality”, OECD Science, Technology and Industry Policy Papers, No. 25, OECD Publishing, Paris. http://dx.doi.org/10.1787/5jrs2f963zs1-en

Publishing Your Code Open Source: Science Code Manifesto
- Code: All source code written specifically to process data for a published paper must be available to the reviewers and readers of the paper.
- Copyright: The copyright ownership and license of any released source code must be clearly stated.
- Citation: Researchers who use or adapt science source code in their research must credit the code’s creators in resulting publications.
- Credit: Software contributions must be included in systems of scientific assessment, credit, and recognition.
- Curation: Source code must remain available, linked to related materials, for the useful lifetime of the publication.
Source: Sciencecodemanifesto.org

Example: Reproducible Research Workflow for Analysis of Human Microbiome Data
- “After obtaining amplicon sequence, a standard series of bioinformatic and statistical analysis are used to evaluate the data: filtering out low quality sequences and samples, constructing a taxonomic feature table of observations from each sample, incorporating the sample metadata, transforming and normalizing the feature table, and performing exploratory and inferential analyses.”
- Through a bioinformatic forensics of Arumugam et al. (2011), published in Nature, the authors estimated “more than 200 million possible ways of analyzing these data.” (p. 185)
- Methods used to filter sequences, construct taxonomic features, and perform analysis “are often performed in separate environments, making creation of a single coherent record of the analysis difficult and time consuming.” (p. 186)

Example: Reproducible Research Workflow for Analysis of Human Microbiome Data
- Researchers need to “adopt pipelines documenting choices used in these analysis with the intention of providing an assessment of the robustness and reproducibility of the analysis.” (e.g., LaTeX, R Markdown, Jupyter Notebooks)
- The authors advocate platforms like GitHub for depositing data, associated materials (e.g., complete metadata mapping files, taxonomy files, reference sequence files), and code.
- “Publish reproducible workflows encompassing the entirety of the analysis” (e.g., a unified R script and a unified Rdata data object)
Source: Callahan et al. 2016. Reproducible Research Workflow in R for the Analysis of Personalized Human Microbiome Data, Pacific Symposium on Biocomputing 2016: https://www.ncbi.nlm.nih.gov/pubmed/26776185
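To make the idea of a single coherent record concrete, here is a minimal Python sketch (the paper itself works in R). It chains two of the steps named above, filtering out low-read samples and normalizing the feature table, and logs every choice as it goes. All names here (AnalysisRecord, filter_samples, normalize, run_workflow, the min_reads threshold) are hypothetical illustrations, not taken from Callahan et al.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisRecord:
    """Collects each analysis choice so the run can be reported and repeated."""
    choices: dict = field(default_factory=dict)

    def note(self, step, **params):
        self.choices[step] = params


def filter_samples(table, min_reads, record):
    """Drop samples whose total read count is below min_reads."""
    if min_reads < 1:
        raise ValueError("min_reads must be at least 1")
    record.note("filter_samples", min_reads=min_reads)
    return {
        sample: counts
        for sample, counts in table.items()
        if sum(counts.values()) >= min_reads
    }


def normalize(table, record):
    """Convert raw counts to per-sample relative abundances."""
    record.note("normalize", method="relative_abundance")
    normalized = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        normalized[sample] = {taxon: n / total for taxon, n in counts.items()}
    return normalized


def run_workflow(table, min_reads=100):
    """Run every step in one place and return the result plus the choices made."""
    record = AnalysisRecord()
    kept = filter_samples(table, min_reads, record)
    return normalize(kept, record), record.choices
```

Calling run_workflow on a dictionary of per-sample taxon counts returns both the normalized table and the dictionary of recorded choices, which could be published alongside the data and code.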

Accepting Open Source Contributions
- Shift from low-value work to high-value work
- Facilitates rapid prototyping and experimentation
- Advances interoperability between tools
- Lower total cost of ownership
- Detect, diagnose, triage, and resolve bugs faster
- Increases reliability & security through peer review
- Shift workflows from waterfall to agile and lock-free
- Encourage more modular code
- Reduce duplication of effort
- Attract talented developers
Sources: Ben Balter, GitHub, 18F Open Source Policy ter/policy.md

Example: Google’s TensorFlow
TensorFlow is a research-grade, open-source software library for dataflow programming across a range of tasks, including machine learning applications.
- Google is exposed to emerging academic research through the open source community.
- Google engineers design the interface, test and validate code that is introduced, and sync the internal and external versions.
- Beyond TensorFlow, Google uses a large number of open source libraries in its production code, increasing the economic impact.
- Cloud computing for big data and runnable instances of code!
Source: David Konerding, Google Open Science team

TensorFlow Code of Conduct
“In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and w/tensorflow/blob/master/CODE OF CONDUCT.md
“Project maintainers are responsible for clarifying standards of acceptable behavior and are expected to take appropriate and fair corrective action” as necessary.

What is Citizen Science?
Contributions of the public to the advancement of scientific and engineering research and monitoring in ways that may include:
- Identifying research questions
- Designing/conducting investigations
- Designing/building/testing low cost sensors
- Collecting and analyzing data
- Developing data applications
- Developing technologies for science
- Solving complex problems

Ways to involve volunteers in scientific research, engineering, and monitoring:
- Samples/specimen
- Geo-location
- Measurement
- Sample identification
- Data analysis
- Data collection
- Defining research questions
- Data processing
- Problem solving
- Coding
- Transcribing data
- Image analysis
- Annotate text
- Data entry
- Classification or tagging

NASA ROSES-2016 A.47 CADET
The Citizen science Asteroid Data, Education, and Tools (CADET) is a joint solicitation of the Near Earth Objects (NEO) Program within NASA’s Science Mission Directorate (SMD) and the AGC program within NASA’s Office of the Chief Technologist (OCT). It seeks innovative proposals to adapt, develop, and web-enable software tools for asteroid data analysis and to make them open source, web-accessible and easily usable by non-professionals, including amateur astronomers, students, and citizen scientists.

NASA ROSES-2016 A.47 CADET
The specific goals of the CADET program are to:
- Through agile development and other innovative methods, adapt, further develop and web-enable asteroid data analysis software to increase the productivity of NEO and AGC research endeavors and extend the state-of-the-practice in those endeavors;
- Develop easily usable and understandable software tools through the application of human-centered design best practices, including user research studies, systematic usability testing, and evaluation;
- Integrate advances in information technology with advances in cyberlearning (i.e., what is known about how people learn with technology), and integrate these software tools into learning environments so their potential is fulfilled;
- Foster multi-disciplinary collaborations that span the NASA science, computer science, design, and education disciplines.

NASA ROSES-2016 A.47 CADET
- Astrometry: Solve an image for the positions of the stars and moving objects within it. Allow batch processing of images.
- Photometry: Provide relative photometry for the moving objects within an image using reference stars from standard catalogs. Allow for batch processing of images.
- Light Curve Analysis: Provide a virtual workbench for the user to create asteroid light curves from the derived photometric values and determine the spin period of the asteroid.
  - The software must account for light-time correction and the phase angle effect on brightness.
  - The software must produce an image of the folded light curve that is suitable for publication in a scientific journal.
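As an illustration of the light curve requirements above, the following Python sketch applies a light-time correction and folds the corrected times on a trial spin period. It is a simplified sketch under stated assumptions, not the CADET software: the function names are hypothetical, times are assumed to be in days and distances in astronomical units, and the phase angle effect on brightness is deliberately left out.

```python
# Light travel time across 1 au, in days (499.004784 s / 86400 s per day).
LIGHT_DAYS_PER_AU = 499.004784 / 86400.0


def light_time_corrected(t_obs_days, distance_au):
    """Shift an observation time back to when the light left the asteroid."""
    return t_obs_days - distance_au * LIGHT_DAYS_PER_AU


def fold(times_days, period_days, epoch_days=0.0):
    """Map each time to a rotational phase in the range [0, 1)."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return [((t - epoch_days) / period_days) % 1.0 for t in times_days]


def folded_light_curve(observations, period_days, epoch_days=0.0):
    """Fold a list of (time_days, distance_au, magnitude) observations.

    Returns (phase, magnitude) pairs sorted by phase. The phase angle
    effect on brightness is NOT corrected in this sketch.
    """
    corrected = [light_time_corrected(t, d) for t, d, _ in observations]
    phases = fold(corrected, period_days, epoch_days)
    magnitudes = [m for _, _, m in observations]
    return sorted(zip(phases, magnitudes))
```

A spin period search would call folded_light_curve repeatedly with different trial periods and keep the period that gives the tightest folded curve; plotting the returned pairs gives the folded light curve image.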

NASA ROSES-2016 A.47 CADET
- Because of the rapidly changing computing and computation technology environment, awards resulting from this call are limited to a performance period of 24 months or less.
- Proposals must define clear, measurable milestones.
- The tools are expected to be available for open use as web-based applications and as open source code.
- Plans for migration into a persistent software framework or infrastructure for ongoing maintenance and user support should be identified. Proposals should discuss plans for this, even if the project is not expected to reach that level of maturity in the term of the award.
- Investigators are required to archive the source code, and all relevant documentation, at NASA’s GitHub site: https://github.com/nasa

NASA ROSES-2016 A.47 CADET
- Agile: Proposals must include an agile software development and testing plan to describe the software engineering practice to be used in the project.
- Human Centered Design: Proposals also must address how they will develop and document user personas (e.g., high school and college students, amateur astronomers, and professional astronomers), engage end users in the iterative testing and evaluation of the software tool, and how they will meet staffing expertise, as appropriate, including user experience (UX) design, user interface (UI) development, and web application development. The proposal will be considered unresponsive without this plan.

NASA ROSES-2016 A.47 CADET
The use of Apache License 2.0 is required. Any proposal responding to this solicitation to adapt, develop, and web-enable the asteroid data analysis software is required to deliver to NASA a copy of such software with sufficient rights for use as Open Source Software under this Apache License 2.0. Therefore, each proposal shall:
- Identify any proprietary software, software owned by a non-Federal entity, or open source software that is incorporated into the software being proposed;
- Indicate whether a license has been obtained in situations where proprietary software, software owned by a non-Federal entity, or open source software has been incorporated into the software that is the subject of the proposal and attach a copy of the license to the proposal, along with evidence of permission obtained from the software owner to release improvements or derivative works to the software as Open Source under the Apache License, Version 2.0. NASA will evaluate proposals for compliance with the above open source software requirements.
- A proposal that does not include documentation sufficient to satisfy NASA that the developed software will be open source may not be selected.

NSF Open Source Licensing Grant Requirements
- The NSF Future Internet Architecture-Next Phase (FIA-NP) program required open source licensing pursuant to the Open Source Initiative: m.
- The NSF Secure and Trustworthy Cyberspace (SaTC) program states that PIs who choose to open source their software should employ a license listed by the Open Source Initiative: m.
- “Any software developed as part of this program is required to be released under an open source license listed by the Open Source Initiative (http://www.opensource.org/).”

Open Source Management
- Update and expand NASA’s open source directory
- Upload data & source code (e.g., NASA’s GitHub, Bitbucket)
- Manage and publish reproducible workflows encompassing the entire analysis (e.g., R, DI2E Jira)
- Provide templates for reuse of code
- Provide open documentation (e.g., bibliography of code library)
- Don’t bake passwords and sensitive info into code in a public repo.
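One common way to follow the last point above is to read secrets from the environment at run time, so the public repository only ever contains the name of the variable. A minimal Python sketch, where the helper get_secret and the variable name in the usage note are hypothetical:

```python
import os


def get_secret(name, environ=None):
    """Return a secret from the environment instead of from source code.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to test without touching the real environment.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable; do not hard-code it."
        )
    return value
```

For example, code that needs an API token would call get_secret("EXAMPLE_API_TOKEN") and fail with a clear message if the variable is not set, rather than falling back to a value committed to the repository.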

Open Source Management
- “Free like a puppy!”: Build and nurture the open source community through code-a-thons, online forums, and regular engagement activities (e.g., Slack, Gitter, Discord)
- Provide guidelines for contributions (e.g., pull request policy)
- Establish a code of conduct
- Encourage “peer programming” or “mob programming”
- Make code modular and clean up as you go
- Provide open issue tracking (e.g., bugs, feature requests)
- Provide open road mapping (workflows, timelines)

Provide Guidance
Source: Astrokit.org EADME.md

Additional Examples
- HydroShare: NSF Collaborative Research: SI2-SSI: An Interactive Software Infrastructure for Sustaining Collaborative Community Innovation in the showAward?AWD ID 1148453 and https://www.hydroshare.org (open source)
- SciDas: NSF CC*Data: National Cyberinfrastructure for Scientific Data Analysis at D ID 1659300 and -at-scalescidas/ (open source)
- iRODS Open Source Data Management Software: https://irods.org

Additional Examples
- The NIH Data Commons aims to accelerate new biomedical discoveries by providing a cloud-based platform where investigators can store, share, access, and compute on digital objects (data, software, etc.) generated from biomedical research and perform novel scientific research including hypothesis generation, discovery, and validation. It is expected that awardees will participate collectively as a consortium and work cooperatively toward achieving NIH’s comprehensive vision for an interoperable, FAIR (Findable, Accessible, Interoperable and Reusable) compliant, multi-cloud NIH Data Commons founded on open source and open standards. The Commons will be designed to comply with the principles of making digital objects FAIR.
https://commonfund.nih.gov/bd2k/commons RM-17026 CommonsPilotPhase.pdf

Additional Resources
- Consumer Financial Protection Bureau GitHub: https://cfpb.github.io
- DOD Open Source Software FAQ: #OSS and DoD Policy
- GSA 18F posted an excellent roundup by Will Slack and Britta Gustafson that gathers citations and -publishing-opensource-code-in-government/
- GSA 18F also hosted a guest post by DHS that shows a concrete example of open source code leading to agencies saving on work, fixing each other's bugs, and generally just developing a great strategic collaboration: ation-across-agencies-to-improve-httpsdeployment/

Additional Resources
- Open Source Licensing by Lawrence Rosen with foreword by Prof. Lessig: http://www.rosenlaw.com/oslbook.htm
- Code 2.0 by Lessig: http://codev2.cc/download remix/LessigCodev2.pdf
- Cathedral and the Bazaar: http://www.catb.org/ esr/writings/cathedral-bazaar/
- Choosing an open source our-development-project/
- Choose an open source license (GitHub): https://choosealicense.com

Dr. Lea Shanley
co-Executive Director, NSF South Big Data Innovation Hub
LShanley@renci.org
@Lea Shanley

The South Big Data Hub is funded in part through a grant from the NSF.

Thank you to: Pamela Gay, NASA CADET and COSMOQUEST PI; Ian Webster, NASA CADET PI Astrokit.Org / CTO, Zenysis Technologies; Sarah Allen; Sam Gensburg; Bob Balance; Ben Balter, GitHub; Ray Idaszak, RENCI; Jacqueline Kazil, PIF/PyLadies; Eric Mill, 18F; Jennifer Hammock, Smithsonian; Mike Byrne, CFPB; James Miller, FCC

