Max-Planck-Institut für molekulare Genetik: Integrative analysis of NGS data


Max-Planck-Institut für molekulare Genetik
Integrative analysis of NGS data
Alena van Bömmel (Alena.vanBoemmel@molgen.mpg.de, R 3.3.8)
Wolfgang Kopp (kopp@molgen.mpg.de, R 3.3.18)
Max Planck Institute for Molecular Genetics
Software Praktikum, 13.03.2017

Biological background

Gene expression
[Figure; labels: Gene X, RNA]

DNA

Gene regulation by TFs


DNA packaging

Nucleosome and histones

Histone modifications
Lawrence et al., Trends in Genetics 2016

Experimental assays

ChIP-seq
Map reads to the genome

ChIP-seq (2)
Pros:
- Direct measure of genome-wide protein-DNA interaction
Cons:
- Don't know whether binding causes changes in gene expression
- Need an antibody against your protein of interest
- Expensive

Sequencing data
- Raw data: reads, usually a very large file (a few GB)
- Format: fastq (ENCODE) or SRA (Sequence Read Archive of NCBI)
Analysis
1) Quality control with fastqc, ...
2) Mapping of the reads to the reference genome (bwa or Bowtie)
3) Visualizing the genomic regions (deepTools, IGV)
4) Peak calling (MACS2)
[Figure: Example of fastq data file]
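Since the fastq example on this slide survives only as a figure placeholder, here is a minimal sketch of how such a file can be read. It assumes the common four-lines-per-record layout and Phred+33 quality encoding. The names `FastqRecord`, `read_fastq` and `mean_phred` and the two toy reads are made up for illustration. It does not replace fastqc.

```python
from io import StringIO
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a fastq stream with four lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) ends the iteration
            return
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed fastq record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean base quality, assuming Phred+33 encoding by default."""
    if not quality:
        raise ValueError("empty quality string")
    return sum(ord(char) - offset for char in quality) / len(quality)


# Toy input (invented): two 4-bp reads.
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n"
records = list(read_fastq(StringIO(EXAMPLE)))
for record in records:
    print(record.name, record.sequence, mean_phred(record.quality))
```

For real data, an open file handle (or `gzip.open(path, "rt")` for compressed files) can be passed in place of the `StringIO` object.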


RNA-seq data
- Raw data: reads, usually a very large file (a few GB)
- Format: fastq (ENCODE) or SRA (Sequence Read Archive of NCBI)
Analysis
1) Quality control with fastqc
2) Mapping of the reads to the reference genome (tophat2)
3) Visualizing the genomic regions (IGV)
4) Gene expression levels (in FPKM using Cufflinks)
[Figure: Example of fastq data file]

Tasks

Tasks
- Analysis of TF binding across the genome (TAF1, JUND)
- Analysis of histone modifications across the genome (H3K4me3, H3K4me1, H3K27ac)
- Cell types: K562, GM12878 and H1-hESC (one per group)
- From the ENCODE project (see papers): genome.ucsc.edu/ENCODE or https://www.encodeproject.org/

Group
Each group should work on a different cell type
- Group 1: K562
- Group 2: GM12878
- Group 3: H1-hESC

Literature survey
- What are TAF1, H3K4me3, H3K4me1, H3K27ac and JUND?
- Where does one find those marks or proteins in the genome?
- Do they bind to promoters and/or enhancers?
- What are their roles in gene regulation?
- Are there known motifs associated with the TFs (e.g. Jaspar)?
- What is the role of high and low CpG promoters?
- Where can you find the dataset? Specify the exact source and name of the file/experiment (including RNA-seq for your cell line).
- Find publications that address those points
- Use Google and/or scholar.google.com
- Due by next Monday

Preliminary analysis steps (ChIP-seq)
- Download ChIP-seq raw reads (fastq/fq) for TAF1, JUND, H3K4me1, H3K4me3 and H3K27ac
- Also download the corresponding Input (control) experiments
- Align the ChIP-seq reads to hg19 with bowtie2
- Check the ChIP-seq quality
  - Using fastqc and phantompeakqualtools (only for ChIP-seq. Hint: Are NSC and RSC acceptable?)
  - Is the quality sufficient? Why or why not?
- Call peaks for all experiments with macs2
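The alignment and peak-calling steps above could be scripted roughly as follows. This is only a sketch: the sample name, file names and the `hg19` index prefix are hypothetical, and the reads are assumed to be single-end. The code only assembles the command lines; check every option against the tool manuals before running anything.

```python
import shlex
from typing import List


def chipseq_commands(sample: str, fastq: str, control_bam: str,
                     index_prefix: str = "hg19") -> List[List[str]]:
    """Build command lines for alignment, sorting, indexing and peak calling.

    All file names are placeholders chosen for this example.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        # Align single-end reads against a prebuilt bowtie2 index.
        ["bowtie2", "-x", index_prefix, "-U", fastq, "-S", sam],
        # Convert SAM to a coordinate-sorted BAM file and index it.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Call peaks against the Input control; "hs" is the human genome size preset.
        ["macs2", "callpeak", "-t", bam, "-c", control_bam,
         "-f", "BAM", "-g", "hs", "-n", sample],
    ]


commands = chipseq_commands("TAF1_K562", "TAF1_K562.fastq", "Input_K562.sorted.bam")
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the tools are installed. Keeping the commands as lists avoids shell-quoting problems with unusual file names.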

Preliminary analysis steps (RNA-seq)
- Download RNA-seq reads (fastq)
- Align the RNA-seq reads to hg19 with tophat2
  - If paired-end, there must be two fastq files
- Check the RNA-seq quality
  - Using fastqc
  - Is the quality sufficient? Why or why not?
- Compute FPKM expression values with cufflinks
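To make the FPKM unit concrete, the sketch below applies its plain definition: fragments per kilobase of transcript per million mapped fragments. Cufflinks estimates transcript abundances with a more elaborate statistical model, so this is not its algorithm. The function name and the numbers in the example are invented.

```python
def fpkm(fragments: float, length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Invented example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2_000, 20_000_000)
print(example_value)  # 12.5
```

Because of the length and library-size normalisation, FPKM values can be compared between genes of different lengths within one sample more readily than raw counts.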

Genomic features and overlap analysis
- Do the peaks overlap (for different marks and proteins)?
  - Bedtools or R/Bioconductor: GenomicRanges
  - Draw a Venn diagram
- Share the peak regions with the other groups
  - What is the overlap with the other groups?
- Which genomic features do they overlap with?
  - Intergenic, gene body, promoters, exons, introns, etc.
  - Generate a heatmap centered at the peak summit (with deepTools)
  - Generate a profile aligned at the TSS (with deepTools)
  - Interpret the results
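The overlap question can be illustrated without any external tool. The sketch below assumes BED-style half-open intervals (chromosome, start, end) and counts how many peaks of one set overlap at least one peak of another, which gives the numbers for a two-set Venn diagram. The peak coordinates are invented. For real peak files, bedtools or GenomicRanges will be much faster.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open like BED


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_overlapping(query: Iterable[Peak], subject: Iterable[Peak]) -> int:
    """Number of query peaks that overlap at least one subject peak."""
    by_chrom: Dict[str, List[Peak]] = defaultdict(list)
    for peak in subject:
        by_chrom[peak[0]].append(peak)
    # Brute-force comparison per chromosome; fine for a sketch, slow for real data.
    return sum(
        1 for q in query if any(overlaps(q, s) for s in by_chrom[q[0]])
    )


# Invented toy peaks.
taf1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
h3k4me3 = [("chr1", 150, 400), ("chr2", 150, 300), ("chr3", 10, 20)]

taf1_shared = count_overlapping(taf1, h3k4me3)
h3k4me3_shared = count_overlapping(h3k4me3, taf1)
print("TAF1 only:", len(taf1) - taf1_shared)
print("TAF1 overlapping H3K4me3:", taf1_shared)
print("H3K4me3 only:", len(h3k4me3) - h3k4me3_shared)
```

Note that the two shared counts need not be equal, because one broad peak can overlap several narrow ones. State which set is being counted when drawing the Venn diagram.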

Sequence analysis
- Extract the sequences from the peak regions
  - Using R/Bioconductor or bedtools
- Analyse motifs in the sequences
  - Using MEME-ChIP
  - Which motifs do you find? Interpret the results
- Do the TAF1 peaks overlap with promoters? Are these high or low CpG promoters? (Hint: analyse dinucleotide frequency)
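For the dinucleotide hint, one possible starting point is sketched below. It computes dinucleotide frequencies and a CpG observed/expected ratio using the widely used Gardiner-Garden and Frommer form, (number of CpG x sequence length) / (number of C x number of G). This is an assumption about which statistic to use. Saxonov et al. define their normalised CpG content slightly differently, so check the paper before choosing a cutoff. The test sequence is invented.

```python
from collections import Counter
from typing import Dict


def dinucleotide_frequencies(sequence: str) -> Dict[str, float]:
    """Relative frequency of every overlapping dinucleotide in the sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + 2] for i in range(len(sequence) - 1))
    total = sum(counts.values())
    return {dinucleotide: n / total for dinucleotide, n in counts.items()}


def cpg_observed_expected(sequence: str) -> float:
    """CpG o/e ratio: (#CpG * length) / (#C * #G); 0.0 if C or G is absent."""
    sequence = sequence.upper()
    c_count = sequence.count("C")
    g_count = sequence.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = sequence.count("CG")  # "CG" cannot overlap itself
    return cpg_count * len(sequence) / (c_count * g_count)


# Invented toy sequence: 10 bp with three C, three G and three CpG.
toy = "ACGCGTCGAT"
frequencies = dinucleotide_frequencies(toy)
ratio = cpg_observed_expected(toy)
print(frequencies["CG"], ratio)
```

Applied to the promoter sequences under the TAF1 peaks, a histogram of this ratio would show whether the promoters fall into two classes.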

Gene expression analysis
- How do the peaks explain gene expression levels?
  - Correlation or linear regression
  - How well does the H3K4me3 level at a promoter explain gene expression?
  - How well does the TAF1 level at promoters predict gene expression?
  - How well does JUND predict gene expression?
  - How well do H3K27ac and H3K4me ...
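A minimal sketch of the correlation and regression idea, assuming one signal value per promoter (for example read coverage of a mark) and one FPKM value per gene. Both are log-transformed with a pseudocount of 1, which is a common but not mandatory choice. The function names and all numbers are invented toy data, not results.

```python
import math
from typing import List, Sequence, Tuple


def log2p1(values: Sequence[float]) -> List[float]:
    """log2(x + 1) transform; the pseudocount keeps zeros finite."""
    return [math.log2(v + 1) for v in values]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


# Invented toy values: promoter signal and FPKM for five genes.
signal = log2p1([0, 1, 3, 7, 15])
expression = log2p1([0, 3, 15, 63, 255])
r = pearson(signal, expression)
slope, intercept = linear_fit(signal, expression)
print(r, slope, intercept)
```

The squared correlation can be read as the fraction of variance in log expression explained by one mark. Comparing it across TAF1, JUND and the histone marks addresses the questions on this slide.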

Schedule
- 13.03. Introduction lecture
- 20.03. Presentation of the detailed plan of each group (literature survey, data file information, schedule): 10:15am, 11:00am, 11:45am
- Every Monday, 10:15am, 11:00am, 11:45am: progress meetings
- 27.04. Final report deadline
- 03.05. Discussion of final reports
- 08.05. Final presentations

Bioinformatics resources
READ THE MANUALS!
- Bowtie2 and bwa (to align ChIP-seq reads)
- Tophat2 (to align RNA-seq reads)
- Samtools (to convert SAM files to BAM files)
- Cufflinks (to determine gene expression levels)
- Bedtools (to analyse genomic regions, e.g. overlap, distance, extracting DNA sequences for some regions, finding the closest gene, ...)
- Fastqc (to analyse the ChIP-seq/RNA-seq quality)
- Phantompeakqualtools (to analyse ChIP-seq quality: cross-correlation plot, etc.)
- DeepTools (to plot average profiles and heatmaps)
- MEME-ChIP (to discover motifs)
- Bioconductor www.bioconductor.org/

Useful resources
- JASPAR
- IGV
- genome.ucsc.edu/ENCODE and www.encodeproject.org
- Google and ownloads.html
- https://www.gencodegenes.org/ (Gene annotations. Hint: hg19 corresponds to GRCh37)

Useful resources
- ENCODE papers (An integrated encyclopedia of DNA elements in the human genome, etc.)
- Bailey et al. Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data. PLoS Comput Biol (2013). (This explains some quality aspects of ChIP-seq data)
- Saxonov et al. A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters (2006).
- Any papers that explain TAF1, JUND, H3K4me3, H3K4me1, H3K27ac
- Any papers that explain the methods

Office hours
- Alena: Monday and Tuesday at 1:30 pm
- Wolfgang: Thursday and Friday at 9:30 am
