Sanger Vs Next-Gen Sequencing - University Of Nebraska Medical Center


Tools and Algorithms in Bioinformatics
GCBA815/MCGB815/BMI815, Fall 2017
Week-8: Next-Gen Sequencing
RNA-seq Data Analysis
Babu Guda, Ph.D.
Professor, Genetics, Cell Biology & Anatomy
Director, Bioinformatics and Systems Biology Core
University of Nebraska Medical Center
Fall, 2017, GCBA/MGCB/BMI 815

Sanger vs Next-Gen Sequencing
Source: https://www.google.com/url?sa i&rct j&q &esrc s&source images&cd &ved 0ahUKEwj356GajzWAhXEzFQKHZrlCh0QjRwIBw&url &psig AOvVaw3BHyDsG4jHY9z4Y3Jc11IY&ust 15079330652942891

Next-Gen Sequencing
No in vivo cloning
Source: ide1.jpg

Cost of Human Genome Sequencing
Source: 4/Screen-Shot-2017-04-24-at-11.40.38-AM.png

Next-Gen Sequencing Workflow
Source: Lu and Shen, 2016, Biochemistry, Genetics and Molecular Biology. DOI: 10.5772/61657

Applications of NGS
Genome
  - Whole genome sequencing
  - Whole exome sequencing
  - Targeted gene panels (cancer, newborns, autism, etc.)
Transcriptome
  - Whole RNA sequencing
  - mRNA transcriptome (poly-A selection)
  - Small RNA analysis (siRNA, snoRNA, lincRNA, etc.)
  - Gene expression profiling for selected target genes
Metagenome
  - Bulk sequencing of many types of bacteria
  - Examples: human gut microbiome, soil samples, food contamination, extremophiles, etc.
Epigenome
  - Chromatin Immunoprecipitation Sequencing (ChIP-Seq)
  - Methylation Sequencing (Methyl-Seq)

Different Sequencing Libraries
Source: http://slideplayer.com/7847747/25/images/7/Types of Sequencing Libraries.jpg

Paired-end Sequencing
Source: vs-singleread-seq-web-graphic.jpg

FASTQ Files from Paired-end Sequencing
Source: iplexing Mixed Samples
Source: re.gif

Different File Types in NGS analysis
  - FASTQ file: generated by the sequencer, contains NGS reads
  - SAM file: Sequence Alignment/Map (generated by aligning the NGS reads with the reference genome)
  - BAM file: binary version of the SAM file (SAMtools is used to manipulate SAM/BAM files)
  - GFF file: General Feature Format, used to hold genome annotation (chromosome, strand, frame, exon, CDS, etc.)
  - GTF file: Gene Transfer Format (contains all the information in GFF plus additional gene annotation information)
  - VCF file: Variant Call Format (used to store variant data such as SNPs, InDels, short structural rearrangements)

CACAAGAAGATCACTGGACTGCCCTCGCTCAGCCCTCAGCTACTG
? ?@ ? @@ ?@@ @@@@@? ?@?@?@A? @@@? @@?A@:@A@@A@@@A@@AAB@@BB
Row 1: Information from the sequencer about the location of this read on the plate
Row 2: The sequence
Row 3: Metadata provided by the sequencing team
Row 4: Quality scores pertaining to each nucleotide in the sequence
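The four-row layout described above can be read with a few lines of Python. This is only a minimal sketch, assuming every record occupies exactly four lines (no wrapped sequences); the names `FastqRecord` and `read_fastq` and the demo reads are hypothetical and not part of the course material.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str    # Row 1 without the leading "@"
    sequence: str   # Row 2: the bases
    separator: str  # Row 3: "+" optionally followed by metadata
    quality: str    # Row 4: one ASCII quality character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        header = header.rstrip("\n")
        if not header:  # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, separator, quality)


if __name__ == "__main__":
    # Hypothetical two-read file, used only for illustration.
    demo = io.StringIO("@read1 demo\nACGT\n+\nII5!\n@read2\nGG\n+\n##\n")
    for record in read_fastq(demo):
        print(record.read_id, record.sequence, record.quality)
```

For real data, an established parser is usually the safer choice; the sketch only makes the row structure explicit.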

FASTQ format:
FASTQ is based on the popular FASTA format for sequences.
FASTA format: sequence ID; header in one line
AGTTGTAGTCCGTGATAGTCGGATCGG
FASTQ format provides additional information that includes the quality
ACCTCAAGTGATCCGCCCGCCTCGGCCTCCCAACGTTTTGG
? @7 B ;;BB? B? 8539 6?6 8 BB B 08:9@5;:A@@?@9:BAAA ?;8;@AC@BBBBBA? 9-@B@;CAA77 :BEB BB@07?@ ?84
ASCII code for Quality score (Phred score, ranges from 0-50)
ASCII code for Quality score (in increasing order; ! is the worst and is the best)

Sequence Alignment / Map (SAM / BAM)
SRR098401.104031357 83 chr22 17445857 60 76M 17445512 CCTTTACAGATGAGAAGGCCGTCACGCCTC@@ B@@@BBAAAB9A@@ :@@? A@?@?@A? ?@? ?@@@@@ @ @@@ ?@ @ @@8? ? :@ ? JMKCKLINJMMLJKKKMOOMNNOLPQSNMKK PG:Z:MarkDuplicates MNONMMMMLMKKKMLGMNLNMMNNJMJLNOMLNMPNONONNMM NM:i:0 MQ:i:60 AS:i:76 XS:i:0
Similar to the FASTQ file in that it contains the raw sequence and its quality scores.
It also tells you where the sequence aligned to the genome, and how well (this score is also Phred-scaled).
In this case, this read aligned to chromosome 22, position 17445857, and has a mapping quality score of 60 (a 1 in 1,000,000 chance of being placed incorrectly).
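The leading SAM columns and the "1 in 1,000,000" figure can be checked with a short Python sketch. The record below is a shortened, hypothetical line that reuses only the read name, flag, chromosome, position and mapping quality from the slide; its CIGAR, template length, sequence and qualities are made up for illustration. The helper names are likewise assumptions, and only the first six mandatory tab-separated columns are parsed.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    qname: str  # read name
    flag: int   # bitwise flag
    rname: str  # reference sequence (chromosome)
    pos: int    # 1-based leftmost mapping position
    mapq: int   # Phred-scaled mapping quality
    cigar: str  # alignment description, e.g. 76M


# Meaning of the lower flag bits as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def parse_sam_line(line: str) -> SamRecord:
    """Parse the first six tab-separated columns of one alignment line."""
    f = line.rstrip("\n").split("\t")
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]), f[5])


def decode_flag(flag: int) -> List[str]:
    """List the names of the flag bits that are set."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def mapq_to_error_probability(mapq: int) -> float:
    """Probability that the read is placed incorrectly: 10^(-MAPQ/10)."""
    return 10 ** (-mapq / 10)


if __name__ == "__main__":
    # Hypothetical, shortened record (not the full line from the slide).
    line = "SRR098401.104031357\t83\tchr22\t17445857\t60\t8M\t=\t17445512\t-353\tCCTTTACA\tIIIIIIII"
    rec = parse_sam_line(line)
    print(rec.rname, rec.pos, decode_flag(rec.flag))
    print(mapq_to_error_probability(rec.mapq))
```

Under these assumptions, flag 83 decodes to a properly paired, first-in-pair read aligned to the reverse strand, and a mapping quality of 60 corresponds to an error probability of 10^-6.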

Variant Call Format (VCF)

RNA-Seq Data Analysis

Computational Analysis of RNA-Seq Data
Source: Conesa et al., Genome Biology, 2016, 17:13

RNA-Seq Data Analysis Workflow
  - Illumina, Ion Torrent, PacBio
  - FastQC, FQTrim
  - STAR, HISAT, TopHat, Sailfish, Salmon
  - Cufflinks, EdgeR, DESeq
  - CuffDiff, DESeq, EdgeR, Limma
  - GSEA, IPA, DAVID, GO, etc.

Input Files for RNA-seq Analysis
Download the TestData file from the Course Page and unzip the folder.

Galaxy Server
https://usegalaxy.org/
  - A large compilation of open-source NGS data analysis tools that are accessible to users on web-based platforms
  - Data can be uploaded from a PC/Mac and computing can be done on the cloud
  - No need to install tools and maintain servers locally
  - In-depth tutorials are available to use Galaxy services
  - A list of Public Galaxy Servers is also available
  - Today's RNA-seq analysis will be performed from the following link: https://bioinf-galaxian.erasmusmc.nl/galaxy/

Phred Score (Q)
$Q = -10 \log_{10} P$, where P is the probability that the base call is wrong.

Base Sequence Quality Interpretation
Figure panels: Bad Quality; Quality drops at the tail end; Excellent Quality; Bad Quality
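The Phred relationship can be tried out in Python. This is a minimal sketch, assuming the common Phred+33 ASCII encoding (in which ! encodes Q = 0); the slides do not state the offset, and the function names are hypothetical.

```python
import math
from typing import List


def phred_to_error_probability(q: float) -> float:
    """Invert Q = -10 * log10(P): P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Q = -10 * log10(P), for an error probability 0 < P <= 1."""
    return -10 * math.log10(p)


def decode_quality_string(quality: str, offset: int = 33) -> List[int]:
    """Convert FASTQ quality characters to Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


if __name__ == "__main__":
    scores = decode_quality_string("!5I")  # Q = 0, 20 and 40 under Phred+33
    for q in scores:
        print(q, phred_to_error_probability(q))
```

Under this assumption, Q = 20 means a 1 in 100 chance that the base is wrong and Q = 40 means 1 in 10,000, which is why reads whose scores fall off at the tail end are often trimmed before mapping.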

Read Mapping and Assembly
Source: https://home.cc.umanitoba.ca/~frist/PLNT7690/lec12/lec12.3.html

Downstream Analysis of RNA-seq Results
  - Hierarchical Clustering
  - IPA: Ingenuity Pathway Analysis
  - GSEA: Gene Set Enrichment Analysis
Source: Yoo et al., Nature Genetics, 2014
Source: Li et al., Scientific Reports, 2015
Source: Graner et al., Front. Oncology, 2015
Source: Bee et al., PLoS ONE, 2011
