Computational Aspects Of Molecular Structure


Computational aspects of molecular structure
Lecture 1, Part 1: Introduction
Teresa Przytycka, Ph.D.

Why molecular structure?
The function of a molecule is determined by its 3-D structure.

What type of computational aspects?
- Biophysical principles
- Sequence-structure relation
- Structure comparison
- Secondary and 3-D structure prediction (protein and RNA)
- Protein evolution from a structural perspective
- Protein function
- Protein-protein interaction

Organization of modern organisms
[Figure: DNA is transcribed into mRNA, which is then translated.]

More detailed [...]
- DNA: sequence of nucleotides {A, T, G, C}
- RNA: sequence of {A, U, G, C}
- Protein: sequence of amino acids {A, V, L, I, G, P, ...}
- Folded protein
[Figure label: Prokaryotes]

DNA
- A sequence of nucleotides: adenine (A), cytosine (C), guanine (G) and thymine (T).
- It is double stranded: a single strand of DNA in one direction is paired to a complementary strand, forming a double helix in three dimensions.
- Hydrogen bonding between complementary base pairs (the so-called Watson-Crick base pairs) holds the two strands together. The base pairs are A-T and G-C.
- DNA has directionality that corresponds to the direction of translation. The beginning is denoted by 5' and the end by 3'.

5' AACTGC 3'
3' TTGACG 5'
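
The base-pairing rule can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the input is assumed to contain only the uppercase letters A, C, G and T.

```python
# Watson-Crick partners: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the paired strand, written in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Return the paired strand read in its own 5' to 3' direction."""
    return complement(strand)[::-1]


def transcribe(coding_strand):
    """Write a DNA strand in the RNA alphabet (T replaced by U)."""
    return coding_strand.replace("T", "U")


top = "AACTGC"                      # 5' AACTGC 3'
print(complement(top))              # bottom strand as drawn on the slide, 3' to 5'
print(reverse_complement(top))      # the same bottom strand read 5' to 3'
print(transcribe(top))              # the strand in the RNA alphabet
```

Reading the complement right to left gives the second strand in its own 5' to 3' direction, which is the form sequence databases normally store.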

RNA
- Single stranded.
- Thymine (T) is replaced by uracil (U).
- Base pairs form within the single strand.

Proteins
- Sequence of amino acids (a word over a 20-letter alphabet: A, L, V, ...), e.g. VLSGTGLVLHV...
- No complementary pairing.
- Each amino acid has its distinct properties.

Assumption: The native conformation is determined by the totality of interatomic interactions and hence by the amino acid sequence in a given environment. (Anfinsen 1960)
Hypothesis: The native fold corresponds to the conformation with the free energy minimum.
Wishful thinking: If we understand the forces driving the folding process [...], we should be able to compute the structure from the sequence.

Representations of protein structure
- Space-filling model
- Backbone diagram
- Ribbon diagram

Basis for backbone and ribbon representation
General structure of a polypeptide chain. Example: V-S-L
[Figure: polypeptide chain running from the amino terminus to the carboxyl terminus, showing the backbone atoms (N, H, Cα, C', O) of residues i and i+1 and the side chains.]

Based on crystal structures of molecules containing one or a few peptide bonds, Pauling discovered that the C'=O double bond was longer than expected [...], while the C'-N bond was shorter than expected.
Pauling's explanation: resonance between two extreme structures.
Result: Cα(i), C'(i), O(i), N(i+1), H(i+1) are coplanar.

Translation Basics: Genetic Code
- Amino acids are encoded by triplets of nucleotides called codons.
- The genetic code is redundant: there are 64 possible codons for 20 amino acids plus a special "termination" signal.
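
As an illustration of the codon lookup, here is a minimal Python sketch. It assumes the standard genetic code, written as the usual 64-letter string in T, C, A, G order with `*` marking a termination codon; the names `CODON_TABLE` and `translate` are made up for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate a coding sequence (DNA or RNA letters) codon by codon.

    Translation starts at the first letter and stops at the first
    termination codon; a trailing partial codon is ignored.
    """
    dna = sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCATAA"))
print(len(CODON_TABLE), "codons,", len(set(AMINO_ACIDS) - {"*"}), "amino acids")
```

The redundancy is visible directly in the table: several codons map to the same letter, so the mapping from codons to amino acids cannot be inverted.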

The genetic code
[Table: codons indexed by first position (U, C, A, G), second position (U, C, A, G) and third position (U, C, A, G).]

RNA
- tRNA: fundamental in translation are the so-called transfer RNA (tRNA) molecules, each linked to a specific amino acid on one side and containing the triplet complementary to the codon (the anticodon) on the other side.
- mRNA (transcription)
- "Functional" RNA
[Figure: a tRNA carrying Ala, with its anticodon paired to the codon on the mRNA; 5' and 3' ends marked.]

Basic Methods for Sequence Alignment and Similarity Search (overview)
Computational aspects of molecular structure
Lecture 1, Part b
Teresa Przytycka, Ph.D.

Assumptions:
- Biological sequences evolved by evolution.
- Evolutionarily related sequences are likely to have related functions.
- We assume that evolution of biological sequences proceeds by:
  - Substitutions
  - Insertions/Deletions
- Larger rearrangements of sequences are also possible but are modeled using different methods.

Sequence alignment
Write one sequence along the other so as to expose any similarity between the sequences. Each element of a sequence is either placed alongside of [...] the alignment score (will be discussed later).
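
To make the idea concrete, here is a minimal sketch of global pairwise alignment by dynamic programming (the Needleman-Wunsch scheme). The slides do not specify an algorithm or a scoring scheme at this point, so both the choice of method and the scores used here (match +1, mismatch -1, gap -1) are assumptions made for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # letter against letter
                              score[i - 1][j] + gap,     # letter of a against a gap
                              score[i][j - 1] + gap)     # letter of b against a gap

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


total, row_a, row_b = needleman_wunsch("GATTACA", "GATCA")
print(total)
print(row_a)
print(row_b)
```

When several alignments share the best score, the traceback returns only one of them; which one depends on the order in which the three moves are tried.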

Database Searches
- Goal: Given a protein sequence and a protein database, find in the database sequences that are homologous to the given sequence.
- Why: Homologous sequences often have the same or related function; thus, if we fish out a sequence with known function, this will provide a hint towards the function of the query protein.
- Relevant issues:
  - Speed (!)
  - Ability to assess the relevance of the results returned by the search
  - Specificity and sensitivity of the search

BLAST
Basic Local Alignment Search Tool: a family of the most popular sequence search programs, including [...]
Main idea: homologous sequences are likely to contain a short, high-scoring similarity region, a hit.
- Find two non-overlapping hits of length w (usually set to 3) with score at least T and at distance at most d from one another.
- Invoke ungapped (cheap) extension.
- If the resulting HSP (high-scoring segment pair) has a score above a certain threshold, start an extension that allows gaps (expensive extension).
- Report the resulting alignment if it has sufficiently large statistical significance (defined using the E-value; see next slide).
For a BLAST tutorial visit http://www.ncbi.nlm.nih.gov/BLAST/
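
The seed-and-extend idea can be sketched as follows. This is a deliberately simplified toy, not BLAST itself: it scores letters by identity (match +2, mismatch -1) instead of a substitution matrix, seeds from a single hit rather than two nearby hits, and omits the gapped stage. All function names and the default values of `T` and `x_drop` are assumptions for illustration.

```python
def word_score(u, v, match=2, mismatch=-1):
    """Ungapped score of two equal-length words under identity scoring."""
    return sum(match if x == y else mismatch for x, y in zip(u, v))


def find_hits(query, subject, w=3, T=6):
    """Return all (i, j) where query[i:i+w] against subject[j:j+w] scores >= T."""
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in range(len(subject) - w + 1)
            if word_score(query[i:i + w], subject[j:j + w]) >= T]


def extend_ungapped(query, subject, i, j, w=3, x_drop=3, match=2, mismatch=-1):
    """Extend a hit along its diagonal without gaps.

    Each direction stops once the running score has fallen more than
    x_drop below the best score seen so far. Returns the best score and
    the half-open query range [start, end) of the best segment.
    """
    def pair(x, y):
        return match if x == y else mismatch

    best = score = word_score(query[i:i + w], subject[j:j + w], match, mismatch)
    start, end = i, i + w

    qi, sj = i + w, j + w                      # extend to the right
    while qi < len(query) and sj < len(subject) and best - score <= x_drop:
        score += pair(query[qi], subject[sj])
        qi, sj = qi + 1, sj + 1
        if score > best:
            best, end = score, qi

    score = best                               # extend to the left
    qi, sj = i - 1, j - 1
    while qi >= 0 and sj >= 0 and best - score <= x_drop:
        score += pair(query[qi], subject[sj])
        if score > best:
            best, start = score, qi
        qi, sj = qi - 1, sj - 1

    return best, start, end


query, subject = "MKVLAAG", "TKVLAAC"
hits = find_hits(query, subject)
best, start, end = extend_ungapped(query, subject, *hits[0])
print(hits, best, query[start:end])
```

The quadratic scan in `find_hits` is only for readability; a practical search would index the words of the query or database so that hits are found by lookup.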

Significance of results
- P-value: given the length of the query sequence and the size of the database, gives the probability of finding an alignment with a certain score by chance.
- E-value: expected number of "by chance" hits of a given or higher score (also depends on database size).
- Normalized score: alignment score normalized so that alignments obtained using different scoring functions can be directly compared.
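
The slides do not give formulas for these quantities. A commonly used form, from Karlin-Altschul statistics, is E = K·m·n·exp(-λS) for raw score S, query length m and database length n, with normalized (bit) score S' = (λS - ln K)/ln 2 and P = 1 - exp(-E). The sketch below simply evaluates these expressions; the values of λ and K depend on the scoring system, and the numbers used in the example are placeholders, not values taken from the lecture.

```python
import math


def bit_score(raw_score, lam, K):
    """Normalized score: comparable across different scoring systems."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam, K):
    """Expected number of chance alignments with at least this raw score."""
    return K * query_len * db_len * math.exp(-lam * raw_score)


def p_value(e):
    """Probability of at least one chance alignment, given the E-value."""
    return -math.expm1(-e)   # 1 - exp(-e), computed accurately for small e


# Placeholder parameters for illustration only.
lam, K = 0.267, 0.041
e = e_value(raw_score=100, query_len=300, db_len=10**8, lam=lam, K=K)
print(bit_score(100, lam, K), e, p_value(e))
```

For small E the P-value and the E-value are nearly equal, which is why search reports usually quote only the E-value.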

BLAST web exercise
- Open the BLAST web server: http://www.ncbi.nlm.nih.gov/BLAST/
- Get a protein sequence to serve as a query (1bob) from Entrez.

Other database searching tools
- FASTA: basic principles are similar to BLAST, but there are significant differences.
- PSI-BLAST: a tool for finding more distant homologues; will be discussed in a later class.

Sensitivity / specificity of a database search

                              Related               Unrelated
Retrieved by the search       TP (true positive)    FP (false positive)
Not retrieved by the search   FN (false negative)   TN (true negative)

Sensitivity: TP/(TP + FN)
Specificity: TN/(TN + FP)
Positive predictive value: TP/(TP + FP)

Typically, increasing TP leads to increasing FP and decreasing FN; thus, as we change parameters to increase sensitivity, specificity goes down. This needs to be taken into account when comparing various methods.
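
The three ratios can be computed directly from the four counts. The function name and the example counts below are made up for illustration.

```python
def search_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity and positive predictive value."""
    return {
        "sensitivity": tp / (tp + fn),   # fraction of related sequences retrieved
        "specificity": tn / (tn + fp),   # fraction of unrelated sequences rejected
        "ppv": tp / (tp + fp),           # fraction of retrieved sequences that are related
    }


# Hypothetical search: 50 related and 950 unrelated sequences in the database.
strict = search_metrics(tp=30, fp=5, fn=20, tn=945)
permissive = search_metrics(tp=45, fp=95, fn=5, tn=855)
print(strict)
print(permissive)
```

In this made-up example the permissive setting retrieves more of the related sequences but accepts many more unrelated ones, which is the trade-off described above.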

This slide is by Stephen Altschul, from the talk acstalk1.ppt
Receiver Operating Characteristic curve
[Figure: ROC curve. x-axis: fraction of unrelated sequences accepted (0 to 1); y-axis: fraction of related sequences accepted (0 to 1).]

This slide is by Stephen Altschul, from the talk acstalk1.ppt
ROC score: area under the ROC curve
- Sensitivity of the search: TP/(TP + FN)
- Specificity of the search: TN/(FP + TN)
So ROC plots are plots of sensitivity vs. (1 - specificity).
[Figure: ROC curve. x-axis: fraction of unrelated sequences accepted (0 to 1); y-axis: fraction of related sequences accepted (0 to 1).]
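
A minimal sketch of how an ROC curve and its area can be computed from a ranked list of search results. It assumes the results are already sorted from best to worst score, that each is labelled as related or unrelated, and that the list contains at least one of each; the function names are illustrative.

```python
def roc_points(labels):
    """labels: booleans ordered from best to worst score (True = related).

    Returns the ROC curve as (fraction unrelated accepted,
    fraction related accepted) points, one per cutoff position.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for related in labels:
        if related:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def roc_area(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


ranked = [True, False, True, False]
print(roc_points(ranked))
print(roc_area(roc_points(ranked)))
```

A search that ranks every related sequence above every unrelated one has area 1, while a ranking no better than random has area close to 0.5.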

Multiple alignment
S1, S2, ..., Sk: a set of sequences over the same alphabet.
As for pairwise alignment, we would like to find an alignment that maximizes some scoring function:

M Q P I L L L
M L R - L - L
M P V I L I L

How to score such a multiple alignment?

Sum of pairs (SP) score
Example: consider all pairs of letters in each column and add the scores:

SP-score(A, V, V, -) = score(A,V) + score(V,V) + score(V,-) + score(A,-) + score(A,V) + score(V,-)

k sequences give k(k-1)/2 addends.
Remark: score(-,-) = 0

Entropy score: -Σ_j (c_j/C) log(c_j/C)
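
A short sketch of the sum-of-pairs score for one column and for a whole alignment. The pairwise scores used here (match +1, mismatch -1, letter against gap -2) are arbitrary assumptions; only score(-,-) = 0 comes from the slide.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Illustrative pairwise score; two gaps facing each other score 0."""
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sp_column(column):
    """Sum of pair scores over all k(k-1)/2 pairs of symbols in a column."""
    return sum(pair_score(x, y) for x, y in combinations(column, 2))


def sp_alignment(rows):
    """SP score of an alignment given as equal-length rows."""
    return sum(sp_column(column) for column in zip(*rows))


print(sp_column("AVV-"))
print(sp_alignment(["MQPILLL", "MLR-L-L", "MPVILIL"]))
```

For the column (A, V, V, -) the six pairs are (A,V), (A,V), (A,-), (V,V), (V,-) and (V,-), matching the k(k-1)/2 count for k = 4.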

Entropy based score (minimum)
-Σ_j (c_j/C) log(c_j/C)
c_j: number of occurrences of amino acid j in the column
C: number of symbols in the column

Example (the value under each column is its score):

A     A     A     A     A
A     A     A     A     I
A     A     A     A     K
A     A     A     I     L
A     A     I     I     S
A     I     I     I     W
-----------------------------
0    -.68  -.9   -1    -2.58
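
The column entropy can be checked with a few lines of Python. The sketch assumes base-2 logarithms, which reproduces the magnitudes printed under the first, third, fourth and fifth example columns (0, about 0.9, 1 and about 2.58); the printed values carry a minus sign, so they appear to be the negated entropy. For the second column (five A and one I) this computation gives about 0.65, slightly different from the printed .68.

```python
import math
from collections import Counter


def column_entropy(column):
    """-sum_j (c_j/C) * log2(c_j/C) over the symbols present in the column."""
    total = len(column)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(column).values())


rows = ["AAAAA", "AAAAI", "AAAAK", "AAAIL", "AAIIS", "AIIIW"]
for column in zip(*rows):
    print("".join(column), round(column_entropy(column), 2))
```

A perfectly conserved column has entropy 0 and a column of six different letters has the maximum, log2(6), so lower totals indicate better-conserved alignments.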

Multiple sequence alignment algorithms
- MSA
  - Authors of the program and consecutive improvements: Carrillo, Lipman, Altschul, Schaffer, Gupta, Kececioglu, ...
  - Extension of the dynamic programming approach to more sequences
  - Accurate but expensive
- CLUSTAL W
  - Authors: Higgins & Sharp
  - Produces an evolutionary tree and progressively aligns partial alignments in the order guided by the tree, from the leaves towards the root.
  - Fast but not perfect
- T-COFFEE
  - Authors: Notredame, Higgins, Heringa, JMB 2000, 302: 205-217
  - A hybrid approach, more accurate than CLUSTAL W
  - Tries to negotiate the best pairwise alignment based on several alignments and transitive closure, and then uses progressive tree-based alignment.
- MUSCLE
  - Newest algorithm, almost as accurate as T-COFFEE but fast.

Percent Accepted Mutation (PAM) unit: evolutionary time corresponding to an average of 1 mutation per 100 residues.
Two most popular classes of matrices:
- PAMn: relates to mutation probabilities in an evolutionary interval of n PAM units (PAM 120 is often used in practice).
[...]
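
One common way to obtain mutation probabilities for n PAM units is to raise the 1-PAM transition matrix to the n-th power. The sketch below shows this on a toy three-letter alphabet; the matrix `TOY_PAM1` is invented for illustration (each letter stays unchanged with probability 0.99) and is not a real amino acid matrix.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def pam(pam1, n):
    """Transition probabilities after n PAM units: the n-th power of pam1."""
    size = len(pam1)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, pam1)
    return result


# Toy 1-PAM matrix: on average 1 change per 100 positions.
TOY_PAM1 = [
    [0.990, 0.005, 0.005],
    [0.005, 0.990, 0.005],
    [0.005, 0.005, 0.990],
]

pam120 = pam(TOY_PAM1, 120)
print([round(p, 3) for p in pam120[0]])
```

Because a position can mutate more than once and can mutate back, the fraction of positions that differ after n PAM units is smaller than n percent.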
