Sequence Alignment

2y ago

66 Views

3 Downloads

448.28 KB

42 Pages

Last View : 21d ago

Last Download : 3m ago

Upload by : Julia Hutchens

Report this link

Download PDF

Transcription

Sequence alignmentlAlignment specifies which positions in two sequencesmatchacgtctag actctag-acgtctag -actctagacgtctag ac-tctag2 matches5 mismatches1 not aligned5 matches2 mismatches1 not aligned7 matches0 mismatches1 not alignedIntroduction to bioinformatics, Autumn 200741

Mutations: Insertions, deletions andsubstitutionsIndel: insertion ordeletion of a basewith respect to theancestor sequencelacgtctag -actctagMismatch: substitution(point mutation) ofa single baseInsertions and/or deletions are called indels We can’t tell whether the ancestor sequence had a base ornot at indel positionIntroduction to bioinformatics, Autumn 200742

ProblemslWhat sorts of alignments should be considered?lHow to score alignments?lHow to find optimal or good scoring alignments?lHow to evaluate the statistical significance of scores?In this course, we discuss each of these problemsbriefly.Course Biological sequence analysis tackles all four indepth.Introduction to bioinformatics, Autumn 200743

Sequence Alignment (chapter 6)lThe biological problemlGlobal alignmentlLocal alignmentlMultiple alignmentIntroduction to bioinformatics, Autumn 200744

Global alignmentlProblem: find optimal scoring alignment between twosequences (Needleman & Wunsch 1970)lEvery position in both sequences is included in the alignmentlWe give score for each position in alignmentl Identity (match) 1WHAT Substitution (mismatch)-µ IndelWH-YTotal score: sum of position scoresS(WHAT/WH-Y) 1 1 –Introduction to bioinformatics, Autumn 2007–µ45

Dynamic programminglllHow to find the optimal alignment?We use previous solutions for optimal alignments ofsmaller subsequencesThis general approach is known as dynamicprogrammingIntroduction to bioinformatics, Autumn 200746

Introduction to dynamic programming:the money change problemlllSuppose you buy a pen for 4.23 and pay for with a5 noteYou get 77 cents in change – what coins is the cashiergoing to give you if he or she tries to minimise thenumber of coins?The usual algorithm: start with largest coin(denominator), proceed to smaller coins until nochange is left: l50, 20, 5 and 2 centsThis greedy algorithm is incorrect, in the sense that itdoes not always give you the correct answerIntroduction to bioinformatics, Autumn 200747

The money change problem How else to compute the77change?50205 We could consider all possible275772ways to reduce the amount ofchange Suppose we have 77 cents7 22 7 37 52 22 52 67change, and the followingcoins: 50, 20, 5 cents We can compute the change Many values are computedwith recursionmore than once! Figure shows the recursion This leads to a correct buttree for the examplevery inefficient algorithmIntroduction to bioinformatics, Autumn 200748

The money change problemlWe can speed the computation up by solving thechange problem for all i n llExample: solve the problem for 9 cents with available coinsbeing 1, 2 and 5 centsSolve the problem in steps, first for 1 cent, then 2cents, and so onIn each step, utilise the solutions from the previousstepsIntroduction to bioinformatics, Autumn 200749

The money change problemAmount ofchange leftll0123456789Algorithm runs in time proportional to Md, where M isthe amount of change and d is the number of cointypesThe same technique of storing solutions ofsubproblems can be utilised in aligning sequencesIntroduction to bioinformatics, Autumn 200750

Representing alignments and scoresAlignments can berepresented in thefollowing tabular form.Each alignmentcorresponds to a paththrough the table.-WHAXXTWWHATH YXXWH-YIntroduction to bioinformatics, Autumn 200751

Representing alignments and scores-WHAT -WH-YHA22-T0WHGlobal alignmentscore S3,4 2- -µWYIntroduction to bioinformatics, Autumn 200712- -µ52

Filling the alignment matrix-WHAWConsider the alignment processat shaded square.Case 1. Align H against H(match or substitution).Case 1Case 2HCase 3YTCase 2. Align H in WHY against– (indel) in WHAT.Case 3. Align H in WHATagainst – (indel) in WHY.Introduction to bioinformatics, Autumn 200753

Filling the alignment matrix (2)-WHATScoring the alternatives.Case 1. S2,2 S1,1 s(2, 2)-Case 2. S2,2 S1,2WCase 1Case 2s(i, j) 1 for matching positions,HCase 3YCase 3. S2,2 S2,1s(i, j) - µ for substitutions.Choose the case (path) thatyields the maximum score.Keep track of path choices.Introduction to bioinformatics, Autumn 200754

Global alignment: formaldevelopmentA a1a2a3 an,B b1b2b3 bmb1b2b3b4--a1-a2a3Any alignment can be writtenas a unique path through thematrixlScore for aligning A and B upto positions i and j:0-1a12a23a301234-b1b2b3b4lSi,j S(a1a2a3 ai, b1b2b3 bj)Introduction to bioinformatics, Autumn 200755

Scoring partial alignmentslAlignment of A a1a2a3 an with B b1b2b3 bm can end inthree ways Case 1: (a1a2 ai-1) ai(b1b2 bj-1) bj Case 2: (a1a2 ai-1) ai(b1b2 bj) - Case 3: (a1a2 ai) –(b1b2 bj-1) bjIntroduction to bioinformatics, Autumn 200756

Scoring alignmentslScores for each case: Case 1: (a1a2 ai-1) ai(b1b2 bj-1) bj s(ai, bj) { -µ otherwiseCase 2: (a1a2 ai-1) ai(b1b2 bj) – 1 if ai bjCase 3: (a1a2 ai) –s(ai, -) s(-, bj) -(b1b2 bj-1) bjIntroduction to bioinformatics, Autumn 200757

Scoring alignments (2) First row and first columncorrespond to initial alignmentagainst indels:S(i, 0) -iS(0, j) -j0 Optimal global alignmentscore S(A, B) Sn,m-01234-b1b2b3b4-2-3-401a12a2 -23a3 -3Introduction to bioinformatics, Autumn 200758

Algorithm for global alignmentInput sequences A, B, n A , m B Set Si,0 : - i for all iSet S0,j : - j for all jfor i : 1 to nfor j : 1 to mSi,j : max{Si-1,j – , Si-1,j-1 s(ai,bj), Si,j-1 – }endendAlgorithm takes O(nm) time and space.Introduction to bioinformatics, Autumn 200759

Global alignment: example-TGGTG-2-4-6-8-10µ 1-0 2A-2T-4C-6G-8T-10Introduction to bioinformatics, Autumn 2007?60

Global alignment: example (2)-TGGTGµ 1-0-2-4-6-8-10 3-4T-10-7-4-30-2ATCGT -TGGTGIntroduction to bioinformatics, Autumn 200761

Sequence Alignment (chapter 6)lThe biological problemlGlobal alignmentlLocal alignmentlMultiple alignmentIntroduction to bioinformatics, Autumn 200762

Local alignment: rationale Otherwise dissimilar proteins may have local regions ofsimilarity- Proteins may share a functionHuman bonemorphogenic proteinreceptor type IIprecursor (left) has a300 aa region thatresembles 291 aaregion in TGFreceptor (right).The shared functionhere is protein kinase.Introduction to bioinformatics, Autumn 200763

Local alignment: rationaleABRegions ofsimilarity Global alignment would be inadequate Problem: find the highest scoring local alignmentbetween two sequences Previous algorithm with minor modifications solves thisproblem (Smith & Waterman 1981)Introduction to bioinformatics, Autumn 200764

From global to local alignmentlModifications to the global alignment algorithm Look for the highest-scoring path in the alignment matrix(not necessarily through the matrix), or in other words: Allow preceding and trailing indels without penaltyIntroduction to bioinformatics, Autumn 200765

Scoring local alignmentsA a1a2a3 an, B b1b2b3 bmLet I and J be intervals (substrings) of A and B,respectively:,Best local alignment score:where S(I, J) is the score for substrings I and J.Introduction to bioinformatics, Autumn 200766

Allowing preceding and trailingindels First row and columninitialised to zero:Mi,0 M0,j 0b1 b2 b3- - a101234-b1b2b3b4000000-1a1 02a2 03a3 0Introduction to bioinformatics, Autumn 200767

Recursion for local alignment Mi,j max {Mi-1,j-1 s(ai, bi),Mi-1,j ,Mi,j-1 0Introduction to bioinformatics, Autumn 200768

Finding best local alignment Optimal score is the highestvalue in the matrix maxi,j Mi,j Best local alignment can befound by backtracking fromthe highest value in ntroduction to bioinformatics, Autumn 200769

Local alignment: C04T05A06A07G08G0Introduction to bioinformatics, Autumn 200770

Local alignment: exampleScoringMatch: 2Mismatch: -1Indel: -2C T – A AC T C A ion to bioinformatics, Autumn 200771

Non-uniform mismatch penaltieslWe used uniform penalty for mismatches:s(’A’, ’C’) s(’A’, ’G’) s(’G’, ’T’) µlTransition mutations (A- G, G- A, C- T, T- C) areapproximately twice as frequent than transversions (A T, T- A, A- C, G- T) use non-uniform mismatchACGTpenalties collected into aA1-1-0.5-1substitution matrixC-11-1-0.5G-0.5-11-1T-1-0.5-11Introduction to bioinformatics, Autumn 200772

Gaps in alignmentlGap is a succession of indels in alignmentC T – - - A AC T C G C A AllPrevious model scored a length k gap as w(k) -kReplication processes may produce longer stretchesof insertions or deletions In coding regions, insertions or deletions of codons maypreserve functionalityIntroduction to bioinformatics, Autumn 200773

Gap open and extension penalties (2)lWe can design a score that allows the penalty openinggap to be larger than extending the gap:w(k) -ll(k – 1)Gap open cost , Gap extension costOur previous algorithm can be extended to use w(k)(not discussed on this course)Introduction to bioinformatics, Autumn 200774

Amino acid sequenceslWe have discussed mainly dna sequenceslAmino acid sequences can be aligned as wellllHowever, the design of the substitution matrix is moreinvolved because of the larger alphabetMore on the topic in the course Biological sequenceanalysisIntroduction to bioinformatics, Autumn 200775

Sequence Alignment (chapter 6)lThe biological problemlGlobal alignmentlLocal alignmentlMultiple alignmentIntroduction to bioinformatics, Autumn 200776

Multiple alignment Consider a set of nsequences on the right– Orthologous sequences fromdifferent organisms– Paralogs from multipleduplications How can we studyrelationships between tion to bioinformatics, Autumn 200777

Optimal alignment of threesequenceslllllAlignment of A a1a2 ai and B b1b2 bj can endeither in (-, bj), (ai, bj) or (ai, -)22 – 1 3 alternativesAlignment of A, B and C c1c2 ck can end in 23 – 1ways: (ai, -, -), (-, bj, -), (-, -, ck), (-, bj, ck), (ai, -, ck), (ai,bj, -) or (ai, bj, ck)Solve the recursion using three-dimensional dynamicprogramming matrix: O(n3) time and spaceGeneralizes to n sequences but impractical withmoderate number of sequencesIntroduction to bioinformatics, Autumn 200778

Multiple alignment in practicellIn practice, real-world multiple alignment problems areusually solved with heuristicsProgressive multiple alignment Choose two sequences and align them Choose third sequence w.r.t. two previous sequences andalign the third against them Repeat until all sequences have been aligned Different options how to choose sequences and scorealignmentsIntroduction to bioinformatics, Autumn 200779

Multiple alignment in practicelProfile-based progressive multiple alignment:CLUSTALW Construct a distance matrix of all pairs of sequences usingdynamic programming Progressively align pairs in order of decreasing similarity CLUSTALW uses various heuristics to contribute toaccuracyIntroduction to bioinformatics, Autumn 200780

Additional materiallllR. Durbin, S. Eddy, A. Krogh, G. Mitchison: Biologicalsequence analysisN. C. Jones, P. A. Pevzner: An introduction tobioinformatics algorithmsCourse Biological sequence analysis in Spring 2008Introduction to bioinformatics, Autumn 200781

Demonstration of the EBI web sitelEuropean Bioinformatics Institute (EBI) offers manybiological databases and bioinformatics tools athttp://www.ebi.ac.uk/Introduction to bioinformatics, Autumn 200782

Introduction to bioinformatics, Autumn 2007 49 The money change problem l We can speed the computation up by solving the change problem for all i n Example: solve the problem for 9 cents with available coins being 1, 2 and 5 cents l S

Related Documents:

Alignment REPORTER

Alignment REPORTER website www.alignment-reporter.com. 3.2.1. Installing the Windows software If using the Alignment REPORTER CD, place it in the CD-ROM drive. The Alignment REPORTER welcome screen should appear automatically. If not in possession of the CD, visit www.alignment-reporter.com to create an account and download the software.

70 Views

3y ago

sequence pattern Lecture 7: Sequence Motif Discovery - Otago

The profile matrixfor a given motif contains frequency counts for each letter at each position of the isolated conserved region. 8 Sequence logo and consensus sequence We can extract the so-called consensus sequence, i.e. the string of most frequent letters: A graphical representation of the consensus sequence is called a sequence logo:

16 Views

1y ago

Positive selection within the genomes of SARS-CoV-2 and ...

Sequence alignment To identify branch speciﬁc positive selection, it is necessary to obtain a query and a . reference alignment. We downloaded six high quality reference genomes from the . to build a sequence alignment. Next, we reﬁned the alignment using a gene by gene tation(i.e.,ORF1a .

39 Views

3y ago

Perl Pipelines - gorgonzola.cshl.edu

Perl Pipelines Using perl as bioinformatics glue Simon Prochnik with code from Scott Cain 1. Some of what cover in the course Gene prediction EST alignment to genome EST sequence Phylogenetics: paup, MrBayes, phyml, PAML multiple sequence alignment: clustalw, muscle pairwise sequence alignment:

8 Views

1y ago

Sequence comparison: Introduction and motivation

Sequence comparison: -Find the best alignment of two sequences -Find the best match (alignment) of a given sequence in a large dataset of sequences -Find the best alignment of multiple sequences Motif and gene finding Relationship between sequences -Phylogeny Clustering and classification Many many many more

10 Views

1y ago

REPORTS Alignment Uncertainty and Genomic Analysis

for each alignment (1). Estimates of phylogeny and inferences of pos-itive selection were sensitive to alignment treat-ment. Confirming previous studies showing that alignment method has a considerable effect on tree topology (12–14), we found that 46.2% of the 1502 ORFs had one or more differing trees depending on the alignment procedure used.

41 Views

3y ago

Low uncertainty alignment procedure using computer ...

This procedure was developed for the alignment of the Wide Field Corrector for the Hobby Eberly Telescope, which uses center references to provide the data for the system alignment. From previous experiments, we determined that using an alignment telescope or similar instrument would not achieve the required alignment uncertainty.

49 Views

3y ago

500750X Users Guide QX5 - Northern Tool

Insert Chuck into Alignment Port Press the Alignment Button down and hold it. Match the either of the Alignment Guides on the Chuck with the 118 Notch on the Alignment Port. Insert the Chuck. While holding the button down, slide the drill bit forward until it touches the Drill Stop and the Chuck is pushed all the way into the Alignment Port .

10 Views

1y ago

Recent Views

Request for Proposals for VILLAGE SOLICITOR

705.11 Village solicitor or city director of law - duties. “The village solicitor or city director of law shall act as the legal advisor to and attorney for the municipal corporation, and for all officers of the municipal corporation in matters relating to their official duties.

2y ago

131 Views

Appellee'S Answering Brief - Dol

APPELLEE'S ANSWERING BRIEF . KATE S. O'SCANNLAIN THOMAS TSO . Solicitor of Labor Counsel for Appellate and Special Litigation . G. WILLIAM SCOTT Associate Solicitor JEFFREY M. HAHN for Plan Benefits Security Senior Trial Attorney . U.S. Department of Labor Office of the Solicitor 200 Constitution Ave. N.W. Room N-4611 Washington, D.C. 20210

1y ago

137 Views

Alb erta Solicitor Gene nd Min of Publi c Sec

Alberta Solicitor General and Ministry of Public Security Alberta Basic Security Training Jan-14 Module One: Introduction to the Security Industry, Page 1 Module One: Introduction to the Security Industry When you decided to enroll in this course, you presumably did so because you have a desire or need to work in the security industry in Alberta.

3y ago

219 Views

REQUEST FOR PROPOSALS FOR PROFESSIONAL SERVICES FOR .

The State Solicitor is appointed by the Attorney General and reports through the Chief Deputy Attorney General. As set forth at 29 Del. C. § 2505 (b), the State Solicitor is responsible for all civil actions and matters wherein the State or its agencies or subdivisions are involved and has such powers as the Attorney General shall designate.

3y ago

193 Views

Interviewing and advising

(d) To have the solicitor’s full attention. (e) To sit in reasonable comfort. (f) No physical barriers between us and our solicitor to impede communication. (g) Not to be kept waiting. The bare minimum, then, would appear to be a comfortable, quiet room where you

3y ago

148 Views

Information for Trainee Solicitors on CV Preparation

Solicitor who trained in large law firm with a wide range of experience in banking and financial services, debt collection, commercial property and company law. EDUCATION April 2011 PPC II Completed - Qualified as a solicitor with Law Society of Ireland 2003-2007 Trinity College Dublin - BCL - 2.1 Honours

3y ago

226 Views

STATE OF TEXAS, OFFICE OF THE SPECIAL MASTER Solicitor .

Jul 05, 2019 · FREDERICK LIU Assistant to the Solicitor General US Department of Justice 950 Pennsylvania Avenue, NW Washington, D.C. 20530-0001 JAMES J. DUBOIS* R. LEE LEININGER THOMAS K. SNODGRASS U.S. Department of Justice Environment & Natural Resources Division 999 18th Street South Terrace – Suite 3

2y ago

156 Views

WILLIAM K W LEUNG & CO

Articleship with : Isadore Goldman, a top City of London firm specializing in insolvency (corporate); recovery; commercial and international litigation. 1986 – 1988 (2) Assistant Solicitor with . Wilkinson & Grist, a reputable local solicitor firm . 1990 (3) Corporate Finance Associate wit

2y ago

185 Views

Newsletter of Tweed Valley Jazz Club

SOUTHERN CROSS CREDIT UNION 2 Commercial Road, Murwillumbah Ph. 6672 2744 TWEED ENDEAVOUR CRUISES River Terrace, Tweed Heads Ph. 0755 368800 RUSSELL J BAXTER SOLICITOR N.S.W. & QLD (Honorary Club Solicitor) 28 Recreation Street, Tweed Heads Ph. 0755 992266 AN

2y ago

309 Views

IN THE HULL CROWN COURT R -v- WILLIAM FLANNIGAN

Adams, who worked for a well-known and reputable firm in Hull known as Cooper Wilkin Chapman. It is clear from the evidence that the recommendation to use this solicitor was just that; a recommendation. There was no obligation to use this solicitor and the purchasers were at liberty to use th

1y ago

140 Views

JAMES S. M. KITCHEN Suite 224 BARRISTER & SOLICITOR Airdrie AB T4B 3C3

BARRISTER & SOLICITOR. 203-304 Main St S Suite 224 . Airdrie AB T4B 3C3 . Phone: 403-667-8575 . Email: james@jsmklaw.ca [2] COVID mRNA vaccines (Pfizer and Moderna). She is also compelled to maintain both the physical and spiritual integrity of her body by asserting her God-given prior right to decline

1y ago

206 Views

Rules of Court - WordPress

and includes a Registrar, Court interpreter, bailiff, clerk, process server or other officer who is attached to a Court; "solicitor" means an advocate and solicitor as defined in section 3 of the Legal Profession Act 1976 [Act 166]; "Registry" means the Registry of the High Court, the Sessions Court or the Magistrates' Court;

1y ago

131 Views

Financial Statement for a variation of an order for a financial remedy .

If there is not enough room on the form for any particular piece of information, you may continue on an . attached sheet of paper. If you are in doubt about how to complete any part of this form you should seek legal advice. This statement is filed by (give name and address of solicitor) Solicitor's fee account no.

10m ago

111 Views

SOLICITOR - Charleston County Bar Association

101 Meeting St., Ste 400, Charleston, SC 29401. Main No: 843 958-1900 Fx:- 843 958-1905 Front Desk Line #2: 843 958-1927 . Direct Access is 843 958-**** Website: www.scsolicitor9.org. E-mail: solicitor@scsolicitor9.org. . Moncks Corner, SC 29461. Berkeley Number. 843 719-4529 . Charleston Number .843 723- 3800 ext.4529 .

8m ago

151 Views

DIGES T - academy.difc.ae

The Qualified Lawyers Transfer Scheme (QLTS) allows qualified lawyers in other jurisdictions to qualify as a solicitor in England and Wales. The English legal profession is relatively open to international lawyers seeking to qualify as a solicitor and it does not impose restrictions to admission on grounds of nationality or residence.

7m ago

74 Views

Sequence Alignment

It looks like you're using an ad-blocker