Gap Penalties

1y ago
71 Views
2 Downloads
233.52 KB
19 Pages
Last View : 7d ago
Last Download : 15d ago
Upload by : Farrah Jaffe
Transcription

Gap PenaltiesCMSC 423

General Gap CAThese have the same score, but the second one is often moreplausible.A single insertion of “GAAT” into the first string could changeit into the second. Now, the cost of a run of k gaps is gap k It might be more realistic to support general gap penalty, so thatthe score of a run of k gaps is gap(k) gap k. Then, the optimization will prefer to group gaps together.

General Gap CAPrevious DP no longer works with general gap penalties becausethe score of the last character depends on details of the -Instead, we need to “know” how long a final run of gaps is inorder to give a score to the last subproblem.

Three MatricesWe now keep 3 different matrices:M[i,j] score of best alignment of x[1.i] and y[1.j] ending with a charactercharacter match or mismatch.X[i,j] score of best alignment of x[1.i] and y[1.j] ending with a space in X.Y[i,j] score of best alignment of x[1.i] and y[1.j] ending with a space in Y.8 M [i 1, jM [i, j] match(i, j) max X[i 1, j :Y [i 1, j(M [i, jX[i, j] maxY [i, j(M [iY [i, j] maxX[i1]1]1]k] gap(k)k] gap(k)for 1 k jfor 1 k jgap(k)gap(k)for 1 k ifor 1 k ik, j]k, j]

The M MatrixWe now keep 3 different matrices:M[i,j] score of best alignment of x[1.i] and y[1.j] ending with a charactercharacter match or mismatch.X[i,j] score of best alignment of x[1.i] and y[1.j] ending with a space in X.Y[i,j] score of best alignment of x[1.i] and y[1.j] ending with a space in Y.By definition, alignmentends in a match.8 M [i 1, jM [i, j] match(i, j) max X[i 1, j :Y [i 1, jAny kind of alignment isallowed before the match.AG1]1]1]

The X (and Y) matriceskixG--- ACGT Gyj-k(M [i, jX[i, j] maxY [i, jixyjk] gap(k)k] gap(k)kG--- -CGT Gj-kjk decides how long tomake the gap.We have to make thewhole gap at once in orderto know how to score it.for 1 k jfor 1 k j

The X (and Y) matriceskixG--- ACGT Gyj-k(M [i, jX[i, j] maxY [i, jixyjk] gap(k)k] gap(k)kG--- -CGT Gj-kjk decides how long tomake the gap.We have to make thewhole gap at once in orderto know how to score it.for 1 k jfor 1 k jThis case is automaticallyhandled.kixy---- GCGT Gj-kj

Running Time for Gap Penalties8 M [i 1, jM [i, j] match(i, j) max X[i 1, j :Y [i 1, j(M [i, jX[i, j] maxY [i, j(M [iY [i, j] maxX[i1]1]1]k] gap(k)k] gap(k)for 1 k jfor 1 k jgap(k)gap(k)for 1 k ifor 1 k ik, j]k, j]Final score is max {M[n,m], X[n,m],Y[n,m]}.How do you do the traceback?Runtime: Assume X Y n for simplicity: 3n2 subproblems2n2 subproblems take O(n) time to solve (because we have to try all k)O(n3) total time

Affine Gap Penalties O(n3) for general gap penalties is usually too slow. We can still encourage spaces to group together using a special caseof general penalties called affine gap penalties:gap start the cost of starting a gapgap extend the cost of extending a gap by one more space Same idea of using 3 matrices, but now we don’t need to search overall gap lengths, we just have to know whether we are starting a newgap or not.

Affine Gap Penalties8 M [i 1, j 1]M [i, j] match(i, j) max X[i 1, j 1] match:Y [i 1, j 1]betweenx and y8 gap start gap extend M [i, jX[i, j] max gap extend X[i, j 1] :gap in xgap start gap extend Y [i, j8 gap start gap extend M [iY [i, j] max gap start gap extend X[i :gap in ygap extend Y [i 1, j]If previousalignment ends inmatch, this is anew gap1]1]1, j]1, j]

Affine Gap as Finite State Machinematch(i,j)Mgs gegs gematch(i,j)match(i,j)geYgs gegs geXge

Affine Base Cases (Global) M[0, i] “score of best alignment between 0 characters of x and icharacters of y that ends in a match” - because no such alignmentcan exist. X[0, i] “score of best alignment between 0 characters of x and icharacters of y that ends in a gap in x” gap start i gap extendbecause this alignment looks like:--------yyyyyyyyy X[i, 0] “score of best alignment between i characters of x and 0characters of y that ends in a gap in X” - xxxxxxxxx---------- not allowedM[i, 0] M[0, i] and Y[0, i] and Y[i,0] are computed using the same logicas X[i,0] and X[0,i]

Affine Gap Runtime 3mn subproblems Each one takes constant time Total runtime O(mn): back to the run time of the basic running time.Traceback Arrows now can point between matrices. The possible arrows are given, as usual, by the recurrence. E.g. What arrows are possible leaving a cell in the M matrix?

Why do you “need” 3 matrices? Alternative WRONG algorithm:M[i][j] max(M[i-1][j-1] cost(x[i], y[i]),M[i-1][j] gap (gap start if Arrow[i-1][j] ! ),M[j][i-1] gap (gap start if Arrow[i][j-1] ! ))WRONG Intuition: we only need to know whether we are starting a gap orextending a gap.The arrows coming out of each subproblem tell us how the best alignment ends, so wecan use them to decide if we are starting a new gap.The best alignmentup to this cell endsin a gap.The best alignmentup to this cell endsin a match.PROBLEM: The best alignment for stringsx[1.i] and y[1.j] doesn’t have to be usedin the best alignment betweenx[1.i 1] and y[1.j 1]

Why 3 Matrices: Examplematch 10, mismatch -2, gap -7, gap start -15CARTCA-TOPT(4, 3) optimal score 30 - 15 - 7 8CARTSCA-T-WRONG(5, 3) 30 - 15 - 7 - 15 - 7 -14CARTSCAT--OPT(5, 3) 20 - 2 - 15 - 14 -11this is why we need to keep the X and Y matrices around.they tell us the score of ending with a gap in one of the sequences.

Side Note: Lower Bounds Suppose the lengths of x and y are n. Clearly, need at least Ω(n) time to find their global alignment(have to read the strings!) The DP algorithms show global alignment can be done in O(n2) time.

Side Note: Lower Bounds Suppose the lengths of x and y are n. Clearly, need at least Ω(n) time to find their global alignment(have to read the strings!) The DP algorithms show global alignment can be done in O(n2) time. A trick called the “Four Russians Speedup” can make a similar dynamicprogramming algorithm run in O(n2 / log n) time. We probably won’t talk about the Four Russians Speedup.The important thing to remember is that only one of the four authors is Russian.(Alrazarov, Dinic, Kronrod, Faradzev, 1970) Open questions: Can we do better? Can we prove that we can’t dobetter? No one knows.

Recap Local alignment: extra “0” case. General gap penalties require 3 matrices and O(n3) time. Affine gap penalties require 3 matrices, but only O(n2) time.

What you should know by now. Dynamic programming framework Global & local sequence alignment algorithms with basic gappenalties Alignment with general gap penalties Alignment with affine gap penalties Longest common subsequence (board lecture) Subset Sum (board lecture)

General Gap Penalties Now, the cost of a run of k gaps is gap k It might be more realistic to support general gap penalty, so that the score of a run of k gaps is gap(k) gap k. Then, the optimization will prefer to group gaps together. AAAGAATTCA A-A-A-T-CA AAAGAATTCA AAA----TCA vs. Th

Related Documents:

Traditionally, a skills gap analysis is undertaken using paper-based assessments and supporting interviews; however, technological advancements, such as skill management software, are allowing large companies to administer a skills gap analysis without using a significant proportion of human resources (Antonucci and d’Ovidio, 2012).File Size: 778KBPage Count: 24Explore furtherSkills gap analysis template - Skills for Care - Homewww.skillsforcare.org.uk40 Gap Analysis Templates & Exmaples (Word, Excel, PDF)templatelab.comConducting A Gap Analysis: A Four-Step Templatewww.clearpointstrategy.com(PDF) Gap Analysis - ResearchGatewww.researchgate.net30 FREE Gap Analysis Templates & Examples - TemplateArchivetemplatearchive.comRecommended to you b

3. Statutory Gender Pay Gap Report 2019 In this section is reported the Statutory Gender Pay Gap, the Gender Pay Gap (Excluding Casual Staff), and a review of Bonus Pay. A positive black number, means that there is a pay gap in favour of men, whereas a negative red number means that there is a pay gap in favour of women. 3.1. Statutory Gender .

Gleeds Gender Pay Gap Report 2019 Gleeds figures 2018 PAY GAP This table shows the mean and median pay gap between men and women, based on hourly rates of pay and presented relative to men’s earnings. The median gender pay gap differs from the mean as it shows the mid-point of data, rather than the average. BONUS GAP

GAP Service Quality Model showed the key insights gained through the executive interviews and focus group interviews about the service quality concept. The gaps revealed by the executive interviews were shown in the marketer side (GAP 1, GAP 2, GAP 3, GAP 4), and the GAP 5 which was .

the presence of a gap is larger than this perimeter. Solar beams that fall into the gap can penetrate in the gap edge area, thereby increasing the actual gap area and thus the area where tree saplings can regeneration. Forest Gap Figure 1: Schematic representation of radiation in a forest gap. The abbreviations are explained in the text.

About Close the Gap Idaho: Close the Gap Idaho is a network of over 5,000 organizations and individuals statewide, working to support a complete solution to the coverage gap and to preserve health coverage for Idahoans. Close the Gap Idaho has led the effort to expand Medicaid in Idaho since 2014. A list of Close the Gap Idaho

Medicare Readmission Penalties By Hospital, Year 4 Medicare will apply these readmissions penalties to reimbursements from Oct. 1, 2015, through Sept. 30, 2016,

of branched rough paths introduced in (J. Differential Equations 248 (2010) 693–721). We first show that branched rough paths can equivalently be defined as γ-Hölder continuous paths in some Lie group, akin to geometric rough paths. We then show that every branched rough path can be encoded in a geometric rough path. More precisely, for every branched rough path Xlying above apathX .