Expectation Maximization - UConn Health


Expectation Maximization

Expectation-Maximization (E-M)

If the underlying governing pdf is known only in its general form, and there may or may not be missing data as well, we need E-M:
– To reconstruct the underlying pdf
– To find missing data based on the underlying pdf

Example: Missing Normally Distributed Data

We have 4 data points from a Gaussian distribution with unit variance. 2 data points are missing:
– 5
– 11
– x
– x

We need to infer the parameters of the density from which the observations were drawn. We are given one of the parameters in this case: variance = 1. We seek the most likely mean μ for the Gaussian density function; i.e., we seek μ_ML. Once we find the mean of the underlying Gaussian distribution (μ_ML), we can generate the missing data.

We can take advantage of properties of Gaussian densities to make life easier:
– The value (arg) of μ that yields the least squared error Σ_{i=1}^{n} (x_i - μ)² is simply the sample mean (1/n) Σ_{i=1}^{n} x_i

So, to maximize the likelihood estimate of the parameter μ_ML, it is necessary and sufficient to use the sample mean for μ:

μ_ML = arg min_μ Σ_{i=1}^{n} (x_i - μ)²
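This is the standard least-squares fact: differentiating the squared error with respect to μ and setting the derivative to zero recovers the sample mean.

```latex
\frac{d}{d\mu}\sum_{i=1}^{n}(x_i-\mu)^2
  = -2\sum_{i=1}^{n}(x_i-\mu) = 0
\quad\Longrightarrow\quad
\mu_{ML} = \frac{1}{n}\sum_{i=1}^{n}x_i
```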

Strategy
– Guess the parameter μ
– Expectation: Find the expected data, given the parameter
– Maximization: Find the most likely parameter, given the data, by finding the sample mean
– Repeat and converge to a solution for μ_ML

Example: Missing Normally Distributed Data

The first step is to guess at a mean μ for the pdf. Guess μ = 0. The expectation of a normally distributed value is the mean (expectation step):
– 5
– 11
– x
– x

Now fill in the missing data using the expected mean (expectation step): 5, 11, 0, 0

Now re-estimate the mean by using the data (maximization step): (5 + 11 + 0 + 0)/4 = 4

Now fill in the missing data using the new expected mean (expectation step): 5, 11, 4, 4

Now re-estimate the mean by using the data (maximization step): (5 + 11 + 4 + 4)/4 = 6

Now fill in the missing data using the new expected mean (expectation step): 5, 11, 6, 6

Now re-estimate the mean by using the data (maximization step): (5 + 11 + 6 + 6)/4 = 7

Now fill in the missing data using the new expected mean (expectation step): 5, 11, 7, 7

Now re-estimate the mean by using the data (maximization step): (5 + 11 + 7 + 7)/4 = 7.5

μ_ML values for subsequent iterations: 7.75, 7.875, 7.9375, 7.96875, 7.984375, 7.992188, 7.996094, 7.998047, 7.999023, 7.999512, 7.999756, …

The iteration is guaranteed to converge to a local minimum of the squared error (equivalently, a local maximum of the likelihood); here it converges to μ_ML = 8.
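The loop above can be sketched in a few lines of Python (the function and parameter names are mine, not from the slides): fill in the missing points with the current mean, then re-estimate the mean as the sample mean.

```python
def em_missing_data(observed, n_missing, mu0=0.0, iters=20):
    """Alternate between filling in missing points with the current mean
    (expectation step) and re-estimating the mean as the sample mean
    (maximization step), as in the slides' 5/11/x/x example."""
    mu = mu0
    for _ in range(iters):
        data = list(observed) + [mu] * n_missing  # E-step: expected values
        mu = sum(data) / len(data)                # M-step: sample mean
    return mu

mu = em_missing_data([5, 11], 2)  # approaches the fixed point mu = 8
```

With observed points 5 and 11 and two missing points, each iteration computes mu' = (16 + 2·mu)/4 = 4 + mu/2, whose fixed point is 8, matching the slides' sequence 0, 4, 6, 7, 7.5, 7.75, ….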

Types of Problems for E-M

Parametric
– We need to find the parameters of the pdf

Non-Parametric (most common)
– We need to find the pdf itself

Parametric Problem Example

(After Mitchell, Machine Learning)

Parametric Example

There are 2 Gaussian processes, mixing. We don't see the processes, nor do we know the parameters. (In the example, for simplicity, we will assume unit variance for each, so only the means are the unknown parameters.) We have observations, but we don't know to which of the two specific processes each observation belongs.

EXAMPLE: GAUSSIAN MIXTURE

[Figure: observations drawn from Sources A and B; the parameters of Source A and Source B are unknown]

We need to infer the parameters of the densities from which the observations were drawn.

EM

Typically, we have data generated by multiple (say, WLOG, 2) Gaussian probability densities, A and B. The data are a1, a2, a3, …, b1, b2, …, but the difficulty is that we only observe the unlabeled data x1, x2, x3, …, where the random variable x could be either an a or a b. We need to figure out the parameters of the entire system, i.e., which parameters of A and B are most likely to have generated the data. If the system is Gaussian, we are looking for μ and σ² for each of the generating densities.

EM: The Problem Statement

As stated, we are given the observations x1, x2, x3, …, where the x are unlabeled data from A and B, and we are looking for the parameter(s) θ of the densities. The labels that associate each x with either an a or a b are the hidden data.

A Mystery

We don't know the labels. But… if we knew the parameters of each of the mother densities, we could calculate the probability that each x came from A and the probability that it came from B, and choose the more probable.

A Solution

We have only the data. How can we know what's what?

Answer: Assume there are labels for each data point x_i. These labels, z_{i,1}, z_{i,2}, …, z_{i,j}, are associated with all j mother densities (in this case, j = 2), each telling the probability that the jth density generated the data point.

Expectation Maximization

So we need a 2-step process:
1. Determine the expectations of the hidden variables, given the parameters: p(Data | Parameters)
2. Determine the best parameters, given the data: p(Parameters | Data)

This is the likelihood of the data, and we wish to maximize this likelihood.

EM: In Summary

We need to optimize the expression P(Obs, Hidden | θ), given only P(Obs | θ), by choosing the best θ.

Getting Started

While we don't know the initial values of the parameters, we can guess. If the guess is reasonable, this process can be shown to converge to a local maximum likelihood (Dempster, Laird, Rubin).

EXAMPLE

The Gaussian Mixture already presented. Let the data set consist of a set of points [(x_i, z_{1i}, z_{2i})]. Here, the observed data point is x and the hidden data are the z's. The z_{i,j} represent the probability that x_i came from the jth density.

(Example after Mitchell, Machine Learning)

Given 2 processes A, B (W.L.O.G.):

Expectation: E[z_A] = P(x generated by the process with mean μ_A) / P(x generated by either the process with mean μ_A or the process with mean μ_B)

Q: What is the probability of any x_i given process (μ_j, σ²)?

Answer for this particular model (Gaussian Mixture):

p(x_i | μ_j) = (1 / √(2πσ²)) e^(-(x_i - μ_j)² / (2σ²))

EXPECTATION Step

E[z_{i,j}] = p(x = x_i | μ_j) / Σ_{n=1}^{2} p(x = x_i | μ_n)

Notice we are normalizing.

Since it is a Gaussian distribution, we can define the probabilities concisely:

E[z_{i,j}] = e^(-(x_i - μ_j)² / (2σ²)) / Σ_{n=1}^{2} e^(-(x_i - μ_n)² / (2σ²))

The denominator effectively 'normalizes' the numerator, making it a probability measure.

MAXIMIZATION STEP

So, the Maximum Likelihood estimate of the jth parameter (jth mean) is the normalized sum of the value of each sample, weighted by the probability of each sample:

μ_j ← Σ_{i=1}^{n} E[z_{i,j}] x_i / Σ_{i=1}^{n} E[z_{i,j}]

(compare the unweighted sample mean (1/n) Σ_{i=1}^{n} x_i)

MAXIMIZATION STEP, explained

For the jth parameter (jth mean):

μ_j ← Σ_{i=1}^{n} E[z_{i,j}] x_i / Σ_{i=1}^{n} E[z_{i,j}]

– Numerator: each observed data point, weighted by its expectation under the jth parameter
– Denominator: the sum of the expectations
– The result is a weighted mean of the observed data
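The E-step and M-step above combine into a short sketch, assuming unit variance and two components; the function name and the test data are mine, not from the slides.

```python
import math

def em_gmm_2(xs, mu_init=(0.0, 1.0), sigma=1.0, iters=50):
    """EM for a mixture of two equal-variance Gaussians, estimating
    only the two means, as in the Mitchell-style example."""
    mu = list(mu_init)
    for _ in range(iters):
        # E-step: responsibilities E[z_ij] via normalized Gaussian weights
        resp = []
        for x in xs:
            w = [math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) for m in mu]
            s = sum(w)
            resp.append([wi / s for wi in w])
        # M-step: each mean becomes the responsibility-weighted sample mean
        for j in (0, 1):
            num = sum(r[j] * x for r, x in zip(resp, xs))
            den = sum(r[j] for r in resp)
            mu[j] = num / den
    return mu
```

For example, points drawn near 0 and near 10, started from the guess (0, 5), settle onto means near 0 and 10 within a few iterations.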

DEMO

http://bioinformatics.uchc.edu/Bioinformatics tools/EMdemo.asp

[Figure: histogram of a single sample of 1000 points drawn from the random number generator used in the demo]

Non-Parametric Example

(After Karp, R., University of Washington)

The Task

We observe blood types in a group of people:
– These types are A, B, AB, and O
– (The blood types determined by the Blood Bank are the phenotypes)
– The phenotypes are the observed data

The Task: Infer the frequencies (i.e., a discrete pdf) of the blood type alleles A, B, and O, using known principles of genetics, by means of the hidden data.

What's missing?

We need to know the genotypes which underlie the phenotypic expression:
– The possible genotypes are AA, AO, OA, AB, BB, BO, OB, OO
– These genotypes are the hidden data

Experiment: Observe Phenotypes

Determine the blood type of 30 people. Sample:
– Type A: 16
– Type B: 2
– Type AB: 1
– Type O: 11

Create a matrix for the allele probabilities

Start with a guess at the probabilities. Rows are alleles; the columns are the 'left' and 'right' allele probabilities.

Initial guess:
        left   right
p(A)     .4     .4
p(B)     .2     .2
p(O)     .4     .4

The goal is to refine the guess into a most likely estimate.

Now create a matrix for the hidden data (the genotypes), based on the first guess of the allele probabilities. Rows are phenotypes; columns are genotypes. Each row gives the probabilities of the genotypes determining that phenotype. Call this matrix H:

        AA    AO    OA    BB    BO    OB    AB    BA    OO
A      .16   .16   .16
B                        .04   .08   .08
AB                                         .08   .08
O                                                      .16

EXPECTATION

NORMALIZE each row so that the row entries represent a probability. Specifically, each entry in a row represents the probability that the entry's genotype (column heading) led to the entry's phenotype (row heading).

Matrix normalized (e.g., the A row: each .16 divided by the row sum .48):

        AA    AO    OA    BB    BO    OB    AB    BA    OO
A      .33   .33   .33
B                        .2    .4    .4
AB                                         .5    .5
O                                                       1

MAXIMIZATION

Now we wish to recover the probabilities of the alleles being drawn from the population. (Remember, this was the goal.) The probability matrix was set up as a 2-column matrix:
– Column 1 was the probability of recovering the 'left' allele from the population*
– Column 2 was the probability of recovering the 'right' allele from the population*

*In this particular example, the left and right allele probabilities are equal.

MAXIMIZATION

Now, ask the question: Which entries have contributed to a specific left allele, say, A? (We will ask and answer the same question for all alleles, both left and right.)

Answer: For the 'left' A, it is:
– row 1 (A phenotype), col 1 (genotype AA)
– row 1 (A phenotype), col 2 (genotype AO)
– row 3 (AB phenotype), col 7 (genotype AB)

[The normalized matrix again, with the entries contributing to the left A allele highlighted: row A, columns AA and AO; row AB, column AB]

MAXIMIZATION

So, compute the (unnormalized) weight of the left A allele as follows:
– Sum of the contributing entries in row 1 (A phenotype): .33 + .33 = .66
  (There are 16 individuals with phenotype A)
– Sum of the contributing entries in row 3 (AB phenotype): .5
  (There is one individual with phenotype AB)

MAXIMIZATION

P(A_left) = (.66 × 16) + (0.5 × 1) = 11.06

Do the same calculation for A_right, for B_left and B_right, and for O_left and O_right.

Continuing the calculation for all alleles:

      LEFT    RIGHT
A     11.06   11.06
B      1.7     1.7
O     16.75   16.75

…then normalizing:

      LEFT    RIGHT
A     0.375   0.375
B     0.058   0.058
O     0.568   0.568

MAXIMIZATION

We thus recover the 'new' allele probability matrix. This was our goal:

        left    right
p(A)    .375    .375
p(B)    .058    .058
p(O)    .568    .568
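The whole blood-type iteration can be sketched as follows (a hypothetical helper of my own, not the course demo code): one call performs the E-step (expected allele counts under the current genotype probabilities) and the M-step (renormalizing the counts into new frequencies). Starting from the slides' guess (.4, .2, .4) and the observed counts (16 A, 2 B, 1 AB, 11 O), the first step lands close to the slides' .375/.058/.568 (the slides round intermediate values).

```python
def allele_em_step(p, counts):
    """One EM step for ABO allele frequencies.

    p: current allele-frequency guess, e.g. {"A": .4, "B": .2, "O": .4}
    counts: observed phenotype counts, e.g. {"A": 16, "B": 2, "AB": 1, "O": 11}
    """
    pA, pB, pO = p["A"], p["B"], p["O"]
    nA, nB, nAB, nO = counts["A"], counts["B"], counts["AB"], counts["O"]
    # E-step: genotype probabilities within each ambiguous phenotype
    # (phenotype A can be AA, AO, or OA; phenotype B can be BB, BO, or OB)
    denA = pA * pA + 2 * pA * pO
    denB = pB * pB + 2 * pB * pO
    # Expected allele counts: AA contributes 2 A's, AO/OA contribute 1 each
    cntA = nA * (2 * pA * pA + 2 * pA * pO) / denA + nAB
    cntB = nB * (2 * pB * pB + 2 * pB * pO) / denB + nAB
    cntO = nA * (2 * pA * pO) / denA + nB * (2 * pB * pO) / denB + 2 * nO
    # M-step: normalize expected counts (total = 2 x number of people)
    total = cntA + cntB + cntO
    return {"A": cntA / total, "B": cntB / total, "O": cntO / total}

p = {"A": 0.4, "B": 0.2, "O": 0.4}
counts = {"A": 16, "B": 2, "AB": 1, "O": 11}
for _ in range(30):
    p = allele_em_step(p, counts)
```

After a few dozen iterations the frequencies stabilize, with O the most common allele and B the rarest, as the observed phenotype counts suggest.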

EM

Now, use the new Allele Probability Matrix and ITERATE! Quit when the estimates converge.

DEMO: .edu/Bioinformatics tools/EMDemo alleles.html

Many applications of EM, but we are interested in three:
– Learning transition probabilities in a Hidden Markov Model, given only the emissions (observations) as training data. There is an efficient (polynomial) special-case algorithm, the Baum-Welch Algorithm, exploiting the Markov structure of the HMM.
– Learning a most likely motif subsequence given many sequences of data, each of which contains a coding subsequence for the protein function of interest. The algorithm is called Multiple-Sequence Expectation-Maximization Motif Elicitation (MEME).
– Unsupervised cluster discovery
