
GENERAL CONVERGENT EXPECTATION MAXIMIZATION (EM)-TYPE ALGORITHMS FOR IMAGE RECONSTRUCTION

Ming Yan
Department of Mathematics, University of California, Los Angeles, Los Angeles, CA 90095, USA

Alex A.T. Bui
Department of Radiological Sciences, University of California, Los Angeles, Los Angeles, CA 90095, USA

Jason Cong
Department of Computer Sciences, University of California, Los Angeles, Los Angeles, CA 90095, USA

Luminita A. Vese
Department of Mathematics, University of California, Los Angeles, Los Angeles, CA 90095, USA

Abstract. Obtaining high quality images is very important in many areas of applied sciences, such as medical imaging, optical microscopy, and astronomy. Image reconstruction can be considered as solving the ill-posed inverse problem $y = Ax + n$, where $x$ is the image to be reconstructed and $n$ is the unknown noise. In this paper, we propose general robust expectation maximization (EM)-Type algorithms for image reconstruction. Both Poisson noise and Gaussian noise types are considered. The EM-Type algorithms are performed by iterating EM (or SART for weighted Gaussian noise) and regularization in the image domain. The convergence of these algorithms is proved in two ways: as EM algorithms with a priori information and as alternating minimization methods. To show the efficiency of the EM-Type algorithms, the application to computerized tomography reconstruction is chosen.

2010 Mathematics Subject Classification. Primary: 90C30, 94A08; Secondary: 90C26, 92C55.

Key words and phrases. Expectation maximization (EM), Richardson-Lucy, simultaneous algebraic reconstruction technique (SART), image reconstruction, total variation (TV), computerized tomography.

This work is supported by the Center for Domain-Specific Computing (CDSC) funded by the NSF Expedition in Computing Award CCF-0926127.

1. Introduction. Obtaining high quality images is very important in many areas of applied science, such as medical imaging, optical microscopy, and astronomy. For some applications such as positron emission tomography (PET) and computed tomography (CT), analytical methods for image reconstruction are available. For instance, filtered back projection (FBP) is the most commonly used method for image reconstruction from CT by manufacturers of commercial imaging equipment [41].

However, it is sensitive to noise and suffers from streak artifacts (star artifacts). An alternative to this analytical reconstruction is the iterative reconstruction technique, which is quite different from FBP. The main advantages of the iterative reconstruction technique over FBP are insensitivity to noise and flexibility [28]. The data can be collected over any set of lines, the projections do not have to be distributed uniformly in angle, and the projections can even be incomplete (limited angle). With the help of parallel computing and graphics processing units (GPUs), even iterative methods can be solved very fast. Therefore, iterative methods become more and more important, and we will focus on the iterative reconstruction technique only.

The degradation model can be formulated as a linear inverse and ill-posed problem:

(1)  $y = Ax + b + n$.

Here, $y$ is the measured data (a vector in $\mathbb{R}^M$ in the discrete case), and $A$ is a compact operator (a matrix in $\mathbb{R}^{M \times N}$ in the discrete case). For all the applications we will consider, the entries of $A$ are nonnegative and $A$ does not have full column rank. $x$ is the desired exact image (a vector in $\mathbb{R}^N$ in the discrete case), $b$ is the background emission and $n$ is the noise (both vectors in $\mathbb{R}^M$ in the discrete case). We will consider the case without background emission ($b = 0$) in this paper. Since the matrix $A$ does not have full column rank, computing $x$ directly by inverting $A$ is not reasonable because (1) is ill-posed and $n$ is unknown. Even in the case without noise ($n = 0$), there are many solutions because $A$ does not have full column rank. When there is noise in the measured data ($n \neq 0$), finding $x$ is more difficult because of the unknown $n$. Therefore, regularization techniques are needed for solving these problems efficiently.

One powerful technique for applying regularization is the Bayesian model; a general Bayesian model for image reconstruction was proposed by Geman and Geman [15] and Grenander [18]. The idea is to use a priori information about the image $x$ to be reconstructed. In the Bayesian approach, we assume that the measured data $y$ is a realization of a multi-valued random variable, denoted by $Y$, and the image $x$ is also considered a realization of another multi-valued random variable, denoted by $X$. Bayes' theorem then gives us

(2)  $p_X(x|y) = \dfrac{p_Y(y|x)\, p_X(x)}{p_Y(y)}$.

This is the conditional probability of having $X = x$ given that $y$ is the measured data. After inserting the detected value $y$, we obtain a posterior probability distribution of $X$. We can then find $x$ such that $p_X(x|y)$ is maximized, i.e., the maximum a posteriori (MAP) likelihood estimate.

In general, $X$ is assigned a Gibbs random field, which is a random variable with the probability distribution

(3)  $p_X(x) = e^{-\beta J(x)}$,

where $J(x)$ is a given energy function ($J(x)$ can be non-convex), and $\beta$ is a positive parameter. There are many different choices for $J(x)$ depending on the application. Some examples are, for instance, quadratic penalization $J(x) = \|x\|_2^2/2$ [12, 34], quadratic gradient $J(x) = \|\nabla x\|_2^2/2$ [54], total variation $J(x) = \|\nabla x\|_1$, and modified total variation [38, 40, 14, 6, 49, 51]. We also mention Good's roughness penalization $J(x) = \||\nabla x|^2/x\|_1$ [26] for a slightly different regularization.
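To make the degradation model (1) concrete, the following small Python/NumPy sketch (not from the paper; the sizes, seed, and noise level are arbitrary assumptions) simulates measured data under the two noise models considered below, with $b = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 80, 50                         # illustrative problem sizes
A = rng.random((M, N))                # nonnegative system matrix (discrete analogue of the compact operator)
A[:, -1] = A[:, 0]                    # repeat a column so that A lacks full column rank
x_true = rng.random(N)                # exact nonnegative image

y_poisson = rng.poisson(A @ x_true).astype(float)      # Poisson counts with mean (Ax)_i
y_gauss = A @ x_true + 0.05 * rng.standard_normal(M)   # additive Gaussian noise, sigma = 0.05 (assumed)
```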

We assume here that $\|\cdot\|_1$ and $\|\cdot\|_2$ are the $\ell_1$ and $\ell_2$ norms respectively, and that $\nabla x$ denotes the discrete gradient operator applied to an image $x$.

For the choice of the probability density $p_Y(y|x)$, we can choose

(4)  $p_Y(y|x) = e^{-\|Ax - y\|_2^2/(2\sigma^2)}$

in the case of additive Gaussian noise, and the minimization of the negative log-likelihood function gives us the famous Tikhonov regularization method [46]:

(5)  $\min_x \ \frac{1}{2}\|Ax - y\|_2^2 + \beta J(x)$.

If the random variable $Y$ of the detected values $y$ follows a Poisson distribution [31, 42] with expectation value provided by $Ax$ instead of a Gaussian distribution, we have

(6)  $y_i \sim \mathrm{Poisson}\{(Ax)_i\}$, i.e., $p_Y(y|x) = \prod_i \dfrac{(Ax)_i^{y_i}}{y_i!}\, e^{-(Ax)_i}$.

By minimizing the negative log-likelihood function, we obtain the following optimization problem:

(7)  $\min_{x \geq 0} \ \sum_i \big((Ax)_i - y_i \log(Ax)_i\big) + \beta J(x)$.

In this paper, we will focus on solving (5) and (7). It is easy to see that the objective functions in (5) and (7) are convex if $J(x)$ is a convex function. Additionally, with suitably chosen regularization $J(x)$, the objective functions are strictly convex, and the solutions to these problems are unique. If $J(x) = \|\nabla x\|_1$, i.e., the regularization is the total variation, the well-posedness of the regularization problems is shown in [1] and [6] for Gaussian and Poisson noise respectively. In this paper, we also consider the more general case where $J(x)$ is non-convex.

Relevant prior works are by Le et al. [31], Brune et al. [6, 7, 8], and Sidky et al. [44]. Additionally, we refer to Jia et al. [23], Jung et al. [27], Setzer et al. [39], Jafarpour et al. [22], Harmany et al. [19], and Willet et al. [48], among other work. We also refer to the Compressive Sensing Resources [55]. The differences between this work and those will be discussed later.

The paper is organized as follows. In section 2, we give a short introduction to the expectation maximization (EM) iteration, or Richardson-Lucy algorithm, used in image reconstruction without background emission, from the point of view of optimization. In section 3, we propose general EM-Type algorithms for image reconstruction without background emission when the measured data is corrupted by Poisson noise. This is based on maximum a posteriori likelihood estimation and an EM step. In this section, these EM-Type algorithms are shown to be equivalent to EM algorithms with a priori information. In addition, these EM-Type algorithms are also considered as alternating minimization methods for equivalent optimization problems, and convergence results are obtained for both convex and non-convex $J(x)$. When the noise is weighted Gaussian noise, we have similar EM-Type algorithms. The simultaneous algebraic reconstruction technique is shown to be an EM algorithm in section 4, and EM-Type algorithms for weighted Gaussian noise are introduced in section 5. In section 5, we also show the convergence analysis of EM-Type algorithms for weighted Gaussian noise via EM algorithms with a priori information and alternating minimization methods. Some numerical experiments on image reconstruction are given in section 6 to show the efficiency of the EM-Type algorithms. We end this work with a short conclusion section.
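The two variational problems (5) and (7) differ only in their data-fidelity term. The following minimal Python/NumPy sketch (an illustration, not code from the paper) evaluates the two objectives, with the regularizer J passed in as a function; the quadratic penalty used as an example is just one of the choices listed above:

```python
import numpy as np

def gaussian_objective(x, A, y, beta, J):
    """Objective of (5): (1/2)||Ax - y||_2^2 + beta * J(x)."""
    r = A @ x - y
    return 0.5 * np.dot(r, r) + beta * J(x)

def poisson_objective(x, A, y, beta, J, eps=1e-12):
    """Objective of (7): sum_i ((Ax)_i - y_i log(Ax)_i) + beta * J(x), for x >= 0."""
    Ax = np.maximum(A @ x, eps)          # guard against log(0)
    return np.sum(Ax - y * np.log(Ax)) + beta * J(x)

# Example regularizer: quadratic penalty J(x) = ||x||_2^2 / 2
J_quad = lambda x: 0.5 * np.dot(x, x)
```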

2. Expectation Maximization (EM) Iteration. A maximum likelihood (ML) method for image reconstruction based on Poisson data was introduced by Shepp and Vardi [42] in 1982 for image reconstruction in emission tomography. In fact, this algorithm was originally proposed by Richardson [37] in 1972 and Lucy [33] in 1974 for image deblurring in astronomy. The ML method solves the special case of problem (7) without the regularization term, i.e., $J(x)$ is a constant, which means we do not have any a priori information about the image. From equation (6), for given measured data $y$, we have a function of $x$, the likelihood of $x$, defined by $p_Y(y|x)$. An ML estimate of the unknown image is then defined as any maximizer $x^*$ of $p_Y(y|x)$.

By taking the negative log-likelihood, one obtains, up to an additive constant,

$f_0(x) = \sum_i \big((Ax)_i - y_i \log(Ax)_i\big)$,

and the problem is to minimize this function $f_0(x)$ on the nonnegative orthant, because we have the constraint that the image $x$ is nonnegative. In fact, we have

$f(x) = D_{KL}(y, Ax) := \sum_i \Big(y_i \log \dfrac{y_i}{(Ax)_i} + (Ax)_i - y_i\Big) = f_0(x) + C$,

where $D_{KL}(y, Ax)$ is the Kullback-Leibler (KL) divergence of $Ax$ from $y$, and $C$ is a constant independent of $x$. The KL divergence is considered a data-fidelity function for Poisson data, just as the standard least-squares term $\|Ax - y\|_2^2$ is the data-fidelity function for additive Gaussian noise. $D_{KL}(y, \cdot)$ is convex, nonnegative and coercive on the nonnegative orthant, so minimizers exist and are global.

In order to find a minimizer of $f(x)$ with the constraints $x_j \geq 0$ for all $j$, we can solve the Karush-Kuhn-Tucker (KKT) conditions [30, 29]:

$\sum_i A_{i,j}\Big(1 - \dfrac{y_i}{(Ax)_i}\Big) - s_j = 0, \quad j = 1, \cdots, N,$
$s_j \geq 0, \quad x_j \geq 0, \quad j = 1, \cdots, N,$
$s^T x = 0,$

where $s_j$ is the Lagrange multiplier corresponding to the constraint $x_j \geq 0$. By the nonnegativity of $\{x_j\}$, $\{s_j\}$ and the complementary slackness condition $s^T x = 0$, we have $s_j x_j = 0$ for every $j \in \{1, \cdots, N\}$. Multiplying the first condition by $x_j$ gives us

$\sum_i A_{i,j}\Big(1 - \dfrac{y_i}{(Ax)_i}\Big)\, x_j = 0, \quad j = 1, \cdots, N.$

Therefore, we have the following iteration scheme:

(8)  $x_j^{k+1} = \dfrac{\sum_i A_{i,j}\,\dfrac{y_i}{(Ax^k)_i}}{\sum_i A_{i,j}}\; x_j^k$.

This is the well-known EM iteration, or Richardson-Lucy algorithm, in image reconstruction, and an important property of it is that it preserves positivity: if $x^k$ is positive, then $x^{k+1}$ is also positive, provided $A$ preserves positivity. It is also shown that at each iteration $\sum_i (Ax)_i$ is fixed and equals $\sum_i y_i$. Since $\sum_i (Ax)_i = \sum_j \big(\sum_i A_{i,j}\big) x_j$, the minimizer satisfies a weighted $\ell_1$ constraint.

Shepp and Vardi showed in [42] that this is equivalent to the EM algorithm proposed by Dempster, Laird and Rubin [13].
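The multiplicative update (8) is straightforward to implement. The following Python/NumPy sketch is only illustrative (the operator A, data y, initial guess, and iteration count are placeholder assumptions, not the paper's experimental setup); `A.T @ ...` plays the role of the backprojection $\sum_i A_{i,j}(\cdot)_i$.

```python
import numpy as np

def em_richardson_lucy(A, y, x0, num_iter=100, eps=1e-12):
    """EM / Richardson-Lucy iteration (8): a multiplicative update that
    preserves positivity of the iterates (illustrative sketch)."""
    x = x0.copy()
    col_sums = A.sum(axis=0)             # sum_i A_{i,j}, assumed strictly positive
    for _ in range(num_iter):
        Ax = A @ x                       # forward projection (Ax^k)_i
        ratio = y / np.maximum(Ax, eps)  # y_i / (Ax^k)_i, guarded against division by zero
        x = x * (A.T @ ratio) / col_sums # update (8)
    return x

# Toy usage with a random nonnegative system (illustrative only)
rng = np.random.default_rng(0)
A = rng.random((60, 40))
x_true = rng.random(40)
y = rng.poisson(A @ x_true).astype(float)
x_rec = em_richardson_lucy(A, y, x0=np.ones(40))
```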

To make the terminology clear, "EM iteration" means the special EM method used in image reconstruction, while "EM algorithm" means the general EM algorithm for solving missing data problems.

3. EM-Type Algorithms for Poisson Data. The method shown in the last section is also called maximum-likelihood expectation maximization (ML-EM) reconstruction, because it is a maximum likelihood approach without any Bayesian assumption on the images. If additional a priori information about the image is given, we have the maximum a posteriori probability (MAP) approach [21, 32], which is the case with regularization term $J(x)$. Again we assume here that the detected data is corrupted by Poisson noise, and the regularization problem is

(9)  $\min_x \ E^P(x) := \beta J(x) + \sum_i \big((Ax)_i - y_i \log(Ax)_i\big)$, subject to $x_j \geq 0, \ j = 1, \cdots, N$.

This is still a convex constrained optimization problem if $J$ is convex, and we can find the optimal solution by solving the KKT conditions:

$\beta \nabla J(x)_j + \sum_i A_{i,j}\Big(1 - \dfrac{y_i}{(Ax)_i}\Big) - s_j = 0, \quad j = 1, \cdots, N,$
$s_j \geq 0, \quad x_j \geq 0, \quad j = 1, \cdots, N,$
$s^T x = 0.$

Here $s_j$ is the Lagrange multiplier corresponding to the constraint $x_j \geq 0$. By the nonnegativity of $\{x_j\}$, $\{s_j\}$ and the complementary slackness condition $s^T x = 0$, we have $s_j x_j = 0$ for every $j \in \{1, \cdots, N\}$. Thus we obtain

$\beta x_j \nabla J(x)_j + \sum_i A_{i,j}\Big(1 - \dfrac{y_i}{(Ax)_i}\Big)\, x_j = 0, \quad j = 1, \cdots, N,$

or equivalently

$\beta \dfrac{x_j}{\sum_i A_{i,j}} \nabla J(x)_j + x_j - \dfrac{\sum_i A_{i,j}\,\dfrac{y_i}{(Ax)_i}}{\sum_i A_{i,j}}\, x_j = 0, \quad j = 1, \cdots, N.$

Notice that the last term on the left hand side is an EM step (8). After plugging the EM step into the equation, we obtain

(10)  $\beta \dfrac{x_j}{\sum_i A_{i,j}} \nabla J(x)_j + x_j - x_j^{EM} = 0, \quad j = 1, \cdots, N,$

which is the optimality condition for the following optimization problem:

(11)  $\min_x \ E_1^P(x, x^{EM}) := \beta J(x) + \sum_j \Big(\sum_i A_{i,j}\Big)\big(x_j - x_j^{EM} \log x_j\big).$

Therefore we propose the general EM-Type algorithms in Algorithm 1. The initial guess $x^0$ can be any positive initial image, and $\epsilon$, chosen for the stopping criterion, is a small constant. $Num\_Iter$ is the maximum number of iterations. If $J(x)$ is constant, the second step is just $x^{k+1} = x^{k+1/2}$ and this is exactly the ML-EM from the previous section. When $J(x)$ is not constant, we have to solve an optimization problem at each iteration. In general, this problem cannot be solved analytically, and we have to use iterative methods to solve it. However, in practice, we do not have to solve it exactly to the steady state; just a few iterations are sufficient.
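For intuition, the subproblem (11) separates across pixels when the regularizer is separable. As an illustration only (not the paper's general setting), taking the quadratic penalty $J(x) = \|x\|_2^2/2$ gives $\nabla J(x)_j = x_j$, so the optimality condition (10) becomes a scalar quadratic in each $x_j$ whose positive root yields a closed-form regularization step:

```python
import numpy as np

def m_step_quadratic(x_em, col_sums, beta):
    """Solve (10) for the illustrative special case J(x) = ||x||^2/2:
    (beta/a_j) x_j^2 + x_j - x_j^EM = 0, with a_j = sum_i A_{i,j} and beta > 0.
    The positive root keeps the iterate nonnegative."""
    c = beta / col_sums
    return (-1.0 + np.sqrt(1.0 + 4.0 * c * x_em)) / (2.0 * c)
```

As $\beta \to 0$ this update reduces to $x_j = x_j^{EM}$, recovering the pure EM step; for nonseparable choices such as total variation, (11) is a weighted Poisson-type denoising problem that is handled by an inner iterative solver.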

We will show that the algorithm also converges without solving the subproblem exactly.

Algorithm 1 Proposed EM-Type algorithms.
  Input: $x^0$, $\epsilon$
  for $k = 0 : Num\_Iter$ do
    $x^{k+1/2} = EM(x^k)$ using (8),
    $x^{k+1} = \operatorname{argmin}_x E_1^P(x, x^{k+1/2})$ by solving (11),
    if $\|x^{k+1} - x^k\| / \|x^k\| < \epsilon$ then
      Break,
    end if
  end for

There are several other methods for solving this problem using the EM iteration; we mention two of them and show the differences from our method. One of the methods is a modified EM iteration [17, 35], which is

$x_j^{k+1} = \dfrac{\sum_i A_{i,j}\,\dfrac{y_i}{(Ax^k)_i}}{\sum_i A_{i,j} + \beta \nabla J(x^k)_j}\, x_j^k.$

This can be considered as solving a modified version of (10):

$\beta \dfrac{x_j}{\sum_i A_{i,j}} \nabla J(x^k)_j + x_j - x_j^{EM} = 0, \quad j = 1, \cdots, N.$

The convergence is given only when $\beta$ is small [17]. It is easy to notice that when $\beta$ is large, $x_j$ will be negative for some $j$'s and a projection onto the nonnegative cone is needed; in this case, there is no convergence result.

The other algorithm is also a two-stage method, and the difference is in the second stage [6, 8, 7]. Instead of solving (10) directly, which is a weighted Poisson denoising problem, a semi-implicit scheme is developed and the subproblem becomes a weighted Gaussian denoising problem. The semi-implicit scheme is

$\beta \dfrac{x_j^k}{\sum_i A_{i,j}} \nabla J(x)_j + x_j - x_j^{EM} = 0, \quad j = 1, \cdots, N.$

However, there is no convergence result for this algorithm, and the convergence provided in [6] is for the following damped version:

(12)  $\beta \dfrac{x_j^k}{\sum_i A_{i,j}} \nabla J(x)_j + x_j - \big(w\, x_j^{EM} + (1 - w)\, x_j^k\big) = 0, \quad j = 1, \cdots, N.$

There is a bound on $w$ at each iteration that guarantees the convergence of the damped algorithm [6], and this bound is difficult to find. In the following, we will show the convergence of our algorithm without any assumptions on $\beta$ or additional parameters. It converges to a global minimum for convex $J(x)$, and a subsequence converges to a local minimum for non-convex $J(x)$.
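A compact sketch of Algorithm 1, again in Python/NumPy and again only illustrative: it reuses the EM step from above and, for concreteness, the closed-form quadratic-penalty regularization step shown earlier; for total variation, the regularization step would instead be an inner iterative solver for (11).

```python
import numpy as np

def em_type_algorithm(A, y, x0, beta, num_iter=200, tol=1e-6, eps=1e-12):
    """Algorithm 1 (sketch): alternate one EM step (8) with the regularization
    step solving (11); here J(x) = ||x||^2/2 for brevity, beta > 0 assumed."""
    x = x0.copy()
    col_sums = A.sum(axis=0)
    for k in range(num_iter):
        # EM half-step: x^{k+1/2} = EM(x^k)
        Ax = A @ x
        x_em = x * (A.T @ (y / np.maximum(Ax, eps))) / col_sums
        # Regularization step: x^{k+1} = argmin_x E_1^P(x, x^{k+1/2})
        c = beta / col_sums
        x_new = (-1.0 + np.sqrt(1.0 + 4.0 * c * x_em)) / (2.0 * c)
        # Relative-change stopping criterion from Algorithm 1
        if np.linalg.norm(x_new - x) / max(np.linalg.norm(x), eps) < tol:
            x = x_new
            break
        x = x_new
    return x
```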

3.1. Equivalence to EM Algorithms with a priori Information. In this subsection, the EM-Type algorithms are shown to be equivalent to EM algorithms with a priori information. The EM algorithm is a general approach for maximizing a posterior distribution when some of the data is missing [13]. It is an iterative method which alternates between expectation (E) steps and maximization (M) steps. For image reconstruction, we assume that the missing data are the latent variables $\{z_{ij}\}$, describing the intensity of pixel (or voxel) $j$ observed by detector $i$. They are introduced as realizations of a multi-valued random variable $Z$, and for each $(i, j)$ pair, $z_{ij}$ follows a Poisson distribution with expected value $A_{i,j} x_j$. Then the observed data is $y_i = \sum_j z_{ij}$, because the sum of two Poisson distributed random variables also follows a Poisson distribution, whose expected value is the sum of the two expected values.

The original E-step is to find the expectation of the log-likelihood given the present variables $x^k$:

$Q(x|x^k) = E_{z|x^k, y} \log p(x, z|y).$

Then, the M-step is to choose $x^{k+1}$ to maximize the expected log-likelihood $Q(x|x^k)$ found in the E-step:

(13)  $x^{k+1} = \operatorname{argmax}_x\, E_{z|x^k, y} \log p(x, z|y) = \operatorname{argmax}_x\, E_{z|x^k, y} \log\big(p(y, z|x)\, p(x)\big)$
      $\phantom{x^{k+1}} = \operatorname{argmax}_x\, E_{z|x^k, y} \sum_{ij}\big(z_{ij} \log(A_{i,j} x_j) - A_{i,j} x_j\big) - \beta J(x)$
      $\phantom{x^{k+1}} = \operatorname{argmin}_x\, \sum_{ij}\big(A_{i,j} x_j - E_{z|x^k, y} z_{ij} \log(A_{i,j} x_j)\big) + \beta J(x).$

From (13), all we need before solving it is $\{E_{z|x^k, y}\, z_{ij}\}$. Therefore we can compute the expectation of the missing data $\{z_{ij}\}$ given the present $x^k$ and the condition $y_i = \sum_j z_{ij}$, denoting this as an E-step. Because, for fixed $i$, the $\{z_{ij}\}$ are Poisson variables with means $\{A_{i,j} x_j^k\}$ and $\sum_j z_{ij} = y_i$, the conditional distribution of $z_{ij}$ is the binomial distribution $\big(y_i, \frac{A_{i,j} x_j^k}{(Ax^k)_i}\big)$. Thus we can find the expectation of $z_{ij}$ with all these conditions by the following E-step:

(14)  $z_{ij}^{k+1} = E_{z|x^k, y}\, z_{ij} = \dfrac{A_{i,j} x_j^k\, y_i}{(Ax^k)_i}.$

After obtaining the expectations of all the $z_{ij}$, we can solve the M-step (13).

We will show that the EM-Type algorithms are exactly the described EM algorithms with a priori information. Recalling the definition of $x^{EM}$, we have

$x_j^{EM} = \dfrac{\sum_i z_{ij}^{k+1}}{\sum_i A_{i,j}}.$

Therefore, the M-step is equivalent to

$x^{k+1} = \operatorname{argmin}_x\, \sum_{ij}\big(A_{i,j} x_j - z_{ij}^{k+1} \log(A_{i,j} x_j)\big) + \beta J(x)$
$\phantom{x^{k+1}} = \operatorname{argmin}_x\, \sum_j \Big(\sum_i A_{i,j}\Big)\big(x_j - x_j^{EM} \log(x_j)\big) + \beta J(x).$
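A small numerical sanity check of (14), in the same illustrative Python/NumPy setting as the earlier sketches: the conditional expectations $z_{ij}^{k+1}$ sum over $i$ to exactly the numerator of the EM update, so $\sum_i z_{ij}^{k+1} / \sum_i A_{i,j}$ reproduces $x_j^{EM}$.

```python
import numpy as np

def e_step(A, y, x, eps=1e-12):
    """E-step (14): z_{ij}^{k+1} = A_{i,j} x_j^k y_i / (A x^k)_i."""
    Ax = np.maximum(A @ x, eps)
    return A * x[None, :] * (y / Ax)[:, None]

# Toy check that the E-step reproduces the EM update x^EM
rng = np.random.default_rng(1)
A = rng.random((30, 20)); x = rng.random(20) + 0.1
y = rng.poisson(A @ x).astype(float)
z = e_step(A, y, x)
x_em_from_z = z.sum(axis=0) / A.sum(axis=0)
x_em_direct = x * (A.T @ (y / (A @ x))) / A.sum(axis=0)
assert np.allclose(x_em_from_z, x_em_direct)
```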

We have shown that the EM-Type algorithms are EM algorithms with a priori information. The convergence of the EM-Type algorithms is shown in the next subsection via the equivalence of the EM-Type algorithms with alternating minimization methods for equivalent problems.

3.2. EM-Type Algorithms are Alternating Minimization Methods. In this section, we show that these algorithms can also be derived from alternating minimization methods applied to other problems with variables $x$ and $z$. The new optimization problem is

(15)  $\min_{x,z} \ E^P(x, z) := \sum_{ij}\Big(z_{ij} \log \dfrac{z_{ij}}{A_{i,j} x_j} + A_{i,j} x_j - z_{ij}\Big) + \beta J(x)$,
      subject to $\sum_j z_{ij} = y_i, \ i = 1, \cdots, M,$
                 $z_{ij} \geq 0, \ i = 1, \cdots, M, \ j = 1, \cdots, N.$

Here $E^P$ is used again to define the new function: $E^P(\cdot)$ means the negative log-likelihood function of $x$, while $E^P(\cdot, \cdot)$ means the new function of $x$ and $z$ defined in the new optimization problem. The constraint $x \geq 0$ is implicitly included in the formula.

Having initial guesses $x^0$, $z^0$ of $x$ and $z$, the iteration for $k = 0, 1, \cdots$ is as follows:

$z^{k+1} = \operatorname{argmin}_z\, E^P(x^k, z)$, subject to $\sum_j z_{ij} = y_i$,
$x^{k+1} = \operatorname{argmin}_x\, E^P(x, z^{k+1})$.

Firstly, in order to obtain $z^{k+1}$, we fix $x = x^k$ and easily derive

(16)  $z_{ij}^{k+1} = \dfrac{A_{i,j} x_j^k\, y_i}{(Ax^k)_i}.$

After finding $z^{k+1}$, we fix $z = z^{k+1}$ and update $x$; then we have

$x^{k+1} = \operatorname{argmin}_x\, \sum_{ij}\Big(A_{i,j} x_j + z_{ij}^{k+1} \log \dfrac{z_{ij}^{k+1}}{A_{i,j} x_j}\Big) + \beta J(x)$
$\phantom{x^{k+1}} = \operatorname{argmin}_x\, \sum_{ij}\big(A_{i,j} x_j - z_{ij}^{k+1} \log(A_{i,j} x_j)\big) + \beta J(x),$

which is the M-step (13) in section 3.1. Thus the EM-Type algorithms for (9) are alternating minimization methods for problem (15). The equivalence of problems (9) and (15)
