Fast Inexact Subspace Iteration for Generalized Eigenvalue Problems with Spectral Transformation


Technical Report, February 2010

FAST INEXACT SUBSPACE ITERATION FOR GENERALIZED EIGENVALUE PROBLEMS WITH SPECTRAL TRANSFORMATION

FEI XUE† AND HOWARD C. ELMAN‡

Abstract. We study inexact subspace iteration for solving generalized non-Hermitian eigenvalue problems with spectral transformation, with focus on a few strategies that help accelerate preconditioned iterative solution of the linear systems of equations arising in this context. We provide new insights into a special type of preconditioner with "tuning" that has been studied for this algorithm applied to standard eigenvalue problems. Specifically, we propose an alternative way to use the tuned preconditioner to achieve similar performance for generalized problems, and we show that these performance improvements can also be obtained by solving an inexpensive least squares problem. In addition, we show that the cost of iterative solution of the linear systems can be further reduced by using deflation of converged Schur vectors, special starting vectors constructed from previously solved linear systems, and iterative linear solvers with subspace recycling. The effectiveness of these techniques is demonstrated by numerical experiments.

Key words. inexact subspace iteration, tuned preconditioner, deflation, subspace recycling, starting vector

Dedicated to G. W. Stewart on the occasion of his 70th birthday.

1. Introduction. In many scientific applications, one needs to solve the generalized non-Hermitian eigenvalue problem $Av = \lambda Bv$ for a small group of eigenvalues clustered around a given shift or with largest real parts. The most commonly used approach is to employ a spectral transformation to map these eigenvalues to extremal ones so that they can be easily captured by eigenvalue algorithms. The major difficulty of the spectral transformation is that a linear system of equations involving a shifted matrix $A - \sigma B$ must be solved in each (outer) iteration of the eigenvalue algorithm. If the matrices are so large (for example, those from discretization of three-dimensional partial differential equations) that direct linear solvers based on factorization are too expensive to apply, we have to use iterative methods (inner iteration) to solve these linear systems to some prescribed tolerances. Obviously, the effectiveness of the whole algorithm with "inner-outer" structure depends strongly on that of the inner iteration. In this paper, we study a few techniques that help accelerate the iterative solution of the linear systems that arise when inexact subspace iteration is used to solve the generalized non-Hermitian eigenvalue problem with spectral transformation.

In recent years, great progress has been made in analyzing inexact algorithms, especially inexact inverse iteration, to compute a simple eigenvalue closest to some shift. References [18, 21] establish the linear convergence of the outer iteration for non-Hermitian problems, assuming that the algorithm uses a fixed shift and a sequence of decreasing tolerances for the solution of the linear systems. Inexact Rayleigh quotient iteration for symmetric matrices is studied in [33] and [27], where the authors explore how the inexactness of the solution to the linear systems affects the convergence of the outer iteration. Systematic analysis of this algorithm is given by Spence and his collaborators (see [2, 3, 4, 13, 14]) on the relation between the inner and the outer iterations, with different formulations of the linear systems, and variable shifts and tolerances for solving the linear systems.
To make the inner iteration converge more quickly, [32] provides new perspectives on preconditioning by modifying the right hand side of the preconditioned system. This idea is elaborated on in [2, 3, 4] and further refined in [13, 14], where a special type of preconditioner with "tuning" is defined and shown to greatly reduce the inner iteration counts.

This work was supported by the U.S. Department of Energy under grant DE-FG02-04ER25619, and by the U.S. National Science Foundation under grant CCF-0726017.
† Applied Mathematics and Scientific Computation Program, Department of Mathematics, University of Maryland, College Park, MD 20742 (petrinet@math.umd.edu).
‡ Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742 (elman@cs.umd.edu).

Inexact subspace iteration is a straightforward block extension of inexact inverse iteration with a fixed shift. Robbé et al. [29] establish linear convergence of the outer iteration of this algorithm for standard eigenvalue problems and show by the block-GMRES [31] convergence theory that tuning keeps the block-GMRES iteration counts roughly constant for solving the block linear systems, even though the inner solves are required to be done with increasing accuracy as the outer iteration proceeds. In this paper, this idea is extended to generalized problems and is improved by a new two-phase algorithm. Specifically, we show that tuning can be limited to just one step of preconditioned block-GMRES to get an approximate solution, after which a correction equation can be solved to a fixed relative tolerance with properly preconditioned block linear solvers where tuning is not needed. We show that the effect of tuning is to reduce the residual in a special way, and that this effect can also be achieved by other means, in particular by solving a small least squares problem. Moreover, we show that the two-phase strategy is closely related to an inverse correction scheme presented in [30, 18] and the residual inverse power method in [35].

The second phase of this algorithm, in addition to using a simplified preconditioning strategy, can also be simplified in other ways to achieve an additional reduction of inner iteration cost. We explore three techniques to attain the extra speedup:

1. Deflation of converged Schur vectors (see [34]) – Once some Schur vectors have converged, they are deflated from the block linear systems in subsequent outer iterations, so that the block size becomes smaller. This approach is independent of the way the block linear systems are solved.
2. Special starting vector – We find that the right hand sides of a few successive correction equations are often close to being linearly dependent; therefore an appropriate linear combination of the solutions of previously solved correction equations can be used as a good starting vector for the current solve (see the sketch below).
3. Subspace recycling – Linear solvers with recycled subspaces (see [28]) can be used to solve the sequence of correction equations, so that the search space for each solve does not need to be built from scratch. In addition, if the same preconditioner is used for all correction equations, the recycled subspaces available from solving one equation can be used directly for the next without being transformed by additional preconditioned matrix-vector products.

We discuss the effectiveness of these ideas and show by numerical experiments that they generally result in significant savings in the number of preconditioned matrix-vector products performed in inner iterations.
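The starting-vector idea in item 2 admits a very small implementation. The following is a minimal sketch under assumed names (R_prev and dY_prev are hypothetical bookkeeping arrays, not from the paper): the current right hand side is fitted by least squares against the previous right hand sides, and the same coefficients are applied to the stored solutions to produce a warm start.

```python
import numpy as np

def starting_vector(R_prev, dY_prev, r):
    """Least-squares starting vector for the next correction equation.

    R_prev  : n-by-m array whose columns are previous right hand sides
    dY_prev : n-by-m array with the corresponding computed solutions
    r       : current right hand side (length n)

    If r is close to R_prev @ c for some coefficients c (i.e., successive
    right hand sides are nearly linearly dependent), then dY_prev @ c is a
    good initial guess for the current solve.
    """
    c, *_ = np.linalg.lstsq(R_prev, r, rcond=None)
    return dY_prev @ c

# Tiny synthetic demo: the new right hand side is almost a combination of
# the previous ones, so the warm start already has a small residual.
rng = np.random.default_rng(0)
n, m = 200, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)
R_prev = rng.standard_normal((n, m))
dY_prev = np.linalg.solve(A, R_prev)       # pretend these were computed earlier
r = R_prev @ np.array([0.5, 0.3, 0.2]) + 1e-3 * rng.standard_normal(n)
x0 = starting_vector(R_prev, dY_prev, r)
print(np.linalg.norm(r - A @ x0) / np.linalg.norm(r))   # roughly 1e-3, not O(1)
```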
An outline of the paper is as follows. In Section 2, we describe the inexact subspace iteration for generalized non-Hermitian eigenvalue problems, restate some preliminary results taken from [29] about block decomposition of matrices, and discuss a new tool for measuring the closeness of two subspaces. In Section 3, we briefly discuss the behavior of unpreconditioned and preconditioned block-GMRES without tuning for solving the block linear systems arising in inexact subspace iteration, and present new insights into tuning that lead to our two-phase strategy for solving the block linear systems. In Section 4, we discuss deflation of converged Schur vectors, special starting vectors, and linear solvers with recycled subspaces, and the effectiveness of the combined use of these techniques for solving the block systems. Section 5 includes a series of numerical experiments to show the performance of our algorithm for problems from Matrix Market [23] and those arising from linear stability analysis of models of two-dimensional incompressible flows. We finally draw conclusions in Section 6.

2. Inexact Subspace Iteration and Preliminary Results. In this section, we review inexact subspace iteration for the generalized non-Hermitian eigenvalue problem $Av = \lambda Bv$ ($A, B \in \mathbb{C}^{n \times n}$) with spectral transformation, block Schur and eigen-decomposition of matrices, and metrics that measure the error of the current approximate invariant subspace.

2.1. Spectral Transformation and Inexact Subspace Iteration. To better understand the algorithm, we start with the definitions of the shift-invert and generalized Cayley transformations (see [24]):

(2.1)    $Av = \lambda Bv \;\Longleftrightarrow\; \mathcal{A}v = (A - \sigma B)^{-1}Bv = \frac{1}{\lambda - \sigma}\,v$    (shift-invert)
         $Av = \lambda Bv \;\Longleftrightarrow\; \mathcal{A}v = (A - \sigma_1 B)^{-1}(A - \sigma_2 B)v = \frac{\lambda - \sigma_2}{\lambda - \sigma_1}\,v$    (generalized Cayley)

The shift-invert transformation maps the eigenvalues closest to $\sigma$ to the dominant eigenvalues of $\mathcal{A} = (A - \sigma B)^{-1}B$; the generalized Cayley transformation maps the eigenvalues to the right of the line $\Re(\lambda) = \frac{\sigma_1 + \sigma_2}{2}$ to eigenvalues of $\mathcal{A} = (A - \sigma_1 B)^{-1}(A - \sigma_2 B)$ outside the unit circle, and those to the left of this line to eigenvalues inside the unit circle (assuming that $\sigma_1 > \sigma_2$). The eigenvalues of $\mathcal{A}$ with largest magnitude can be easily found by iterative eigenvalue algorithms. Once the eigenvalues of the transformed problem are found, they are transformed back to those of the original problem. Note that the eigenvectors do not change under the transformation.

Without loss of generality, we consider using $\mathcal{A} = A^{-1}B$ for inexact subspace iteration to compute the $k$ eigenvalues of $Av = \lambda Bv$ with smallest magnitude (i.e., the $k$ dominant eigenvalues of $\mathcal{A} = A^{-1}B$). This notation covers both types of operators in (2.1) with arbitrary shifts. For example, one can let $\hat{A} = A - \sigma_1 B$ and $\hat{B} = A - \sigma_2 B$, so that the generalized Cayley operator is $\mathcal{A} = \hat{A}^{-1}\hat{B}$. The algorithm is as follows:

Algorithm 1: Inexact Subspace Iteration with $\mathcal{A} = A^{-1}B$
Given $\delta > 0$ and $X^{(0)} \in \mathbb{C}^{n \times p}$ with $X^{(0)*}X^{(0)} = I$ ($k \le p$)
For $i = 0, 1, \dots$, until $k$ Schur vectors converge
  1. Compute the error $e^{(i)} = \sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)})$
  2. Solve $AY^{(i)} = BX^{(i)}$ inexactly such that the relative residual norm satisfies $\|BX^{(i)} - AY^{(i)}\|/\|BX^{(i)}\| \le \delta e^{(i)}$
  3. Perform the Schur-Rayleigh-Ritz procedure to get $X^{(i+1)}$ with orthonormal columns from $Y^{(i)}$, and test for convergence
End For

In Step 1, $\mathcal{X}^{(i)}$ is the space spanned by the current outer iterate $X^{(i)}$. The error of $X^{(i)}$ is defined as the sine of the largest principal angle between $A\mathcal{X}^{(i)}$ and $B\mathcal{X}^{(i)}$. It decreases to zero as $\mathcal{X}^{(i)}$ converges to an invariant subspace of the matrix pencil $(A, B)$. This error can be computed by MATLAB's function subspace, which is based on the singular value decomposition (see Algorithm 12.4.3 of [17]), and will be discussed in detail in Proposition 2.2.

The Schur-Rayleigh-Ritz (SRR) procedure (see Chapter 6.1 of [34]) in Step 3 will be applied to deflate converged Schur vectors. The procedure and its use for deflation will be explained in detail in Section 4. The most computationally expensive part of the algorithm is Step 2, which requires an inexact solve of the block linear system $AY^{(i)} = BX^{(i)}$. The major concern of this paper is to reduce the cost of this solve.
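For concreteness, here is a minimal dense-matrix sketch of Algorithm 1 in Python; it is an illustration under simplifying assumptions, not the authors' implementation. The inner systems are solved column by column with SciPy's gmres (a recent SciPy, where the tolerance argument is rtol) rather than with a true block solver, and the SRR step is reduced to a plain orthonormalization of $Y^{(i)}$ with no per-vector convergence test.

```python
import numpy as np
from scipy.linalg import qr, subspace_angles
from scipy.sparse.linalg import gmres

def inexact_subspace_iteration(A, B, p, delta=0.01, maxit=50, outer_tol=1e-10):
    """Minimal sketch of Algorithm 1 for the pencil (A, B); dense, real case."""
    n = A.shape[0]
    rng = np.random.default_rng(0)
    X, _ = qr(rng.standard_normal((n, p)), mode='economic')
    for _ in range(maxit):
        # Step 1: e = sine of the largest principal angle between A*X and B*X
        e = np.sin(subspace_angles(A @ X, B @ X)).max()
        if e < outer_tol:                    # simplified convergence test
            break
        # Step 2: solve A Y = B X inexactly, relative tolerance delta * e
        RHS = B @ X
        Y = np.empty_like(RHS)
        for j in range(p):
            Y[:, j], _ = gmres(A, RHS[:, j], rtol=delta * e, atol=0.0)
        # Step 3: simplified SRR step, orthonormalize Y to get X^{(i+1)}
        X, _ = qr(Y, mode='economic')
    return X

# The returned X approximates the invariant subspace belonging to the p
# eigenvalues of A v = lambda B v of smallest magnitude (dominant for A^{-1}B).
```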

2.2. Block eigen-decomposition. To briefly review the basic notation and description of the generalized eigenvalue problem, we restate some results from [29] on block eigen-decomposition of matrices and study a new tool to measure the error of $X^{(i)}$ for generalized problems. To simplify our exposition, we assume that $B$ is nonsingular. This assumption is valid for problems arising in a variety of applications. In addition, though $B$ is only positive semi-definite in linear stability analysis of incompressible flows, one can instead solve some related eigenvalue problems with nonsingular $B$ that share the same finite eigenvalues as the original problem; see [5] for details.

As $B$ is nonsingular, we assume that the eigenvalues of $B^{-1}A$ are ordered so that $0 < |\lambda_1| \le |\lambda_2| \le \dots \le |\lambda_p| < |\lambda_{p+1}| \le \dots \le |\lambda_n|$. The Schur decomposition of $B^{-1}A$ can be written in block form as

(2.2)    $B^{-1}A = [V_1, V_1^{\perp}] \begin{bmatrix} T_{11} & T_{12} \\ 0 & T_{22} \end{bmatrix} [V_1, V_1^{\perp}]^{*}$,

where $[V_1, V_1^{\perp}]$ is a unitary matrix with $V_1 \in \mathbb{C}^{n \times p}$ and $V_1^{\perp} \in \mathbb{C}^{n \times (n-p)}$, $T_{11} \in \mathbb{C}^{p \times p}$ and $T_{22} \in \mathbb{C}^{(n-p) \times (n-p)}$ are upper triangular, $\lambda(T_{11}) = \{\lambda_1, \dots, \lambda_p\}$ and $\lambda(T_{22}) = \{\lambda_{p+1}, \dots, \lambda_n\}$. Since $T_{11}$ and $T_{22}$ have disjoint spectra, there is a unique solution $Q \in \mathbb{C}^{p \times (n-p)}$ of the Sylvester equation $QT_{22} - T_{11}Q = T_{12}$ (see Section 1.5, Chapter 1 of [34]). Then $B^{-1}A$ can be transformed to block-diagonal form as follows:

(2.3)    $B^{-1}A = [V_1, V_1^{\perp}] \begin{bmatrix} I & Q \\ 0 & I \end{bmatrix} \begin{bmatrix} T_{11} & 0 \\ 0 & T_{22} \end{bmatrix} \begin{bmatrix} I & -Q \\ 0 & I \end{bmatrix} [V_1, V_1^{\perp}]^{*}$
         $= [V_1, (V_1 Q + V_1^{\perp})] \begin{bmatrix} T_{11} & 0 \\ 0 & T_{22} \end{bmatrix} [(V_1 - V_1^{\perp}Q^{*}), V_1^{\perp}]^{*}$
         $= [V_1, (V_1 Q + V_1^{\perp})Q_D^{-1}] \begin{bmatrix} T_{11} & 0 \\ 0 & Q_D T_{22} Q_D^{-1} \end{bmatrix} [(V_1 - V_1^{\perp}Q^{*}), V_1^{\perp}Q_D]^{*}$
         $= [V_1, V_2] \begin{bmatrix} K & 0 \\ 0 & M \end{bmatrix} [W_1, W_2]^{*} = [V_1, V_2] \begin{bmatrix} K & 0 \\ 0 & M \end{bmatrix} [V_1, V_2]^{-1}$,

where $Q_D = (I + Q^{*}Q)^{1/2}$, $V_2 = (V_1 Q + V_1^{\perp})Q_D^{-1}$ with orthonormal columns, $K = T_{11}$, $M = Q_D T_{22} Q_D^{-1}$ with the same spectrum as $T_{22}$, $W_1 = V_1 - V_1^{\perp}Q^{*}$ and $W_2 = V_1^{\perp}Q_D$, such that $[W_1, W_2]^{*} = [V_1, V_2]^{-1}$. From the last expression of (2.3), we have

(2.4)    $AV_1 = BV_1 K$, and $AV_2 = BV_2 M$.

Recall that we want to compute $V_1$ and the corresponding eigenvalues (the spectrum of $K$) by inexact subspace iteration.

2.3. Tools to measure the error. The basic tool to measure the deviation of $\mathcal{X}^{(i)} = \mathrm{span}\{X^{(i)}\}$ from $\mathcal{V}_1 = \mathrm{span}\{V_1\}$ is the sine of the largest principal angle between $\mathcal{X}^{(i)}$ and $\mathcal{V}_1$, defined as (see [29] and references therein)

(2.5)    $\sin\angle(\mathcal{X}^{(i)}, \mathcal{V}_1) = \|(V_1^{\perp})^{*}X^{(i)}\| = \|X^{(i)}(X^{(i)})^{*} - V_1 V_1^{*}\| = \min_{Z \in \mathbb{C}^{p \times p}} \|X^{(i)} - V_1 Z\| = \min_{Z \in \mathbb{C}^{p \times p}} \|V_1 - X^{(i)}Z\|$.

This definition depends on the fact that both $X^{(i)}$ and $V_1$ have orthonormal columns. We assume that $X^{(i)}$ has the following decomposition

(2.6)    $X^{(i)} = V_1 C^{(i)} + V_2 S^{(i)}$,

where $C^{(i)} = W_1^{*}X^{(i)} \in \mathbb{C}^{p \times p}$ and $S^{(i)} = W_2^{*}X^{(i)} \in \mathbb{C}^{(n-p) \times p}$. Intuitively, $\|S^{(i)}\| \to 0$ as $\mathcal{X}^{(i)} \to \mathcal{V}_1$.
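The decompositions (2.2)-(2.6) can be verified numerically in a few lines. The following sketch (random complex test matrices, illustrative only, not from the paper) builds the sorted Schur form of $B^{-1}A$, obtains $Q$ from the Sylvester equation with scipy.linalg.solve_sylvester, forms $V_2$, $W_1$, $W_2$ as in (2.3), and checks (2.4) and (2.6); it assumes the magnitude gap $|\lambda_p| < |\lambda_{p+1}|$, which holds generically for random complex matrices.

```python
import numpy as np
from scipy.linalg import schur, solve_sylvester, sqrtm

rng = np.random.default_rng(1)
n, p = 12, 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) + n * np.eye(n)
BA = np.linalg.solve(B, A)                        # B^{-1} A, B nonsingular

# Ordered Schur form (2.2): the p smallest-magnitude eigenvalues come first
mags = np.sort(np.abs(np.linalg.eigvals(BA)))
cut = 0.5 * (mags[p - 1] + mags[p])               # assumes |lambda_p| < |lambda_{p+1}|
T, V, _ = schur(BA, output='complex', sort=lambda lam: abs(lam) < cut)
T11, T12, T22 = T[:p, :p], T[:p, p:], T[p:, p:]
V1, V1p = V[:, :p], V[:, p:]                      # V1p plays the role of V1-perp

# Unique solution of Q T22 - T11 Q = T12 (disjoint spectra)
Q = solve_sylvester(-T11, T22, T12)

# Block diagonalization (2.3) and the matrices appearing in (2.4)
QD = sqrtm(np.eye(n - p) + Q.conj().T @ Q)
V2 = (V1 @ Q + V1p) @ np.linalg.inv(QD)
K, Mblk = T11, QD @ T22 @ np.linalg.inv(QD)
W1 = V1 - V1p @ Q.conj().T
W2 = V1p @ QD

print(np.linalg.norm(A @ V1 - B @ V1 @ K))        # (2.4), ~ machine epsilon
print(np.linalg.norm(A @ V2 - B @ V2 @ Mblk))     # (2.4), ~ machine epsilon

# Decomposition (2.6) of a perturbed iterate X = V1 C + V2 S
X, _ = np.linalg.qr(V1 + 0.01 * (rng.standard_normal((n, p))
                                 + 1j * rng.standard_normal((n, p))))
C, S = W1.conj().T @ X, W2.conj().T @ X
print(np.linalg.norm(X - (V1 @ C + V2 @ S)))      # ~ machine epsilon
```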

Properties of $C^{(i)}$ and the equivalence of several metrics are given in the following proposition.

Proposition 2.1. (Proposition 2.1 in [29]) Suppose $X^{(i)}$ is decomposed as in (2.6). Let $s^{(i)} = \|S^{(i)}(C^{(i)})^{-1}\|$ and $t^{(i)} = s^{(i)}\|C^{(i)}\|$. Then
1) $C^{(i)}$ is nonsingular and thus $t^{(i)}$ is well-defined. The singular values of $C^{(i)}$ satisfy

(2.7)    $0 < 1 - \|S^{(i)}\| \le \sigma_k(C^{(i)}) \le 1 + \|S^{(i)}\|, \quad k = 1, 2, \dots, p,$

and $C^{(i)} = U^{(i)}\Upsilon^{(i)}$, where $U^{(i)}$ is unitary and $\|\Upsilon^{(i)}\| \le \|S^{(i)}\| + 1$.
2a) $\sin\angle(\mathcal{X}^{(i)}, \mathcal{V}_1) \le \|S^{(i)}\| \le s^{(i)} \le \frac{\|S^{(i)}\|}{1 - \|S^{(i)}\|}$;
2b) $\sin\angle(\mathcal{X}^{(i)}, \mathcal{V}_1) \le t^{(i)} \le \frac{(1 + \|S^{(i)}\|)\|S^{(i)}\|}{1 - \|S^{(i)}\|}$;
2c) $\|S^{(i)}\| \le \sqrt{1 + \|Q\|^2}\, \sin\angle(\mathcal{X}^{(i)}, \mathcal{V}_1)$.

The proposition states that as $\mathcal{X}^{(i)} \to \mathcal{V}_1$, $C^{(i)}$ gradually approximates a unitary matrix, and $\sin\angle(\mathcal{X}^{(i)}, \mathcal{V}_1)$, $\|S^{(i)}\|$, $s^{(i)}$ and $t^{(i)}$ are essentially equivalent measures of the error. These quantities are not computable since $V_1$ is not available. However, the computable quantity $\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)})$ in Step 1 of Algorithm 1 is equivalent to $\|S^{(i)}\|$, as the following proposition shows.

Proposition 2.2. Let $X^{(i)}$ be decomposed as in (2.6). Then

(2.8)    $c_1 \|S^{(i)}\| \le \sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)}) \le c_2 \|S^{(i)}\|$,

where $c_1$ and $c_2$ are constants independent of the progress of the subspace iteration.

Proof. We first show that as $\mathcal{X}^{(i)} \to \mathcal{V}_1$, $A\mathcal{X}^{(i)} \to B\mathcal{X}^{(i)}$. In fact, from (2.4) and (2.6) we have

(2.9)    $BX^{(i)} = BV_1 C^{(i)} + BV_2 S^{(i)} = AV_1 K^{-1}C^{(i)} + AV_2 M^{-1}S^{(i)}$
         $\quad = A(X^{(i)} - V_2 S^{(i)})(C^{(i)})^{-1}K^{-1}C^{(i)} + AV_2 M^{-1}S^{(i)}$
         $\quad = AX^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)} - AV_2\bigl(S^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)} - M^{-1}S^{(i)}\bigr)$.

Roughly speaking, $AX^{(i)}$ and $BX^{(i)}$ can be transformed to each other by postmultiplying by $(C^{(i)})^{-1}K^{-1}C^{(i)}$ or its inverse, with a small error proportional to $\|S^{(i)}\|$. Let $D_A^{(i)} = (X^{(i)*}A^{*}AX^{(i)})^{-1/2}$ and $D_B^{(i)} = (X^{(i)*}B^{*}BX^{(i)})^{-1/2} \in \mathbb{C}^{p \times p}$, so that both $AX^{(i)}D_A^{(i)}$ and $BX^{(i)}D_B^{(i)}$ have orthonormal columns. Then by (2.5)

(2.10)   $\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)}) = \min_{Z \in \mathbb{C}^{p \times p}} \|AX^{(i)}D_A^{(i)} - BX^{(i)}D_B^{(i)}Z\|$
         $\quad \le \|AX^{(i)}D_A^{(i)} - BX^{(i)}D_B^{(i)}(D_B^{(i)})^{-1}(C^{(i)})^{-1}KC^{(i)}D_A^{(i)}\|$
         $\quad = \|\bigl(AX^{(i)} - BX^{(i)}(C^{(i)})^{-1}KC^{(i)}\bigr)D_A^{(i)}\|$
         $\quad = \|BV_2\bigl(MS^{(i)} - S^{(i)}(C^{(i)})^{-1}KC^{(i)}\bigr)D_A^{(i)}\|$  (see (2.9) and (2.4))
         $\quad \le \|BV_2\|\,\|D_A^{(i)}\|\,\|\mathbf{S}_i\|\,\|S^{(i)}\|$,

where $\mathbf{S}_i$ is the Sylvester operator $G \mapsto \mathbf{S}_i(G) := MG - G(C^{(i)})^{-1}KC^{(i)}$. Note that as $i$ increases, $\|\mathbf{S}_i\| \to \|\mathbf{S}\|$, where $\mathbf{S} : G \mapsto \mathbf{S}(G) = MG - GK$. In fact,

(2.11)   $\|\mathbf{S}_i\| = \sup_{G \ne 0} \frac{\|MG - G(C^{(i)})^{-1}KC^{(i)}\|}{\|G\|} = \sup_{G \ne 0} \frac{\|\bigl(MG(C^{(i)})^{-1} - G(C^{(i)})^{-1}K\bigr)C^{(i)}\|}{\|G(C^{(i)})^{-1}C^{(i)}\|} \quad (G \in \mathbb{C}^{(n-p) \times p})$

and therefore, with $\tilde{G} = G(C^{(i)})^{-1}$,

(2.12)   $\sup_{\tilde{G} \ne 0} \frac{\|M\tilde{G} - \tilde{G}K\|}{\kappa(C^{(i)})\|\tilde{G}\|} \le \|\mathbf{S}_i\| \le \sup_{\tilde{G} \ne 0} \frac{\kappa(C^{(i)})\|M\tilde{G} - \tilde{G}K\|}{\|\tilde{G}\|} \quad (\tilde{G} \in \mathbb{C}^{(n-p) \times p}),$

or $\|\mathbf{S}\|/\kappa(C^{(i)}) \le \|\mathbf{S}_i\| \le \|\mathbf{S}\|\,\kappa(C^{(i)})$. As $1 \le \kappa(C^{(i)}) \le \frac{1 + \|S^{(i)}\|}{1 - \|S^{(i)}\|}$ (Proposition 2.1) and $\|S^{(i)}\| \to 0$, $\|\mathbf{S}_i\| \to \|\mathbf{S}\|$ follows. Also, note that the extremal singular values of $D_A^{(i)}$ are bounded by those of $(A^{*}A)^{-1/2}$ (equivalently, those of $A^{-1}$), so that in particular $\|D_A^{(i)}\|$ is bounded above by a constant independent of $i$. The upper bound in (2.8) is thus established.

To study the lower bound, we have

(2.13)   $\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)}) = \min_{Z \in \mathbb{C}^{p \times p}} \|AX^{(i)}D_A^{(i)} - BX^{(i)}D_B^{(i)}Z\|$
         $\quad = \min_{Z \in \mathbb{C}^{p \times p}} \|B\bigl(B^{-1}AX^{(i)} - X^{(i)}D_B^{(i)}Z(D_A^{(i)})^{-1}\bigr)D_A^{(i)}\|$
         $\quad \ge \min_{Z \in \mathbb{C}^{p \times p}} \|B^{-1}AX^{(i)} - X^{(i)}Z\|\,\sigma_{\min}(B)\,\sigma_{\min}(D_A^{(i)})$.

Let $\sigma^{(i)} = \sigma_{\min}(B)\sigma_{\min}(D_A^{(i)}) = \sigma_{\min}(B)\|X^{(i)*}A^{*}AX^{(i)}\|^{-1/2}$. Again using the boundedness of the singular values of $D_A^{(i)}$, it follows that $\sigma^{(i)}$ is bounded below by a constant independent of $i$. Since the minimizer $Z$ in the inequality (2.13) is $K^{(i)} = X^{(i)*}B^{-1}AX^{(i)}$ (see Chapter 4, Theorem 2.6 of [34]), we have

(2.14)   $\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)}) \ge \sigma^{(i)}\|B^{-1}AX^{(i)} - X^{(i)}K^{(i)}\|$
         $\quad \ge \sigma^{(i)}\|(V_1^{\perp})^{*}(B^{-1}AX^{(i)} - X^{(i)}K^{(i)})\|$  ($\|V_1^{\perp}\| = 1$)
         $\quad = \sigma^{(i)}\|T_{22}(V_1^{\perp})^{*}X^{(i)} - (V_1^{\perp})^{*}X^{(i)}K^{(i)}\|$  ($(V_1^{\perp})^{*}B^{-1}A = T_{22}(V_1^{\perp})^{*}$; see (2.2))
         $\quad \ge \sigma^{(i)}\,\mathrm{sep}(T_{22}, K^{(i)})\,\|(V_1^{\perp})^{*}X^{(i)}\| = \sigma^{(i)}\,\mathrm{sep}(T_{22}, K^{(i)})\,\sin\angle(\mathcal{V}_1, \mathcal{X}^{(i)})$
         $\quad \ge \sigma^{(i)}\,\mathrm{sep}(T_{22}, K^{(i)})\,(1 + \|Q\|^2)^{-1/2}\|S^{(i)}\|$.  (Proposition 2.1, 2c))

Moreover, as $\mathcal{X}^{(i)} \to \mathcal{V}_1$, $K^{(i)} = X^{(i)*}B^{-1}AX^{(i)} \to K$ up to a unitary transformation, and hence $\mathrm{sep}(T_{22}, K^{(i)}) \to \mathrm{sep}(T_{22}, K)$ (see Chapter 4, Theorem 2.11 in [34]). This concludes the proof.

Remark 2.1. In [29] the authors use $\|AX^{(i)} - X^{(i)}(X^{(i)*}AX^{(i)})\|$ to estimate the error of $X^{(i)}$ for standard eigenvalue problems, and show that this quantity is essentially equivalent to $\|S^{(i)}\|$. The $p \times p$ matrix $X^{(i)*}AX^{(i)}$ is called the (block) Rayleigh quotient of $A$ (see Chapter 4, Definition 2.7 of [34]). For the generalized eigenvalue problem, we have not seen an analogous quantity in the literature. However, Proposition 2.2 shows that $\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)})$ is a convenient error estimate.

3. Convergence Analysis of Inexact Subspace Iteration. We first demonstrate the linear convergence of the outer iteration of Algorithm 1, which follows from Theorem 3.1 in [29]. In fact, replacing $A^{-1}$ by $A^{-1}B$ in the proof therein, we have

(3.1)    $t^{(i+1)} \le \|K\|\,\|M^{-1}\|\;\frac{t^{(i)} + \|W_2^{*}B^{-1}\|\,\|(C^{(i)})^{-1}\|\,\|R^{(i)}\|}{1 - \|W_1^{*}B^{-1}\|\,\|(C^{(i)})^{-1}\|\,\|R^{(i)}\|}$,

where $\|(C^{(i)})^{-1}\| = 1/\sigma_{\min}(C^{(i)}) \approx 1$, and

(3.2)    $\|R^{(i)}\| = \|BX^{(i)} - AY^{(i)}\| \le \delta\|BX^{(i)}\|\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)}) \le \delta c_2\|BX^{(i)}\|\sqrt{1 + \|Q\|^2}\,t^{(i)}$  (see Proposition 2.1).

Thus $\mathcal{X}^{(i)} \to \mathcal{V}_1$ linearly, provided that $\|K\|\|M^{-1}\| < 1$ and $\delta$ is small enough.

In this section, we investigate the convergence of unpreconditioned and preconditioned block-GMRES without tuning for solving $AY^{(i)} = BX^{(i)}$ to the prescribed tolerance, and provide new perspectives on tuning that lead to a new two-phase strategy for solving this block system.

3.1. Unpreconditioned and preconditioned block-GMRES with no tuning. The block linear systems $AY^{(i)} = X^{(i)}$ arising in inexact subspace iteration for standard eigenvalue problems are solved to increasing accuracy as the outer iteration progresses. It is shown in [29] that when unpreconditioned block-GMRES is used for these systems, iteration counts remain roughly constant during the course of this computation, but for preconditioned block-GMRES without tuning, the number of iterations increases with the outer iteration.

The reason is that the right hand side $X^{(i)}$ is an approximate invariant subspace of the system matrix $A$, whereas with preconditioning, there is no reason for $X^{(i)}$ to bear such a relation to the preconditioned system matrix $AP^{-1}$ (assuming right preconditioning is used). For generalized eigenvalue problems, however, both the unpreconditioned and preconditioned block-GMRES iteration counts increase progressively. To see this point, we study the block spectral decomposition of the system matrix and right hand side, and review the block-GMRES convergence theory. We present the analysis for the unpreconditioned solve; the results apply verbatim to the preconditioned solve.

We first review a generic convergence result of block-GMRES given in [29]. Let $G$ be a matrix of order $n$ where the $p$ smallest eigenvalues are separated from its other $n-p$ eigenvalues. As in (2.3), we can block diagonalize $G$ as

(3.3)    $G = [V_{G1}, V_{G2}] \begin{bmatrix} K_G & 0 \\ 0 & M_G \end{bmatrix} [V_{G1}, V_{G2}]^{-1}$,

where $V_{G1} \in \mathbb{C}^{n \times p}$ and $V_{G2} \in \mathbb{C}^{n \times (n-p)}$ have orthonormal columns, $\lambda(K_G)$ are the $p$ smallest eigenvalues of $G$, and $\lambda(M_G)$ are the other $n-p$ eigenvalues of $G$. Recall the definitions of the numerical range $W(M_G) = \left\{\frac{z^{*}M_G z}{z^{*}z} : z \in \mathbb{C}^{n-p},\ z \ne 0\right\}$ and the $\epsilon$-pseudospectrum $\lambda_\epsilon(M_G) = \{\lambda \in \mathbb{C} : \sigma_{\min}(\lambda I - M_G) \le \epsilon\}$. The role of the right hand side in the convergence of block-GMRES is described in the following lemma, which follows immediately from Theorem 3.7 of [29].

Lemma 3.1. Assume the numerical range $W(M_G)$ or the $\epsilon$-pseudospectrum $\lambda_\epsilon(M_G)$ is contained in a convex closed bounded set $E$ in the complex plane with $0 \notin E$. Suppose block-GMRES is used to solve $GY = Z$, where $Z \in \mathbb{C}^{n \times p}$ can be decomposed as $Z = V_{G1}C_G + V_{G2}S_G$ with $V_{G1}$ and $V_{G2}$ given in (3.3). Here $S_G \in \mathbb{C}^{(n-p) \times p}$, and we assume $C_G \in \mathbb{C}^{p \times p}$ is nonsingular. Let $Y_k$ be the approximate solution of $GY = Z$ obtained in the $k$-th block-GMRES iteration with $Y_0 = 0$. If

(3.4)    $k \ge 1 + C_a + C_b \log\frac{\|C_G\|\,\|S_G C_G^{-1}\|}{\|Z\|\,\tau}$,

then $\frac{\|Z - GY_k\|}{\|Z\|} \le \tau$. Here $C_a$ and $C_b$ are constants that depend on the spectrum of $G$.

Remark 3.1. For details about $C_a$ and $C_b$, see [29], [19] and [12]. These details have minimal impact on our subsequent analysis.

Remark 3.2. This generic convergence result can be applied to any specific block linear system with or without preconditioning. For example, to study the behavior of unpreconditioned block-GMRES for solving $AY^{(i)} = BX^{(i)}$, let $G = A$ in (3.3) and Lemma 3.1, and decompose the relevant sets of column vectors in terms of $V_{A1}$ and $V_{A2}$.

In the following, we will assume that for nontrivial $B$ ($B \ne I$), there is no pathological or trivial connection between the decomposition of (3.3) for the case $G = A$ and that of $B^{-1}A$ in (2.3). Specifically, we assume there exist a nonsingular $C_1 \in \mathbb{C}^{p \times p}$ and a full-rank $S_1 \in \mathbb{C}^{(n-p) \times p}$ such that

(3.5)    $V_1 = V_{A1}C_1 + V_{A2}S_1$.

This assumption is generally far from stringent in practice. Similarly, let the decomposition of $V_2$ be $V_2 = V_{A1}C_2 + V_{A2}S_2$ with $C_2 \in \mathbb{C}^{p \times (n-p)}$ and $S_2 \in \mathbb{C}^{(n-p) \times (n-p)}$. Lemma 3.1 and the assumptions above lead to the following theorem, which gives qualitative insight into the behavior of block-GMRES for solving $AY^{(i)} = BX^{(i)}$.

Theorem 3.2. Assume that unpreconditioned block-GMRES is used to solve the linear system $AY^{(i)} = BX^{(i)}$ to the prescribed tolerance in Step 2 of Algorithm 1. If the assumption (3.5) holds, and $\mathcal{X}^{(i)} \to \mathcal{V}_1$ linearly, then the lower bound on the number of block-GMRES iterations from Lemma 3.1 increases as the outer iteration proceeds.

Proof. With (2.4), (2.6), (3.3) and (3.5), we have

(3.6)    $BX^{(i)} = BV_1 C^{(i)} + BV_2 S^{(i)} = AV_1 K^{-1}C^{(i)} + AV_2 M^{-1}S^{(i)}$
         $\quad = A(V_{A1}C_1 + V_{A2}S_1)K^{-1}C^{(i)} + A(V_{A1}C_2 + V_{A2}S_2)M^{-1}S^{(i)}$
         $\quad = (V_{A1}K_A C_1 + V_{A2}M_A S_1)K^{-1}C^{(i)} + (V_{A1}K_A C_2 + V_{A2}M_A S_2)M^{-1}S^{(i)}$
         $\quad = V_{A1}K_A(C_1 K^{-1}C^{(i)} + C_2 M^{-1}S^{(i)}) + V_{A2}M_A(S_1 K^{-1}C^{(i)} + S_2 M^{-1}S^{(i)})$
         $\quad =: V_{A1}C_A^{(i)} + V_{A2}S_A^{(i)}$.

Since $\|S^{(i)}\| \to 0$ and $\sigma_k(C^{(i)}) \to 1$ ($k = 1, 2, \dots, p$; see Proposition 2.1, 1)), we have $\|C_A^{(i)}\| \to \|K_A C_1 K^{-1}\|$ and $\|S_A^{(i)}(C_A^{(i)})^{-1}\| \to \|M_A S_1 C_1^{-1}K_A^{-1}\|$, both of which are nonzero under our assumption (3.5).

From Step 2 of Algorithm 1 and (2.10), the relative tolerance for $AY^{(i)} = BX^{(i)}$ is $\tau = \delta\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)}) \le \delta\|BV_2\|\,\|D_A^{(i)}\|\,\|\mathbf{S}_i\|\,\|S^{(i)}\| \to 0$. Then from (3.4) and (3.6), the lower bound on the number of unpreconditioned block-GMRES iterations needed for solving $AY^{(i)} = BX^{(i)}$ to the prescribed tolerance is

(3.7)    $k^{(i)} \ge 1 + C_a + C_b \log\left(\frac{\|C_A^{(i)}\|\,\|S_A^{(i)}(C_A^{(i)})^{-1}\|}{\delta\|BX^{(i)}\|\,\|BV_2\|\,\|D_A^{(i)}\|\,\|\mathbf{S}_i\|\,\|S^{(i)}\|}\right)$.

Note that $\|BX^{(i)}\| \to \|BV_1\|$, $\|D_A^{(i)}\| \to \|(V_1^{*}A^{*}AV_1)^{-1/2}\|$ and $\|\mathbf{S}_i\| \to \|\mathbf{S}\|$ (see the proof of Proposition 2.2). Therefore all terms in the argument of the logarithm approach some nonzero limit except $\|S^{(i)}\| \to 0$, and hence the expression on the right in (3.7) increases as the outer iteration proceeds.

Remark 3.3. This result only shows that a lower bound on $k^{(i)}$ increases; it does not establish that there will be growth in actual iteration counts. However, numerical experiments described in [29] and Section 5 (see Figures 5.1–5.2) show that this result is indicative of performance. The fact that the bound progressively increases depends on the assumption (3.5) that $V_1$ has "regular" components in $V_{A1}$ and $V_{A2}$. This assumption guarantees that $B\mathcal{X}^{(i)}$ does not approximate the invariant subspace $\mathcal{V}_{A1}$ of $A$. The proof also applies word for word to the preconditioned solve without tuning: one only needs to replace (3.3) by the decomposition of the preconditioned system matrix $AP^{-1}$ and write $BX^{(i)}$ in terms of the invariant subspace of $AP^{-1}$.

3.2. Preconditioned block-GMRES with tuning. To accelerate the iterative solution of the block linear systems arising in inexact subspace iteration, [29] proposes and analyzes a new type of preconditioner with tuning. Tuning constructs a special low-rank update of the existing preconditioner, so that the right hand side of the preconditioned system is an approximate eigenvector or invariant subspace of the preconditioned system matrix with tuning. Specifically, the tuned preconditioner is

(3.8)    $\mathbb{P}^{(i)} = P + (AX^{(i)} - PX^{(i)})X^{(i)*}$,

from which follow $\mathbb{P}^{(i)}X^{(i)} = AX^{(i)}$ and $A(\mathbb{P}^{(i)})^{-1}(AX^{(i)}) = AX^{(i)}$. In other words, $AX^{(i)}$ spans an invariant subspace of $A(\mathbb{P}^{(i)})^{-1}$ with eigenvalue 1. Intuitively, as $\mathcal{X}^{(i)} \to \mathcal{V}_1$, we have $X^{(i)} \approx AX^{(i)}$ for the standard eigenvalue problem, or $BX^{(i)} \approx AX^{(i)}$ for the generalized problem. Therefore, the right hand side of $A(\mathbb{P}^{(i)})^{-1}\tilde{Y}^{(i)} = X^{(i)}$ or $A(\mathbb{P}^{(i)})^{-1}\tilde{Y}^{(i)} = BX^{(i)}$ (with $Y^{(i)} = (\mathbb{P}^{(i)})^{-1}\tilde{Y}^{(i)}$) spans an approximate invariant subspace of $A(\mathbb{P}^{(i)})^{-1}$.
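The two identities following (3.8) are easy to check numerically. Here is a small illustration with random data (the names are ours, not the paper's): the tuned preconditioner is formed explicitly as a rank-$p$ update, and both $\mathbb{P}^{(i)}X^{(i)} = AX^{(i)}$ and $A(\mathbb{P}^{(i)})^{-1}AX^{(i)} = AX^{(i)}$ hold to machine precision.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 4
A = rng.standard_normal((n, n)) + n * np.eye(n)    # system matrix
P = A + 0.1 * rng.standard_normal((n, n))          # some preconditioner P ~ A
X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # iterate X^{(i)}, X*X = I

# Tuned preconditioner (3.8): a rank-p update of P
P_tuned = P + (A @ X - P @ X) @ X.conj().T

print(np.linalg.norm(P_tuned @ X - A @ X))         # P X = A X, ~ machine eps
AX = A @ X                                         # A P^{-1}(AX) = AX:
print(np.linalg.norm(A @ np.linalg.solve(P_tuned, AX) - AX))
```

In practice $\mathbb{P}^{(i)}$ is never formed explicitly; its inverse is applied through the Sherman-Morrison-Woodbury formula (3.9) below.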

The difficulty of block-GMRES without tuning discussed in subsection 3.1 is thus resolved, and the block-GMRES iteration counts with tuning do not increase with the progress of the outer iteration (see Theorem 4.5 of [29]).

The matrix-vector product involving $(\mathbb{P}^{(i)})^{-1}$ is built from that for $P^{-1}$ using the Sherman-Morrison-Woodbury formula as follows:

(3.9)    $(\mathbb{P}^{(i)})^{-1} = \left(I - (P^{-1}AX^{(i)} - X^{(i)})\bigl(X^{(i)*}P^{-1}AX^{(i)}\bigr)^{-1}X^{(i)*}\right)P^{-1}$.

Note that $P^{-1}AX^{(i)} - X^{(i)}$ and $X^{(i)*}P^{-1}AX^{(i)}$ can be computed before the block-GMRES iteration. In each block-GMRES step, $(\mathbb{P}^{(i)})^{-1}$ requires an additional $p^2$ inner products of vectors of length $n$, a dense matrix-matrix division of size $p \times p$, and a multiplication of a dense matrix of size $n \times p$ with a $p \times p$ matrix. This extra cost is relatively small in general, but it is not free.

We now provide a new two-phase algorithm for solving $AY^{(i)} = BX^{(i)}$, which essentially eliminates the overhead of tuning but keeps the block-GMRES iteration counts from progressively increasing. The strategy provides some new perspectives on the use of tuning. In addition, we will discuss some connections between this algorithm and the methods in [30, 18] and [35].

Algorithm 2: Two-phase strategy for solving $AY^{(i)} = BX^{(i)}$
1. Apply a single step of preconditioned block-GMRES with tuning to get an approximate solution $Y_1^{(i)}$.
2. Solve the correction equation $A\,dY^{(i)} = BX^{(i)} - AY_1^{(i)}$ with a properly preconditioned iterative solver to get an approximate solution $dY_k^{(i)}$, so that $Y_{k+1}^{(i)} = Y_1^{(i)} + dY_k^{(i)}$ satisfies $\frac{\|BX^{(i)} - AY_{k+1}^{(i)}\|}{\|BX^{(i)}\|} \le \delta\sin\angle(A\mathcal{X}^{(i)}, B\mathcal{X}^{(i)})$.

Note in particular that tuning need not be used to solve the correction equation, and thus we can work with a fixed preconditioned system matrix for the correction equation in all outer iterations.

Obviously, Phase II can be equivalently stated as follows: solve $AY^{(i)} = BX^{(i)}$ with a properly preconditioned iterative solver and starting vector $Y_1^{(i)}$ from Phase I. The phrasing in Algorithm 2 is intended to illuminate the connection between this strategy and the methods in [30, 18] and [35].
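To fix ideas, here is a single-vector sketch of the two-phase strategy (block size $p = 1$, so block-GMRES reduces to ordinary GMRES), with the tuned preconditioner applied through (3.9). This is an illustrative reading of Algorithm 2, not the authors' code: SciPy's gmres preconditions from the left, whereas the discussion above assumes right preconditioning, and the names and tolerances are ours.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def two_phase_solve(A, P_solve, x, b, tol_outer):
    """Sketch of Algorithm 2 with block size 1: solve A y = b.

    A         : n-by-n array, the (shifted) system matrix
    P_solve   : callable applying P^{-1} (the untuned preconditioner)
    x         : current outer iterate, a unit vector
    b         : right hand side, b = B x
    tol_outer : required relative tolerance delta * sin(angle(Ax, Bx))
    """
    n = A.shape[0]
    # Quantities in (3.9), computed once per outer iteration (p = 1 here)
    u = P_solve(A @ x) - x                   # P^{-1} A x - x
    g = x.conj() @ (u + x)                   # x^* P^{-1} A x, a scalar

    def tuned_solve(v):                      # apply (P^{(i)})^{-1} via (3.9)
        w = P_solve(v)
        return w - u * (x.conj() @ w) / g

    # Phase I: one step of GMRES with the tuned preconditioner, zero guess
    y1, _ = gmres(A, b, M=LinearOperator((n, n), matvec=tuned_solve),
                  restart=1, maxiter=1, rtol=0.0, atol=0.0)

    # Phase II: correction equation A dy = b - A y1 with the fixed untuned
    # preconditioner; the tolerance is chosen so that the combined iterate
    # satisfies ||b - A (y1 + dy)|| <= tol_outer * ||b||
    r = b - A @ y1
    dy, _ = gmres(A, r, M=LinearOperator((n, n), matvec=P_solve),
                  rtol=tol_outer * np.linalg.norm(b) / np.linalg.norm(r),
                  atol=0.0)
    return y1 + dy
```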

The analysis of Algorithm 2 is given in the following main theorem. For this, we make some assumptions concerning the right hand side of the correction equation, analogous to the assumption made for (3.5): with

(3.10)   $BX^{(i)} - AY_1^{(i)} = V_{A1}C_{ceq}^{(i)} + V_{A2}S_{ceq}^{(i)}$,

where $V_{A1}$ and $V_{A2}$ have orthonormal columns, we assume that $\|S_{ceq}^{(i)}(C_{ceq}^{(i)})^{-1}\| = O(1)$ and that $\|C_{ceq}^{(i)}\|$ is proportional to $\|BX^{(i)} - AY_1^{(i)}\|$. This implies that the term $\frac{\|C_{ceq}^{(i)}\|\,\|S_{ceq}^{(i)}(C_{ceq}^{(i)})^{-1}\|}{\|BX^{(i)} - AY_1^{(i)}\|}$, which appears in (3.4), does not depend on $i$. We have no proof of these assumptions, but they are consistent with all our numerical experience. In the subsequent derivation, let $e_j \in \mathbb{R}^p$ be the standard unit basis vector with 1 in entry $j$ and zeros in the other entries.

Theorem 3.3. Suppose the two-phase strategy (Algorithm 2) is used to solve $AY^{(i)} = BX^{(i)}$. Then $Y_1^{(i)} = X^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)}F + \Delta^{(i)}$, where $F = [f_1, \dots, f_p]$ with $f_j = \arg\min_{f \in \mathbb{C}^p}\|BX^{(i)}e_j - A(\mathbb{P}^{(i)})^{-1}BX^{(i)}f\|$, and $\|\Delta^{(i)}\| = O(\|S^{(i)}\|)$. In addition, the residual norm $\|BX^{(i)} - AY_1^{(i)}\| = O(\|S^{(i)}\|)$. Thus, if block-GMRES is used to solve the correction equation, the inner iteration counts will not increase with the progress of the outer iteration.

Proof. The approximate solution to $A(\mathbb{P}^{(i)})^{-1}\tilde{Y} = BX^{(i)}$ obtained in the $k$-th block-GMRES iteration with a zero starting vector satisfies

(3.11)   $\tilde{Y}_k^{(i)} \in \mathrm{span}\{BX^{(i)},\ A(\mathbb{P}^{(i)})^{-1}BX^{(i)},\ \dots,\ (A(\mathbb{P}^{(i)})^{-1})^{k-1}BX^{(i)}\}$.

It follows from (2.9) and $(\mathbb{P}^{(i)})^{-1}AX^{(i)} = X^{(i)}$ that

(3.12)   $Y_1^{(i)} = (\mathbb{P}^{(i)})^{-1}\tilde{Y}_1^{(i)} = (\mathbb{P}^{(i)})^{-1}BX^{(i)}F$
         $\quad = (\mathbb{P}^{(i)})^{-1}\bigl(AX^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)} - AV_2(S^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)} - M^{-1}S^{(i)})\bigr)F$
         $\quad = X^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)}F + (\mathbb{P}^{(i)})^{-1}AV_2\bigl(M^{-1}S^{(i)} - S^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)}\bigr)F$
         $\quad = X^{(i)}(C^{(i)})^{-1}K^{-1}C^{(i)}F + \Delta^{(i)}$,

where the $j$th column of $F \in \mathbb{C}^{p \times p}$ minimizes $\|BX^{(i)}e_j - A(\mathbb{P}^{(i)})^{-1}BX^{(i)}f\|$.

