Least Squares Approximation/Optimization


Computational Methods
Least Squares Approximation/Optimization
Manfred Huber, 2011

Least Squares

- Least squares methods are aimed at finding approximate solutions when no precise solution exists
  - Find the solution that minimizes the residual error in the system
- Least squares can be used to fit a model to noisy data points or to fit a simpler model to complex data
  - Amounts to projecting higher-dimensional data onto a lower-dimensional space

Linear Least Squares

- Linear least squares attempts to find a least squares solution for an overdetermined linear system (i.e. a linear system described by an m x n matrix A with more equations than parameters)

    Ax \approx b

- Least squares minimizes the squared Euclidean norm of the residual

    \min_x \|r\|_2^2 = \min_x \|b - Ax\|_2^2

- For data fitting on m data points using a linear combination of n basis functions \phi_j this corresponds to

    \min_\alpha \sum_{i=1}^m \Big( y_i - \sum_{j=1}^n \alpha_j \phi_j(x_i) \Big)^2
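A minimal numpy sketch of this idea (the matrix and right-hand side below are illustrative, not from the slides): the solution returned by np.linalg.lstsq gives a smaller residual norm than other candidate parameter vectors.

```python
import numpy as np

# Illustrative overdetermined system: 3 equations, 2 unknowns
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([2.0, 3.0, 5.0])

# Least squares solution minimizes ||b - A x||_2
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Any other parameter vector gives a residual at least as large
for x in (x_ls, np.array([0.5, 1.0]), np.array([1.0, 1.5])):
    print(x, np.linalg.norm(b - A @ x))
```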

Existence and Uniqueness

- The linear least squares problem always has a solution
- The solution is unique if and only if A has full rank, i.e. rank(A) = n
  - If rank(A) < n then A is rank-deficient and the solution of the least squares problem is not unique
- If the solution is unique, the squared norm of the residual can be expressed through A

    \|r\|_2^2 = r^T r = (b - Ax)^T (b - Ax) = b^T b - 2 x^T A^T b + x^T A^T A x

- This is minimized if its derivative is 0

    -2 A^T b + 2 A^T A x = 0

Normal Equations

- The optimization reduces to an n x n system of (linear) normal equations

    A^T A x = A^T b

  - The linear least squares solution can be found by solving this system of linear equations
- The solution can also be found through the pseudoinverse

    Ax \approx b \;\Rightarrow\; x = A^+ b, \qquad A^+ = (A^T A)^{-1} A^T

- The condition number with respect to A can be expressed as in the case of solving linear equations

    cond(A) = \|A\|_2 \|A^+\|_2
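A hedged numpy sketch (the data values are made up for illustration) comparing the normal equations and the pseudoinverse form:

```python
import numpy as np

# Illustrative overdetermined system: 5 equations, 2 unknowns
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

# Normal equations: solve (A^T A) x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Pseudoinverse form: x = A^+ b
x_pinv = np.linalg.pinv(A) @ b

print(x_normal, x_pinv)   # the two solutions agree up to rounding
```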

Data Fitting

- A common use of the linear least squares solution is to fit a given type of function to noisy data points
- nth order polynomial fit using the monomial basis:

    A \alpha \approx y, \qquad
    A = \begin{pmatrix} 1 & x_1 & \cdots & x_1^n \\ 1 & x_2 & \cdots & x_2^n \\ \vdots & \vdots & & \vdots \\ 1 & x_m & \cdots & x_m^n \end{pmatrix}, \quad
    y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}

- Solving the system of normal equations provides the best fit in terms of the Euclidean norm

    A^T A \alpha = A^T y
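A short numpy sketch of a monomial-basis fit (degree and sample values chosen only for illustration); np.vander builds the design matrix above and np.linalg.lstsq solves the least squares problem:

```python
import numpy as np

# Illustrative noisy samples
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1, 0.9, 4.2, 8.8, 16.1, 24.9])

# Design matrix with columns 1, x, x^2 (degree-2 monomial basis)
A = np.vander(x, N=3, increasing=True)

# Least squares coefficients alpha of a0 + a1*x + a2*x^2
alpha, *_ = np.linalg.lstsq(A, y, rcond=None)
print(alpha)
```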

Condition Number and Sensitivity

- Sensitivity also depends on b
- The influence of b can be expressed in terms of the angle between b and y = Ax

    \cos(\theta) = \frac{\|y\|_2}{\|b\|_2} = \frac{\|Ax\|_2}{\|b\|_2}

- The bound on the error in the solution due to a perturbation in b can be expressed as

    \frac{\|\Delta x\|_2}{\|x\|_2} \le cond(A) \, \frac{1}{\cos(\theta)} \, \frac{\|\Delta b\|_2}{\|b\|_2}

Condition Number and Sensitivity

- The bound on the error in the solution due to a perturbation in A can be expressed as

    \frac{\|\Delta x\|_2}{\|x\|_2} \le \big( (cond(A))^2 \tan(\theta) + cond(A) \big) \, \frac{\|\Delta A\|_2}{\|A\|_2}

- For small residuals the condition number for least squares is approximately cond(A)
- For large residuals the condition number can be the square of cond(A) or worse

Condition Number and Sensitivity

- The conditioning of the normal equations solution is

    cond(A^T A) = (cond(A))^2

- For large systems of equations the condition number of the formulation using the normal equations (or the pseudoinverse) increases rapidly
- Much of the increased sensitivity is due to the need to form A^T A in order to be able to apply a solution algorithm for the system of equations
  - Conditioning of the normal equations is potentially significantly worse than the conditioning of the original system
  - The algorithm is not very stable for large numbers of equations/data points
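This squaring of the condition number is easy to observe numerically; a small sketch (the monomial design matrix is just one convenient, mildly ill-conditioned example):

```python
import numpy as np

# Monomial-basis design matrix on [0, 1]: mildly ill-conditioned
x = np.linspace(0.0, 1.0, 50)
A = np.vander(x, N=6, increasing=True)   # columns 1, x, ..., x^5

kA = np.linalg.cond(A)
kAtA = np.linalg.cond(A.T @ A)
print(kA, kAtA, kA**2)   # cond(A^T A) is roughly cond(A)^2
```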

Augmented System Method

- The augmented system method transforms the least squares problem into an equation-solving problem by adding equations, and can be used to improve the conditioning
- Increase the matrix size to a square (m+n) x (m+n) matrix by including the residual equations

    r + Ax = b, \qquad A^T r = 0

    \begin{pmatrix} I & A \\ A^T & 0 \end{pmatrix} \begin{pmatrix} r \\ x \end{pmatrix} = \begin{pmatrix} b \\ 0 \end{pmatrix}

- Greater freedom to choose pivots and maintain stability
- Substantially higher complexity for systems with m >> n due to the need to solve an (m+n) x (m+n) system
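A sketch of the augmented formulation in numpy (matrix values are illustrative); solving the square system recovers both the least squares solution x and the residual r, with A^T r = 0 up to rounding:

```python
import numpy as np

# Illustrative overdetermined system
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.1, 2.9, 4.2])
m, n = A.shape

# Augmented (m+n) x (m+n) system:  [ I   A ] [r]   [b]
#                                  [ A^T 0 ] [x] = [0]
K = np.block([[np.eye(m), A],
              [A.T, np.zeros((n, n))]])
rhs = np.concatenate([b, np.zeros(n)])

sol = np.linalg.solve(K, rhs)
r, x = sol[:m], sol[m:]
print(x)          # least squares solution
print(A.T @ r)    # ~0: the residual is orthogonal to the columns of A
```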

QR Factorization

- The augmented system method addresses stability but adds very high cost since it expands the matrix
- QR factorization changes the system matrix into solvable form without computation of A^T A and without expanding the matrix

    A = Q \begin{pmatrix} R \\ 0 \end{pmatrix}, \qquad
    R = \begin{pmatrix} R_{1,1} & R_{1,2} & \cdots & R_{1,n} \\ 0 & R_{2,2} & \cdots & R_{2,n} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & R_{n,n} \end{pmatrix}

    Q^T A x \approx Q^T b

QR Factorization

- The important part of Q for the solution consists of the first n rows of Q^T (i.e. the first n columns Q_1 of Q), since the others are multiplied by the zero block

    Q = (Q_1 \; Q_2) \;\Rightarrow\; Q_1^T A x = R x \approx Q_1^T b

- QR factorization factors the system matrix A into Q and R where R is upper triangular
  - As in LU factorization, Q^T represents a sequence of solution-preserving transformations
    - In LU factorization only the identity has to be preserved, which can be done using elimination matrices
    - In QR factorization the least squares characteristic has to be preserved, which requires the use of orthogonal transformations
  - R is an upper triangular matrix that can be used for backward substitution to compute the solution
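A numpy sketch of this solution path (illustrative data): the reduced QR factorization gives Q_1 and R directly, and R x = Q_1^T b is then solved:

```python
import numpy as np

# Illustrative overdetermined system
A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
b = np.array([1.0, 0.0, -1.0])

# Reduced QR: Q1 is m x n with orthonormal columns, R is n x n upper triangular
Q1, R = np.linalg.qr(A)

# Solve R x = Q1^T b (R is triangular, so back substitution would suffice;
# np.linalg.solve is used here for brevity)
x = np.linalg.solve(R, Q1.T @ b)
print(x)
print(np.linalg.lstsq(A, b, rcond=None)[0])   # same answer from the library solver
```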

Orthogonal Transformations

- Orthogonal transformations preserve the least squares solution

    Q Q^T = I
    \|Qv\|_2^2 = (Qv)^T Qv = v^T Q^T Q v = v^T v = \|v\|_2^2

- For QR factorization we need to find a set of orthogonal transformations that can transform A into an upper triangular matrix R and that are numerically stable
  - Householder transformations
  - Givens rotations
  - Gram-Schmidt orthogonalization

QR Factorization with Householder Transformations

- Householder transformations allow zeroing all entries below a chosen point in a vector a
  - Applying this consecutively for every column of the matrix A, choosing the diagonal element as the one below which everything is to be zeroed out, we can construct R
- The Householder transformation can be computed from a given vector (the vector of column values), a, setting all the values that are not to be changed (entries above the diagonal) to 0

    H = I - 2 \frac{v v^T}{v^T v}, \qquad v = a - \beta e_i, \quad \beta = \pm\|a\|_2

  - The sign for \beta can be chosen to avoid cancellation (loss of significance) in the computation of v
- H is orthogonal and symmetric:

    H = H^T = H^{-1}
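A small sketch of this construction (the helper name and the test vector are mine, not from the slides); it builds v with the cancellation-avoiding sign and checks that the resulting H zeroes everything below the chosen entry:

```python
import numpy as np

def householder_vector(a, i=0):
    """Householder vector v (as on the slide: v = a - beta*e_i, beta = +/-||a||_2)
    that zeroes all entries of a below index i; entries above i are left untouched."""
    v = a.astype(float).copy()
    v[:i] = 0.0                          # rows above the pivot are not to be changed
    norm = np.linalg.norm(v)
    beta = -norm if v[i] > 0 else norm   # sign opposite to a[i] avoids cancellation
    v[i] -= beta
    return v, beta

a = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
v, beta = householder_vector(a)
H = np.eye(len(a)) - 2.0 * np.outer(v, v) / (v @ v)
print(H @ a)   # ~[-2.236, 0, 0, 0, 0]: beta in the pivot position, zeros below
```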

QR Factorization with Householder Transformations

- In QR factorization with Householder transforms, successive columns are adjusted to upper triangular form by zeroing all entries below the diagonal element
  - The Householder transform does not affect the entries in the columns to the left and the rows above the currently chosen diagonal element, since those columns already contain only 0s in the rows at and below the chosen diagonal term
  - Applying the Householder transform only to the columns to the right saves significant processing time

    H a_j = \Big( I - 2 \frac{v v^T}{v^T v} \Big) a_j = a_j - 2 \frac{v^T a_j}{v^T v} v

  - Applying it to individual columns also eliminates the need to compute the full matrix H - only v = a_j - \beta e_j is needed

QR Factorization Example

- Quadratic polynomial fit to 5 data points

    Data: (-1, 1), (-0.5, 0.5), (0, 0), (0.5, 0.5), (1, 2)

- Design the linear system for data fitting

    \begin{pmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \\ 1 & x_4 & x_4^2 \\ 1 & x_5 & x_5^2 \end{pmatrix} \alpha \approx \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}, \qquad
    \begin{pmatrix} 1 & -1 & 1 \\ 1 & -0.5 & 0.25 \\ 1 & 0 & 0 \\ 1 & 0.5 & 0.25 \\ 1 & 1 & 1 \end{pmatrix} \alpha \approx \begin{pmatrix} 1 \\ 0.5 \\ 0 \\ 0.5 \\ 2 \end{pmatrix}

- Starting with the first column, eliminate the entries below the diagonal using appropriate Householder transforms, choosing the sign on \beta to avoid cancellation

QR Factorization Example

- Householder elimination by column
- 1st column:

    v_1 = a_1 - \beta e_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} + \sqrt{5} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 3.236 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}

  - Negative sign on \beta because the potentially cancelling term is 1

    H_1 A = \begin{pmatrix} -2.236 & 0 & -1.118 \\ 0 & -0.191 & -0.405 \\ 0 & 0.309 & -0.655 \\ 0 & 0.809 & -0.405 \\ 0 & 1.309 & 0.345 \end{pmatrix}, \qquad
    H_1 y = \begin{pmatrix} -1.789 \\ -0.362 \\ -0.862 \\ -0.362 \\ 1.138 \end{pmatrix}

QR Factorization Example

- Householder elimination by column
- 2nd column:

    v_2 = a_2 - \beta e_2 = \begin{pmatrix} 0 \\ -0.191 \\ 0.309 \\ 0.809 \\ 1.309 \end{pmatrix} - \sqrt{2.5} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -1.772 \\ 0.309 \\ 0.809 \\ 1.309 \end{pmatrix}

  - Positive sign on \beta because the possibly cancelling term is -0.191

    H_2 H_1 A = \begin{pmatrix} -2.236 & 0 & -1.118 \\ 0 & 1.581 & 0 \\ 0 & 0 & -0.725 \\ 0 & 0 & -0.589 \\ 0 & 0 & 0.047 \end{pmatrix}, \qquad
    H_2 H_1 y = \begin{pmatrix} -1.789 \\ 0.632 \\ -1.035 \\ -0.816 \\ 0.404 \end{pmatrix}

QR Factorization Example

- Householder elimination by column
- 3rd column:

    v_3 = a_3 - \beta e_3 = \begin{pmatrix} 0 \\ 0 \\ -0.725 \\ -0.589 \\ 0.047 \end{pmatrix} - \sqrt{0.875} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -1.66 \\ -0.589 \\ 0.047 \end{pmatrix}

  - Positive sign on \beta because the possibly cancelling term is -0.725

    H_3 H_2 H_1 A = \begin{pmatrix} -2.236 & 0 & -1.118 \\ 0 & 1.581 & 0 \\ 0 & 0 & 0.935 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
    H_3 H_2 H_1 y = \begin{pmatrix} -1.789 \\ 0.632 \\ 1.336 \\ 0.026 \\ 0.337 \end{pmatrix}

QR Factorization Example

- Backward substitution with the upper triangular matrix yields the parameters and the least squares fit second-order polynomial

    \alpha = \begin{pmatrix} 0.086 \\ 0.4 \\ 1.429 \end{pmatrix}, \qquad p(x) = 0.086 + 0.4 x + 1.429 x^2

- Resulting curve and original data points are shown in the graph

    [Figure: fitted quadratic p(x) with the five original data points]
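As a cross-check of the worked example, a short numpy sketch (not part of the slides) reproduces the triangle and the coefficients; numpy's Householder-based QR may flip the signs of some rows of R relative to the slides.

```python
import numpy as np

# Data from the worked example
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
y = np.array([1.0, 0.5, 0.0, 0.5, 2.0])

A = np.vander(x, N=3, increasing=True)   # columns 1, x, x^2

Q1, R = np.linalg.qr(A)                  # reduced QR factorization
alpha = np.linalg.solve(R, Q1.T @ y)     # back substitution step

print(R)        # upper triangular factor (rows may differ in sign from the slides)
print(alpha)    # ~[0.086, 0.4, 1.429]
```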

Other Orthogonal Transformations

- Other transformations can be used for QR factorization
  - Givens rotations
  - Gram-Schmidt orthogonalization
- Householder transformations generally achieve the best performance and stability tradeoff
  - Complexity of QR factorization with Householder transformations is approximately mn^2 - n^3/3 multiplications
    - Depending on the size of m (data points / equations) this is between the same and two times the work of the normal equations
  - Conditioning of QR factorization with Householder transformations is optimal

    cond(A) + \|r\|_2 (cond(A))^2

QR Factorization

- Transformations for QR factorization are numerically more stable than the elimination steps in LU factorization
  - The choice of sign in the Householder transformations makes it possible to avoid cancellation and thus instabilities in individual transforms
  - Row pivoting is usually not necessary
- This stability leads to QR factorization also frequently being used instead of LU factorization to solve nonsingular systems of linear equations
  - The increased complexity is traded off against the stability (and thus precision) of the solution

Nonlinear Least Squares

- As for equation solving, finding solutions for general nonlinear systems requires iterative solutions
- The goal is to find the approximate solution with the smallest squared residual, \rho, for the system of functions f(x)

    r_i(x) = b_i - f_i(x), \qquad \rho(x) = \tfrac{1}{2} r^T(x) r(x)

- At the minimum, the gradient of the squared residual function would be 0

    \nabla \rho(x) = J_r^T(x) r(x) = 0

  - Could use the multivariate Newton method to find the root of the derivative, which would require the second derivative, the Hessian of the squared error

    H_\rho(x) = J_r^T(x) J_r(x) + \sum_{i=1}^m r_i(x) H_{r_i}(x)

Gauss-Newton Method

- The multivariate Newton method would require solving the following linear system in each iteration

    \Big( J_r^T(x) J_r(x) + \sum_{i=1}^m r_i(x) H_{r_i}(x) \Big) s = -J_r^T(x) r(x)

  - Requires frequent computation of the Hessian, which is expensive and reduces stability
- Gauss-Newton avoids this by dropping the second order term

    J_r^T(x) J_r(x) \, s = -J_r^T(x) r(x)

  - This is best solved by converting it into the corresponding linear least squares problem and using QR factorization

    J_r(x) \, s \approx -r(x)

- Once solved, Gauss-Newton operates like the multivariate Newton method

    x^{t+1} = x^t + s
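A minimal sketch of the iteration just described (the function names and the fixed iteration count are my choices, not from the slides); each step solves the linear least squares subproblem with a library solver and updates x:

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=10):
    """Gauss-Newton sketch: repeatedly solve J_r(x) s ~= -r(x) in the
    least squares sense and update x <- x + s.  No convergence test or
    damping is included; see the Levenberg-Marquardt sketch further below."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        s, *_ = np.linalg.lstsq(J, -r, rcond=None)
        x = x + s
    return x
```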

Gauss-Newton Method

- The Gauss-Newton method replaces the nonlinear least squares problem with a sequence of linear least squares problems that converge to the solution of the nonlinear system
- Converges if the residual at the solution is not too large
  - A large residual at the solution can lead to large values in the Hessian, potentially leading to slow convergence or, in extreme cases, non-convergence
- If it does not converge (large residual at the solution), other methods have to be used
  - Levenberg-Marquardt method, which uses an additional scalar parameter \mu (and a separate, function-specific strategy to choose it) to modify the step size

    (J_r^T(x) J_r(x) + \mu I) \, s = -J_r^T(x) r(x)

    \begin{pmatrix} J_r(x) \\ \sqrt{\mu} I \end{pmatrix} s \approx \begin{pmatrix} -r(x) \\ 0 \end{pmatrix}

  - General optimization using the complete Hessian
    - Significant increase in computational complexity
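A hedged sketch of one Levenberg-Marquardt style step (the helper name is mine; the strategy for choosing and updating \mu is intentionally left out, as the slide notes it is function-specific):

```python
import numpy as np

def lm_step(J, r, mu):
    """One damped step: solve (J^T J + mu*I) s = -J^T r through the
    equivalent augmented least squares problem [J; sqrt(mu) I] s ~= [-r; 0]."""
    n = J.shape[1]
    J_aug = np.vstack([J, np.sqrt(mu) * np.eye(n)])
    rhs_aug = np.concatenate([-r, np.zeros(n)])
    s, *_ = np.linalg.lstsq(J_aug, rhs_aug, rcond=None)
    return s
```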

Gauss-Newton Example

- Fit an exponential function to data

    Data: (0, 2), (1, 0.7), (2, 0.3), (3, 0.1)
    f(\alpha, x) = \alpha_1 e^{\alpha_2 x}

- The residual function for data fitting is given by

    r(\alpha) = \begin{pmatrix} 2 - f(\alpha, 0) \\ 0.7 - f(\alpha, 1) \\ 0.3 - f(\alpha, 2) \\ 0.1 - f(\alpha, 3) \end{pmatrix}

- The resulting Jacobian is

    J_r(\alpha) = \begin{pmatrix} -1 & 0 \\ -e^{\alpha_2} & -\alpha_1 e^{\alpha_2} \\ -e^{2\alpha_2} & -2\alpha_1 e^{2\alpha_2} \\ -e^{3\alpha_2} & -3\alpha_1 e^{3\alpha_2} \end{pmatrix}
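The same model written as a code sketch (only the data and formulas above are taken from the slides); a few Gauss-Newton iterations from \alpha^{(0)} = (1, 0)^T reproduce the iterates worked out on the following slides:

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([2.0, 0.7, 0.3, 0.1])

def residual(a):
    # r_i(alpha) = y_i - alpha1 * exp(alpha2 * x_i)
    return ys - a[0] * np.exp(a[1] * xs)

def jacobian(a):
    e = np.exp(a[1] * xs)
    return np.column_stack([-e, -a[0] * xs * e])

a = np.array([1.0, 0.0])
for _ in range(5):
    s, *_ = np.linalg.lstsq(jacobian(a), -residual(a), rcond=None)
    a = a + s
    print(a, residual(a) @ residual(a))   # iterates approach roughly (1.99, -1.01)
```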

Gauss-Newton Example

- First iteration, starting at \alpha^{(0)} = (1, 0)^T
- Initial squared residual:

    \|r\|_2^2 = \sum_{i=1}^4 (y_i - 1 \cdot e^{0 \cdot x_i})^2 = 2.39

- Linear least squares problem J_r(\alpha^{(0)}) s \approx -r(\alpha^{(0)}):

    \begin{pmatrix} -1 & 0 \\ -1 & -1 \\ -1 & -2 \\ -1 & -3 \end{pmatrix} s \approx \begin{pmatrix} -1 \\ 0.3 \\ 0.7 \\ 0.9 \end{pmatrix}

- Solve for the step

    s = \begin{pmatrix} 0.69 \\ -0.61 \end{pmatrix}

- Update the parameter vector for the next iteration

    \alpha^{(1)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0.69 \\ -0.61 \end{pmatrix} = \begin{pmatrix} 1.69 \\ -0.61 \end{pmatrix}

Gauss-Newton Example

- Second iteration
- Squared residual:

    \|r\|_2^2 = \sum_{i=1}^4 (y_i - 1.69 e^{-0.61 x_i})^2 = 0.212

- Linear least squares problem J_r(\alpha^{(1)}) s \approx -r(\alpha^{(1)}):

    \begin{pmatrix} -1 & 0 \\ -0.543 & -0.918 \\ -0.295 & -0.998 \\ -0.16 & -0.813 \end{pmatrix} s \approx \begin{pmatrix} -0.31 \\ 0.218 \\ 0.199 \\ 0.171 \end{pmatrix}

- Solve for the next step

    s = \begin{pmatrix} 0.285 \\ -0.32 \end{pmatrix}

- Update the parameter vector for the next iteration

    \alpha^{(2)} = \begin{pmatrix} 1.69 \\ -0.61 \end{pmatrix} + \begin{pmatrix} 0.285 \\ -0.32 \end{pmatrix} = \begin{pmatrix} 1.975 \\ -0.93 \end{pmatrix}

Gauss-Newton Example

- Third iteration
- Squared residual:

    \|r\|_2^2 = \sum_{i=1}^4 (y_i - 1.975 e^{-0.93 x_i})^2 = 0.007

- Linear least squares problem J_r(\alpha^{(2)}) s \approx -r(\alpha^{(2)}):

    \begin{pmatrix} -1 & 0 \\ -0.395 & -0.779 \\ -0.156 & -0.615 \\ -0.061 & -0.364 \end{pmatrix} s \approx \begin{pmatrix} -0.025 \\ 0.079 \\ 0.007 \\ 0.02 \end{pmatrix}

- Solve for the next step

    s = \begin{pmatrix} 0.019 \\ -0.074 \end{pmatrix}

- Update the parameter vector for the next iteration

    \alpha^{(3)} = \begin{pmatrix} 1.975 \\ -0.93 \end{pmatrix} + \begin{pmatrix} 0.019 \\ -0.074 \end{pmatrix} = \begin{pmatrix} 1.994 \\ -1.004 \end{pmatrix}

Least Squares Approximation

- Least squares approximation is used to determine approximate solutions for a system of equations or to fit an approximate function to a set of data points
  - As opposed to interpolation, data points are not met precisely
    - Applicable to noisy data points
    - Allows a less complex function to be fitted to the data points
- Least squares for linear functions has direct solution methods
  - The normal equations method is not very stable for large numbers of equations
  - QR factorization provides a more stable alternative that can also be used for equation solving instead of LU factorization
- Least squares for nonlinear systems requires iterative solution
  - Gauss-Newton provides a robust solution for problems with small residuals
  - Otherwise general, more expensive optimization methods have to be used
