Linear Programming: Interior-Point Methods


Chapter 14

Linear Programming: Interior-Point Methods

In the 1980s it was discovered that many large linear programs could be solved efficiently by formulating them as nonlinear problems and solving them with various modifications of nonlinear algorithms such as Newton's method. One characteristic of these methods was that they required all iterates to satisfy the inequality constraints in the problem strictly, so they became known as interior-point methods. By the early 1990s, one class—primal-dual methods—had distinguished itself as the most efficient practical approach, and proved to be a strong competitor to the simplex method on large problems. These methods are the focus of this chapter.

Interior-point methods arose from the search for algorithms with better theoretical properties than the simplex method. As we mentioned in Chapter ?, the simplex method can be inefficient on certain pathological problems. Roughly speaking, the time required to solve a linear program may be exponential in the size of the problem, as measured by the number of unknowns and the amount of storage needed for the problem data. For almost all practical problems, the simplex method is much more efficient than this bound would suggest, but its poor worst-case complexity motivated the development of new algorithms with better guaranteed performance. The first such method was the ellipsoid method, proposed by Khachiyan [11], which finds a solution in time that is at worst polynomial in the problem size. Unfortunately, this method approaches its worst-case bound on all problems and is not competitive with the simplex method in practice.

Karmarkar's projective algorithm [10], announced in 1984, also has the polynomial complexity property, but it came with the added inducement of good practical behavior. The initial claims of excellent performance on large linear programs were never fully borne out, but the announcement prompted a great deal of research activity and a wide array of methods described by such labels as "affine-scaling," "logarithmic barrier," "potential-reduction," "path-following," "primal-dual," and "infeasible interior-point." All are related to Karmarkar's original algorithm, and to the log-barrier approach described in Chapter ?, but many of the approaches can be motivated and analyzed independently of the earlier methods.

Interior-point methods share common features that distinguish them from the simplex method. Each interior-point iteration is expensive to compute and can make significant progress towards the solution, while the simplex method usually requires a larger number of inexpensive iterations. Geometrically speaking, the simplex method works its way around the boundary of the feasible polytope, testing a sequence of vertices in turn until it finds the optimal one. Interior-point methods approach the boundary of the feasible set only in the limit. They may approach the solution either from the interior or the exterior of the feasible region, but they never actually lie on the boundary of this region.

In this chapter, we outline some of the basic ideas behind primal-dual interior-point methods, including the relationship to Newton's method and homotopy methods and the concept of the central path. We sketch the important methods in this class, and give a comprehensive convergence analysis of a long-step path-following method. We describe in some detail a practical predictor-corrector algorithm proposed by Mehrotra, which is the basis of much of the current generation of software.

14.1 Primal-Dual Methods

Outline

We consider the linear programming problem in standard form; that is,

\[
\min\; c^T x, \quad \text{subject to } Ax = b, \; x \ge 0, \tag{14.1}
\]

where $c$ and $x$ are vectors in $\mathbb{R}^n$, $b$ is a vector in $\mathbb{R}^m$, and $A$ is an $m \times n$ matrix. The dual problem for (14.1) is

\[
\max\; b^T \lambda, \quad \text{subject to } A^T \lambda + s = c, \; s \ge 0, \tag{14.2}
\]

where $\lambda$ is a vector in $\mathbb{R}^m$ and $s$ is a vector in $\mathbb{R}^n$. As shown in Chapter ?, solutions of (14.1), (14.2) are characterized by the Karush-Kuhn-Tucker conditions (?), which we restate here as follows:

\[
A^T \lambda + s = c, \tag{14.3a}
\]
\[
Ax = b, \tag{14.3b}
\]
\[
x_i s_i = 0, \quad i = 1, 2, \ldots, n, \tag{14.3c}
\]
\[
(x, s) \ge 0. \tag{14.3d}
\]

Primal-dual methods find solutions $(x^*, \lambda^*, s^*)$ of this system by applying variants of Newton's method to the three equalities in (14.3) and modifying the search directions and step lengths so that the inequalities $(x, s) \ge 0$ are satisfied strictly at every iteration.
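To make these conditions concrete, here is a minimal numerical check of (14.3) on a tiny invented problem; the data and candidate solution below are our own illustration, not from the text. Python with NumPy:

    import numpy as np

    # Hypothetical standard-form LP:  min x1 + 2*x2  s.t.  x1 + x2 = 1,  x >= 0.
    c = np.array([1.0, 2.0])
    A = np.array([[1.0, 1.0]])
    b = np.array([1.0])

    # Candidate primal-dual triple: x* = (1, 0), lambda* = (1), s* = c - A^T lambda* = (0, 1).
    x_star = np.array([1.0, 0.0])
    lam_star = np.array([1.0])
    s_star = np.array([0.0, 1.0])

    # Verify the KKT conditions (14.3a)-(14.3d).
    assert np.allclose(A.T @ lam_star + s_star, c)      # (14.3a) dual feasibility
    assert np.allclose(A @ x_star, b)                   # (14.3b) primal feasibility
    assert np.allclose(x_star * s_star, 0.0)            # (14.3c) complementarity
    assert np.all(x_star >= 0) and np.all(s_star >= 0)  # (14.3d) nonnegativity
    print("primal and dual objectives:", c @ x_star, b @ lam_star)

Both objectives equal 1 at this pair, illustrating the agreement of primal and dual optimal values that the KKT conditions encode.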

The equations (14.3a), (14.3b), (14.3c) are only mildly nonlinear and so are not difficult to solve by themselves. However, the problem becomes much more difficult when we add the nonnegativity requirement (14.3d). This nonnegativity condition is the source of all the complications in the design and analysis of interior-point methods.

To derive primal-dual interior-point methods we restate the optimality conditions (14.3) in a slightly different form by means of a mapping $F$ from $\mathbb{R}^{2n+m}$ to $\mathbb{R}^{2n+m}$:

\[
F(x, \lambda, s) =
\begin{bmatrix} A^T \lambda + s - c \\ Ax - b \\ XSe \end{bmatrix} = 0, \tag{14.4a}
\]
\[
(x, s) \ge 0, \tag{14.4b}
\]

where

\[
X = \operatorname{diag}(x_1, x_2, \ldots, x_n), \quad S = \operatorname{diag}(s_1, s_2, \ldots, s_n), \tag{14.5}
\]

and $e = (1, 1, \ldots, 1)^T$. Primal-dual methods generate iterates $(x^k, \lambda^k, s^k)$ that satisfy the bounds (14.4b) strictly, that is, $x^k > 0$ and $s^k > 0$. This property is the origin of the term interior-point. By respecting these bounds, the methods avoid spurious solutions, that is, points that satisfy $F(x, \lambda, s) = 0$ but not $(x, s) \ge 0$. Spurious solutions abound, and do not provide useful information about solutions of (14.1) or (14.2), so it makes sense to exclude them altogether from the region of search.

Many interior-point methods actually require the iterates to be strictly feasible; that is, each $(x^k, \lambda^k, s^k)$ must satisfy the linear equality constraints for the primal and dual problems. If we define the primal-dual feasible set $\mathcal{F}$ and strictly feasible set $\mathcal{F}^o$ by

\[
\mathcal{F} = \{(x, \lambda, s) \mid Ax = b, \; A^T \lambda + s = c, \; (x, s) \ge 0\}, \tag{14.6a}
\]
\[
\mathcal{F}^o = \{(x, \lambda, s) \mid Ax = b, \; A^T \lambda + s = c, \; (x, s) > 0\}, \tag{14.6b}
\]

the strict feasibility condition can be written concisely as $(x^k, \lambda^k, s^k) \in \mathcal{F}^o$.
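The mapping $F$ and the strict-feasibility test translate directly into code. A minimal sketch (the function names are our own, chosen for illustration):

    import numpy as np

    def F_residual(x, lam, s, A, b, c):
        """The mapping F of (14.4a); F = 0 exactly where (14.3a)-(14.3c) hold."""
        return np.concatenate([A.T @ lam + s - c,   # dual residual
                               A @ x - b,           # primal residual
                               x * s])              # complementarity products XSe

    def is_strictly_feasible(x, lam, s, A, b, c, tol=1e-10):
        """Membership test for the strictly feasible set F^o of (14.6b)."""
        return (np.linalg.norm(A @ x - b) <= tol
                and np.linalg.norm(A.T @ lam + s - c) <= tol
                and np.all(x > 0) and np.all(s > 0))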

Like most iterative algorithms in optimization, primal-dual interior-point methods have two basic ingredients: a procedure for determining the step and a measure of the desirability of each point in the search space. The procedure for determining the search direction has its origins in Newton's method for the nonlinear equations (14.4a). Newton's method forms a linear model for $F$ around the current point and obtains the search direction $(\Delta x, \Delta\lambda, \Delta s)$ by solving the following system of linear equations:

\[
J(x, \lambda, s) \begin{bmatrix} \Delta x \\ \Delta\lambda \\ \Delta s \end{bmatrix} = -F(x, \lambda, s),
\]

where $J$ is the Jacobian of $F$. (See Chapter ? for a detailed discussion of Newton's method for nonlinear systems.) If the current point is strictly feasible (that is, $(x, \lambda, s) \in \mathcal{F}^o$), the Newton step equations become

\[
\begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta\lambda \\ \Delta s \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ -XSe \end{bmatrix}. \tag{14.7}
\]

Usually, a full step along this direction would violate the bound $(x, s) \ge 0$. To avoid this difficulty, we perform a line search along the Newton direction so that the new iterate is

\[
(x, \lambda, s) + \alpha(\Delta x, \Delta\lambda, \Delta s),
\]

for some line search parameter $\alpha \in (0, 1]$. Unfortunately, we often can take only a small step along this direction ($\alpha \ll 1$) before violating the condition $(x, s) > 0$. Hence, the pure Newton direction (14.7), which is known as the affine scaling direction, often does not allow us to make much progress toward a solution.
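For small problems, the system (14.7) can be formed and solved densely. The sketch below does exactly that, together with the longest step that preserves positivity; it is illustrative only, since practical codes exploit the block structure rather than assembling the full matrix:

    import numpy as np

    def kkt_jacobian(x, s, A):
        """Coefficient matrix of (14.7): the Jacobian of F at a point with x, s > 0."""
        n, m = x.size, A.shape[0]
        return np.block([
            [np.zeros((n, n)), A.T,              np.eye(n)],
            [A,                np.zeros((m, m)), np.zeros((m, n))],
            [np.diag(s),       np.zeros((n, m)), np.diag(x)],
        ])

    def affine_scaling_step(x, s, A):
        """Pure Newton (affine-scaling) direction: solve (14.7)."""
        n, m = x.size, A.shape[0]
        rhs = np.concatenate([np.zeros(n + m), -x * s])
        d = np.linalg.solve(kkt_jacobian(x, s, A), rhs)
        return d[:n], d[n:n + m], d[n + m:]          # (dx, dlam, ds)

    def max_positive_step(v, dv, fraction=0.99):
        """Largest alpha in (0, 1] keeping v + alpha*dv strictly positive;
        the 'fraction' backoff keeps the new point in the interior."""
        neg = dv < 0
        if not np.any(neg):
            return 1.0
        return min(1.0, fraction * np.min(-v[neg] / dv[neg]))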

Most primal-dual methods modify the basic Newton procedure in two important ways:

1. They bias the search direction toward the interior of the nonnegative orthant $(x, s) \ge 0$, so that we can move further along the direction before one of the components of $(x, s)$ becomes negative.

2. They prevent the components of $(x, s)$ from moving "too close" to the boundary of the nonnegative orthant.

To describe these modifications, we need to introduce the concept of the central path, and of neighborhoods of this path.

The Central Path

The central path $\mathcal{C}$ is an arc of strictly feasible points that plays a vital role in primal-dual algorithms. It is parametrized by a scalar $\tau > 0$, and each point $(x_\tau, \lambda_\tau, s_\tau) \in \mathcal{C}$ solves the following system:

\[
A^T \lambda + s = c, \tag{14.8a}
\]
\[
Ax = b, \tag{14.8b}
\]
\[
x_i s_i = \tau, \quad i = 1, 2, \ldots, n, \tag{14.8c}
\]
\[
(x, s) > 0. \tag{14.8d}
\]

These conditions differ from the KKT conditions only in the term $\tau$ on the right-hand side of (14.8c). Instead of the complementarity condition (14.3c), we require that the pairwise products $x_i s_i$ have the same (positive) value for all indices $i$. From (14.8), we can define the central path as

\[
\mathcal{C} = \{(x_\tau, \lambda_\tau, s_\tau) \mid \tau > 0\}.
\]

It can be shown that $(x_\tau, \lambda_\tau, s_\tau)$ is defined uniquely for each $\tau > 0$ if and only if $\mathcal{F}^o$ is nonempty.

Another way of defining $\mathcal{C}$ is to use the mapping $F$ defined in (14.4) and write

\[
F(x_\tau, \lambda_\tau, s_\tau) = \begin{bmatrix} 0 \\ 0 \\ \tau e \end{bmatrix}, \quad (x_\tau, s_\tau) > 0. \tag{14.9}
\]

The equations (14.8) approximate (14.3) more and more closely as $\tau$ goes to zero. If $\mathcal{C}$ converges to anything as $\tau \to 0$, it must converge to a primal-dual solution of the linear program. The central path thus guides us to a solution along a route that maintains positivity of the $x$ and $s$ components and decreases the pairwise products $x_i s_i$, $i = 1, 2, \ldots, n$, to zero at the same rate.

Primal-dual algorithms take Newton steps toward points on $\mathcal{C}$ for which $\tau > 0$, rather than pure Newton steps for $F$. Since these steps are biased toward the interior of the nonnegative orthant defined by $(x, s) \ge 0$, it usually is possible to take longer steps along them than along the pure Newton (affine scaling) steps, before violating the positivity condition.

To describe the biased search direction, we introduce a centering parameter $\sigma \in [0, 1]$ and a duality measure $\mu$ defined by

\[
\mu = \frac{1}{n} \sum_{i=1}^{n} x_i s_i = \frac{x^T s}{n}, \tag{14.10}
\]

which measures the average value of the pairwise products $x_i s_i$. By fixing $\tau = \sigma\mu$ and applying Newton's method to the system (14.9), we obtain

\[
\begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta\lambda \\ \Delta s \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ -XSe + \sigma\mu e \end{bmatrix}. \tag{14.11}
\]

The step $(\Delta x, \Delta\lambda, \Delta s)$ is a Newton step toward the point $(x_{\sigma\mu}, \lambda_{\sigma\mu}, s_{\sigma\mu}) \in \mathcal{C}$, at which the pairwise products $x_i s_i$ are all equal to $\sigma\mu$. In contrast, the step (14.7) aims directly for the point at which the KKT conditions (14.3) are satisfied.

If $\sigma = 1$, the equations (14.11) define a centering direction, a Newton step toward the point $(x_\mu, \lambda_\mu, s_\mu) \in \mathcal{C}$, at which all the pairwise products $x_i s_i$ are identical to the current average value of $\mu$. Centering directions are usually biased strongly toward the interior of the nonnegative orthant and make little, if any, progress in reducing the duality measure $\mu$. However, by moving closer to $\mathcal{C}$, they set the scene for a substantial reduction in $\mu$ on the next iteration. At the other extreme, the value $\sigma = 0$ gives the standard Newton (affine scaling) step (14.7). Many algorithms use intermediate values of $\sigma$ from the open interval $(0, 1)$ to trade off between the twin goals of reducing $\mu$ and improving centrality.
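Only the right-hand side changes between the affine-scaling system (14.7) and the centered system (14.11), as the following sketch shows. It reuses kkt_jacobian from the previous sketch, so run that block first:

    import numpy as np

    def duality_measure(x, s):
        """mu = x^T s / n, the average pairwise product (14.10)."""
        return x @ s / x.size

    def centered_step(x, s, A, sigma):
        """Solve (14.11): a Newton step toward the central-path point with
        tau = sigma*mu. sigma = 0 recovers the affine-scaling step (14.7);
        sigma = 1 gives a pure centering step."""
        n, m = x.size, A.shape[0]
        mu = duality_measure(x, s)
        rhs = np.concatenate([np.zeros(n + m), -x * s + sigma * mu])
        d = np.linalg.solve(kkt_jacobian(x, s, A), rhs)
        return d[:n], d[n:n + m], d[n + m:]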

A Primal-Dual Framework

With these basic concepts in hand, we can define a general framework for primal-dual algorithms.

Framework 14.1 (Primal-Dual)
Given $(x^0, \lambda^0, s^0) \in \mathcal{F}^o$;
for $k = 0, 1, 2, \ldots$
    Solve
    \[
    \begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S^k & 0 & X^k \end{bmatrix}
    \begin{bmatrix} \Delta x^k \\ \Delta\lambda^k \\ \Delta s^k \end{bmatrix}
    =
    \begin{bmatrix} 0 \\ 0 \\ -X^k S^k e + \sigma_k \mu_k e \end{bmatrix}, \tag{14.12}
    \]
    where $\sigma_k \in [0, 1]$ and $\mu_k = (x^k)^T s^k / n$;
    Set
    \[
    (x^{k+1}, \lambda^{k+1}, s^{k+1}) = (x^k, \lambda^k, s^k) + \alpha_k (\Delta x^k, \Delta\lambda^k, \Delta s^k), \tag{14.13}
    \]
    choosing $\alpha_k$ so that $(x^{k+1}, s^{k+1}) > 0$;
end (for).

The choices of centering parameter $\sigma_k$ and step length $\alpha_k$ are crucial to the performance of the method. Techniques for controlling these parameters, directly and indirectly, give rise to a wide variety of methods with different theoretical properties.

So far, we have assumed that the starting point $(x^0, \lambda^0, s^0)$ is strictly feasible and, in particular, that it satisfies the linear equations $Ax^0 = b$, $A^T \lambda^0 + s^0 = c$. All subsequent iterates also respect these constraints, because of the zero right-hand-side terms in (14.12). For most problems, however, a strictly feasible starting point is difficult to find. Infeasible-interior-point methods require only that the components of $x^0$ and $s^0$ be strictly positive. The search direction needs to be modified so that it improves feasibility as well as centrality at each iteration, but this requirement entails only a slight change to the step equation (14.11). If we define the residuals for the two linear equations as

\[
r_b = Ax - b, \qquad r_c = A^T \lambda + s - c, \tag{14.14}
\]

the modified step equation is

\[
\begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta\lambda \\ \Delta s \end{bmatrix}
=
\begin{bmatrix} -r_c \\ -r_b \\ -XSe + \sigma\mu e \end{bmatrix}. \tag{14.15}
\]

The search direction is still a Newton step toward the point $(x_{\sigma\mu}, \lambda_{\sigma\mu}, s_{\sigma\mu}) \in \mathcal{C}$. It tries to correct all the infeasibility in the equality constraints in a single step. If a full step is taken at any iteration (that is, $\alpha_k = 1$ for some $k$), the residuals $r_b$ and $r_c$ become zero, and all subsequent iterates remain strictly feasible. We discuss infeasible-interior-point methods further in Section 14.3.
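Framework 14.1, in its infeasible variant with step equations (14.15), fits in a short loop. The sketch below is a bare-bones illustration rather than production code: it fixes $\sigma$, uses a simple fraction-to-the-boundary step rule, and reuses kkt_jacobian and max_positive_step from the earlier sketches:

    import numpy as np

    def primal_dual_solve(x, lam, s, A, b, c, sigma=0.5, tol=1e-8, max_iter=200):
        """Infeasible-interior-point version of Framework 14.1 using the
        step equations (14.15). With rb = rc = 0 (a strictly feasible
        starting point) each step reduces to (14.12)."""
        n, m = x.size, A.shape[0]
        for _ in range(max_iter):
            mu = x @ s / n
            rb = A @ x - b                     # primal residual (14.14)
            rc = A.T @ lam + s - c             # dual residual (14.14)
            if mu <= tol and np.linalg.norm(rb) <= tol and np.linalg.norm(rc) <= tol:
                break
            rhs = np.concatenate([-rc, -rb, -x * s + sigma * mu])
            d = np.linalg.solve(kkt_jacobian(x, s, A), rhs)
            dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
            alpha = min(max_positive_step(x, dx), max_positive_step(s, ds))
            x, lam, s = x + alpha * dx, lam + alpha * dlam, s + alpha * ds
        return x, lam, s

    # e.g., on the toy problem from the first sketch, started strictly positive:
    # x, lam, s = primal_dual_solve(np.array([0.5, 0.5]), np.zeros(1), np.ones(2), A, b, c)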

Central Path Neighborhoods and Path-Following Methods

Path-following algorithms explicitly restrict the iterates to a neighborhood of the central path $\mathcal{C}$ and follow $\mathcal{C}$ to a solution of the linear program. By preventing the iterates from coming too close to the boundary of the nonnegative orthant, they ensure that search directions calculated from each iterate make at least a minimal amount of progress toward the solution.

A key ingredient of most optimization algorithms is a measure of the desirability of each point in the search space. In path-following algorithms, the duality measure $\mu$ defined by (14.10) fills this role. By forcing the duality measure $\mu_k$ to zero as $k \to \infty$, we ensure that the iterates $(x^k, \lambda^k, s^k)$ come closer and closer to satisfying the KKT conditions (14.3).

The two most interesting neighborhoods of $\mathcal{C}$ are the so-called 2-norm neighborhood $\mathcal{N}_2(\theta)$ defined by

\[
\mathcal{N}_2(\theta) = \{(x, \lambda, s) \in \mathcal{F}^o \mid \|XSe - \mu e\|_2 \le \theta\mu\}, \tag{14.16}
\]

for some $\theta \in [0, 1)$, and the one-sided $\infty$-norm neighborhood $\mathcal{N}_{-\infty}(\gamma)$ defined by

\[
\mathcal{N}_{-\infty}(\gamma) = \{(x, \lambda, s) \in \mathcal{F}^o \mid x_i s_i \ge \gamma\mu \text{ for all } i = 1, 2, \ldots, n\}, \tag{14.17}
\]

for some $\gamma \in (0, 1]$. (Typical values of the parameters are $\theta = 0.5$ and $\gamma = 10^{-3}$.) If a point lies in $\mathcal{N}_{-\infty}(\gamma)$, each pairwise product $x_i s_i$ must be at least some small multiple $\gamma$ of their average value $\mu$. This requirement is actually quite modest, and we can make $\mathcal{N}_{-\infty}(\gamma)$ encompass most of the feasible region $\mathcal{F}$ by choosing $\gamma$ close to zero. The $\mathcal{N}_2(\theta)$ neighborhood is more restrictive, since certain points in $\mathcal{F}^o$ do not belong to $\mathcal{N}_2(\theta)$ no matter how close $\theta$ is chosen to its upper bound of 1.

By keeping all iterates inside one or the other of these neighborhoods, path-following methods reduce all the pairwise products $x_i s_i$ to zero at more or less the same rate. Figure 14.1 shows the projection of the central path $\mathcal{C}$ onto the primal variables for a typical problem, along with a typical neighborhood $\mathcal{N}$.

[Figure 14.1: Central path, projected into space of primal variables $x$, showing a typical neighborhood $\mathcal{N}$.]
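Once feasibility is checked separately, membership in the two neighborhoods reduces to componentwise tests on the products $x_i s_i$. A minimal sketch of (14.16) and (14.17):

    import numpy as np

    def in_N2(x, s, theta=0.5):
        """2-norm condition of (14.16): ||XSe - mu*e||_2 <= theta*mu.
        (Membership in F^o must be verified separately.)"""
        mu = x @ s / x.size
        return np.linalg.norm(x * s - mu) <= theta * mu

    def in_N_minus_inf(x, s, gamma=1e-3):
        """One-sided condition of (14.17): x_i s_i >= gamma*mu for all i."""
        mu = x @ s / x.size
        return bool(np.all(x * s >= gamma * mu))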

Path-following methods are akin to homotopy methods for general nonlinear equations, which also define a path to be followed to the solution. Traditional homotopy methods stay in a tight tubular neighborhood of their path, making incremental changes to the parameter and chasing the homotopy path all the way to a solution. For primal-dual methods, this neighborhood is conical rather than tubular, and it tends to be broad and loose for larger values of the duality measure $\mu$. It narrows as $\mu \to 0$, however, because of the positivity requirement $(x, s) > 0$.

The algorithm we specify below, a special case of Framework 14.1, is known as a long-step path-following algorithm. This algorithm can make rapid progress because of its use of the wide neighborhood $\mathcal{N}_{-\infty}(\gamma)$, for $\gamma$ close to zero. It depends on two parameters $\sigma_{\min}$ and $\sigma_{\max}$, which are lower and upper bounds on the centering parameter $\sigma_k$. The search direction is, as usual, obtained by solving (14.12), and we choose the step length $\alpha_k$ to be as large as possible, subject to the requirement that we stay inside $\mathcal{N}_{-\infty}(\gamma)$.

Here and in later analysis, we use the notation

\[
(x^k(\alpha), \lambda^k(\alpha), s^k(\alpha)) \;\stackrel{\mathrm{def}}{=}\; (x^k, \lambda^k, s^k) + \alpha(\Delta x^k, \Delta\lambda^k, \Delta s^k), \tag{14.18a}
\]
\[
\mu_k(\alpha) \;\stackrel{\mathrm{def}}{=}\; x^k(\alpha)^T s^k(\alpha)/n. \tag{14.18b}
\]

Algorithm 14.2 (Long-Step Path-Following)
Given $\gamma$, $\sigma_{\min}$, $\sigma_{\max}$ with $\gamma \in (0, 1)$, $0 < \sigma_{\min} \le \sigma_{\max} < 1$, and $(x^0, \lambda^0, s^0) \in \mathcal{N}_{-\infty}(\gamma)$;
for $k = 0, 1, 2, \ldots$
    Choose $\sigma_k \in [\sigma_{\min}, \sigma_{\max}]$;
    Solve (14.12) to obtain $(\Delta x^k, \Delta\lambda^k, \Delta s^k)$;
    Choose $\alpha_k$ as the largest value of $\alpha$ in $[0, 1]$ such that
    \[
    (x^k(\alpha), \lambda^k(\alpha), s^k(\alpha)) \in \mathcal{N}_{-\infty}(\gamma); \tag{14.19}
    \]
    Set $(x^{k+1}, \lambda^{k+1}, s^{k+1}) = (x^k(\alpha_k), \lambda^k(\alpha_k), s^k(\alpha_k))$;
end (for).

Typical behavior of the algorithm is illustrated in Figure 14.2 for the case of $n = 2$. The horizontal and vertical axes in this figure represent the pairwise products $x_1 s_1$ and $x_2 s_2$, so the central path $\mathcal{C}$ is the line emanating from the origin at an angle of 45°. (A point at the origin of this illustration is a primal-dual solution if it also satisfies the feasibility conditions (14.3a), (14.3b), and (14.3d).) In the unusual geometry of Figure 14.2, the search directions $(\Delta x^k, \Delta\lambda^k, \Delta s^k)$ transform to curves rather than straight lines.

As Figure 14.2 shows (and the analysis confirms), the lower bound $\sigma_{\min}$ on the centering parameter ensures that each search direction starts out by moving away from the boundary of $\mathcal{N}_{-\infty}(\gamma)$ and into the relative interior of this neighborhood. That is, small steps along the search direction improve the centrality. Larger values of $\alpha$ take us outside the neighborhood again, since the error in approximating the nonlinear system (14.9) by the linear step equations (14.11) becomes more pronounced as $\alpha$ increases. Still, we are guaranteed that a certain minimum step can be taken before we reach the boundary of $\mathcal{N}_{-\infty}(\gamma)$, as we show in the analysis below.

[Figure 14.2: Iterates of Algorithm 14.2, plotted in $(xs)$ space.]
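The step-length rule (14.19) maximizes $\alpha$ subject to remaining in $\mathcal{N}_{-\infty}(\gamma)$. The exact maximizer can be computed from low-degree polynomial inequalities in $\alpha$; the coarse grid search below is our own simplification, meant only to make the rule concrete:

    import numpy as np

    def longstep_alpha(x, s, dx, ds, gamma, grid=1000):
        """Approximate (14.19): the largest alpha in [0, 1] for which the
        trial point stays positive and satisfies x_i(alpha) s_i(alpha)
        >= gamma * mu_k(alpha), with mu_k(alpha) as in (14.18b)."""
        best = 0.0
        for alpha in np.linspace(0.0, 1.0, grid + 1):
            xa, sa = x + alpha * dx, s + alpha * ds
            if np.all(xa > 0) and np.all(sa > 0):
                mu_a = xa @ sa / xa.size
                if np.all(xa * sa >= gamma * mu_a):
                    best = alpha
        return best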

We present a complete analysis of Algorithm 14.2, which makes use of surprisingly simple mathematical foundations, in Section 14.2 below. With judicious choices of $\sigma_k$, this algorithm is fairly efficient in practice. With a few more changes it becomes the basis of a truly competitive method, as we discuss in Section 14.3.

An infeasible-interior-point variant of Algorithm 14.2 can be constructed by generalizing the definition of $\mathcal{N}_{-\infty}(\gamma)$ to allow violation of the feasibility conditions. In this extended neighborhood, the residual norms $\|r_b\|$ and $\|r_c\|$ are bounded by a constant multiple of the duality measure $\mu$. By squeezing $\mu$ to zero, we also force $r_b$ and $r_c$ to zero, so that the iterates approach complementarity and feasibility at the same time.

14.2 Analysis of Algorithm 14.2

We now present a comprehensive analysis of Algorithm 14.2. Our aim is to show that, given some small tolerance $\epsilon > 0$ and some mild assumptions about the starting point $(x^0, \lambda^0, s^0)$, the algorithm requires $O(n \log(1/\epsilon))$ iterations to identify a point $(x^k, \lambda^k, s^k)$ for which $\mu_k \le \epsilon$, where $\mu_k = (x^k)^T s^k / n$. For small $\epsilon$, the point $(x^k, \lambda^k, s^k)$ satisfies the primal-dual optimality conditions except for perturbations of about $\epsilon$ in the right-hand side of (14.3c), so it is usually very close to a primal-dual solution of the original linear program. The $O(n \log(1/\epsilon))$ estimate is a worst-case bound on the number of iterations required; on practical problems, the number of iterations required appears to increase only slightly as $n$ increases.

The simplex method may require $2^n$ iterations to solve a problem with $n$ variables, though in practice it usually requires a modest multiple of $\max(m, n)$ iterations, where $m$ is the row dimension of the constraint matrix $A$ in (14.1).

As is typical for interior-point methods, the analysis builds from a purely technical lemma to a powerful theorem in just a few pages. We start with the technical lemma—Lemma 14.1—and use it to prove Lemma 14.2, a bound on the vector of pairwise products $\Delta x_i \Delta s_i$, $i = 1, 2, \ldots, n$. Theorem 14.3 finds a lower bound on the step length $\alpha_k$ and a corresponding estimate of the reduction in $\mu$ on iteration $k$. Finally, Theorem 14.4 proves that the $O(n \log(1/\epsilon))$ bound on the number of iterations holds.

