A Dynamic Near-Optimal Algorithm For Online Linear

1y ago
18 Views
1 Downloads
389.67 KB
22 Pages
Last View : 4d ago
Last Download : 27d ago
Upload by : River Barajas
Transcription

A Dynamic Near-Optimal Algorithm for Online Linear ProgrammingShipra AgrawalDepartment of Computer Science, Stanford University, Stanford, CAemail: [email protected] WangDepartment of Management Science and Engineering, Stanford University, Stanford, CAemail: [email protected] YeDepartment of Management Science and Engineering, Stanford University, Stanford, CAemail: [email protected] natural optimization model that formulates many online resource allocation and revenue management problemsis the online linear program (LP) where the constraint matrix is revealed column by column along with theobjective function. We provide a near-optimal algorithm for this surprisingly general class of online problemsunder the assumption of random order of arrival and some mild conditions on the size of the LP right-hand-sideinput. Our learning-based algorithm works by dynamically updating a threshold price vector at geometric timeintervals, where the dual prices learned from revealed columns in the previous period are used to determine thesequential decisions in the current period. Our algorithm has a feature of “learning by doing”, and the pricesare updated at a carefully chosen pace that is neither too fast nor too slow. In particular, our algorithm doesn’tassume any distribution information on the input itself, thus is robust to data uncertainty and variations due toits dynamic learning capability. Applications of our algorithm include many online multi-resource allocation andmulti-product revenue management problems such as online routing and packing, online combinatorial auctions,adwords matching, inventory control and yield management.Key words: online algorithms; linear programming; primal-dual; dynamic price updateMSC2000 Subject Classification: Primary: 68W27, 90B99; Secondary: 90B05, 90B50, 90C05OR/MS subject classification: Primary: Linear Programming; analysis of algorithms1. Introduction Online optimization is attracting an increasingly wide attention in computer science, operations research, and management science communities because of its wide applications toelectronic markets and dynamic resource allocation problems. In many practical problems, data does notreveal itself at the beginning, but rather comes in an online fashion. For example, in the online revenuemanagement problem, consumers arrive sequentially requesting a subset of goods (multi-leg flights or aperiod of stay in a hotel), each offering a certain bid price for his demand. On observing a request, theseller needs to make an irrevocable decision for that consumer with the overall objective of maximizingthe revenue while respecting the resource constraints. Similarly, in the online routing problem [8, 3],the central organizer receives demands for subsets of edges in a network from the users in a sequentialmanner, each with a certain utility and bid price for his demand. The organizer needs to allocate thenetwork capacity online to those bidders to maximize social welfare. A similar format also appears inonline auctions [4, 10], online keyword matching problems [13, 20, 23, 16], online packing problems [9],and various other online revenue management and resource allocation problems [22, 11, 6].In all these examples mentioned above, the problem can be formulated as an online linear programmingproblem1 . In an online linear programming problem, the constraint matrix is revealed column by columnwith the corresponding coefficient in the objective function. After observing the input arrived so far,the online algorithm must make the current decision without observing the future data. To be precise,consider the linear programPnmaximizeπ xPnj 1 j jsubject toa(1)j 1 ij xj bi , i 1, . . . , m0 xj 1j 1, . . . , nmmmwhere j, πj 0, aj (aij )mi 1 [0, 1] , and b {bi }i 1 R . In the corresponding online problem,at time t, the coefficients (πt , at ) are revealed, and the algorithm must make a decision xt . Given the1 In fact, many of the problems mentioned above take the form of an integer program. While we discuss our ideas andresults in terms of linear relaxation of these problems, our results naturally extends to integer programs. See Section 5.4for the detailed discussion.1

2Agrawal et al.: A Dynamic Near-Optimal Algorithm for Online Linear ProgrammingMathematics of Operations Research xx(x), pp. xxx–xxx, c 200x INFORMSprevious t 1 decisions x1 , . . . , xt 1 , and input {πj , aj }tj 1 until time t, the tth decision is to select xtsuch thatPtj 1 aij xj bi , i 1, . . . , m(2)0 xt 1.PnThe goal of the online algorithm is to choose xt ’s such that the objective function t 1 πt xt is maximized.To evaluate an online algorithm, one could consider various kinds of input models. One approach, whichis completely robust to input uncertainty, is the worst-case analysis, that is, to evaluate the algorithmbased on its performance on the worst-case input [23, 9]. However, this leads to very pessimistic boundsfor the above online problem: no online algorithm can achieve better than O(1/n) approximation of theoptimal offline solution [4]. The other approach, popular among practitioners with domain knowledge, isto assume certain distribution on the inputs and consider the expected objective value achieved by thealgorithm. In many cases, a priori input distribution can simplify the problem to a great extent, however,the choice of distribution is very critical and the performance can suffer if the actual input distributionis not as assumed. In this paper, we take an intermediate path. While we do not assume any knowledgeof the input distribution, we relax the worst-case model by making the following two assumptions:Assumption 1.1 The columns aj (with the objective coefficient πj ) arrive in a random order, i.e., theset of columns can be adversarily picked at the start. However, after they are chosen, (a1 , a2 , ., an ) and(aσ(1) , aσ(2) , ., aσ(n) ) have same chance to happen for all permutation σ.This assumption says that we consider the average behavior of the online algorithm over randompermutations. This assumption is reasonable in practical problems, since the order of columns usuallyappears to be independent of the content of the columns. We like to emphasize that this assumptionis strictly weaker than assuming an input distribution. In particular, it is automatically satisfied if theinput columns are generated independently from some common (but unknown) distributions. This is alsoa standard assumption in many existing literature on solving online problems [4, 13, 20, 17].Assumption 1.2 We know the total number of columns n a priori.The second assumption is needed since we will use quantity n to decide the length of history used tolearn the threshold prices in our algorithm. It can be relaxed to an approximate knowledge of n (withinat most 1 multiplicative error), without affecting the final results. Note that this assumption is alsostandard in many existing algorithms for online problems [4] in computer science. For many practicalproblems, the knowledge of n can be implied approximately by the length of time horizon T and thearrival rates, which is usually available. As long as the time horizon is long enough, the total numberof arrivals in the LP problem will be highly accurate. This is justified in [11] and [19] when they usethe expected number of arrivals for constructing a pricing policy in a revenue management problem.Moreover, this assumption is necessary from the algorithmic point of view. In [13], the authors showedthat if Assumption 2 doesn’t hold, then the worst-case competitive ratio is bounded away from 1 evenwe admit Assumption 1.In this paper, we present an almost optimal solution for online linear program (2) under the above twoassumptions and a lower bound condition on the size of b. We also extend our results to the followingmore general online linear optimization problems: Problems with multi-dimensional decisions at each time step. More precisely, consider a sequenceof n non-negative vectors f 1 , f 2 , . . . , f n Rk , mn non-negative vectorsg i1 , g i2 , . . . , g in [0, 1]k ,ki 1, . . . , m,Tand (k 1)-dimensional simplex K {x R : x e 1, x 0}. In this problem, giventhe previous t 1 decisions x1 , . . . , xt 1 , each time we make a k-dimensional decision xt Rk ,satisfying:PtTj 1 g ij xj bi , i 1, . . . , m(3)xt Kwhere decision Pvector xt must be chosen only using the knowledge up to time t. The objectivenis to maximize j 1 f Tj xj over the whole time horizon. Note that Problem (2) is a special caseof Problem (3) with k 1.

Agrawal et al.: A Dynamic Near-Optimal Algorithm for Online Linear ProgrammingMathematics of Operations Research xx(x), pp. xxx–xxx, c 200x INFORMS3 Problem (2) with both buy-and-sell orders, that is,mπj either positive or negative, and aj (aij )mi 1 [ 1, 1] .(4)1.1 Our key ideas and main results In the following, let OPT denote the optimal objective valuefor the offline problem (1).Definition 1.1 An online algorithm A is c-competitive in random permutation model if the expectedvalue of the online solution obtained by using A is at least c factor of the optimal offline solution. Thatis,PnEσ [ t 1 πt xt (σ, A)] cOPTwhere the expectation is taken over uniformly random permutations σ of 1, . . . , n, and xt (σ, A) is the tthdecision made by algorithm A when the inputs arrive in the order σ.Our algorithm is based on the observation that the optimal solution x for the offline linear programis almost entirely determined by the optimal dual solution p Rm corresponding to the m inequalityconstraints. The optimal dual solution acts as a threshold price so that x j 0 only if πj p T aj . Ouronline algorithm works by learning a threshold price vector from the input received so far. The pricevector then determines the decision for the next period. However, instead of computing a new pricevector at every step, the algorithm initially waits until n steps or arrivals, and then computes a newprice vector every time the history doubles. That is, at time steps n , 2n , 4n , . . . and so on. We showthat our algorithm is 1 O( )-competitive in random permutation model under a size condition of theright-hand-side input. Our main results are precisely stated as follows:Theorem 1.1 For any 0, our online algorithm is 1 O( ) competitive for the online linear program(2) in random permutation model, for all inputs such that 2 m log (n/ )B min bi Ω(5)i 2Note that the condition in Theorem 1.1 depends on log n, which is far from satisfying everyone’sdemand when n is large. In [21], the author proves that k 1/ 2 is necessary to get 1 O( ) competitiveratio in k-secretary problem, which is the single dimensional case of our problem with m 1, B kand at 1 for all t. Thus, dependence on in the above theorem is near-optimal. In the next theorem,we show that a dependence on m is necessary as well for any online algorithm to obtain a near-optimalsolution. Its proof will appear in Section 4.Theorem 1.2 For any online algorithm for linear program (2) in random permutation model, there existsan instance such that its competitive ratio is less than 1 Ω( ) whenlog(m)B min bi i 2We also extend our results to more general online linear programs as introduced in (3) and (4):Theorem 1.3 For any 0, our algorithm is 1 O( ) competitive for the general online linear problem(3) or (4) in random permutation model, for all inputs such that: m log (nk/ )B min bi Ω.(6)i 2Remark 1.1 Our condition to hold the main result is independent of the size of OPT or objective coefficients, and is also independent of any possible distribution of input data. If the largest entry of constraintcoefficients does not equal to 1, then our both theorems hold if the condition (5) or (6) is replaced by: bim log (nk/ ) Ω, i,āi 22 For any two functions f (n), g(n), f (n) O(g(n)) iff there exists some constant c such that f (n) c g(n); and11f (n) Ω(g(n)) iff there exists some constant c2 such that f (n) c2 g(n). The precise constants required here will beillustrated later in the text.

4Agrawal et al.: A Dynamic Near-Optimal Algorithm for Online Linear ProgrammingMathematics of Operations Research xx(x), pp. xxx–xxx, c 200x INFORMSwhere, for each row i, āi maxj { aij } of (2) and (4), or āi maxj {kg ij k } of (3). Note that thisbound is proportional only to log(n) so that it is way below to satisfy everyone’s demand.It is apparent that our generalized problem formulation should cover a wide range of online decisionmaking problems. In the next section, we discuss the related work and some sample applications of ourmodel. As one can see, our result in fact improves the best competitive ratios for many online problemsstudied in the literature and presents the first non-asymptotic analysis for solving many online resourceallocation and revenue management problems.1.2 Related work Online decision making has been a topic of wide recent interest in the computerscience, operations research, and management science communities. Various special cases of the generalproblem presented in this paper have been studied extensively in the computer science literature as“secretary problems”. A comprehensive survey of existing results on the secretary problems can be foundin [4]. In particular, constant factor competitive ratios have been proven for k-secretary and knapsacksecretary problems under random permutation model. Further, for many of these problems, a constantcompetitive ratio is known to be optimal if no additional conditions on input are assumed. Therefore,there has been recent interest in searching for online algorithms whose competitive ratio approaches 1 asthe input parameters become large. The first result of this kind appears in [21], where a 1 O(1/ k)competitive algorithm is presented for k secretary problem under random permutation model. Recently,the authors in [13, 15] presented a 1 O( )-competitive algorithm for online adwords matching problemunder assumptions of certain lower bounds on OPT and k mini bi in terms of and other inputparameters. Our work is closely related to the works of Kleinberg [21], Devanur et al. [13], and Feldmanet al. [15].In [21], the author presented a recursive algorithm for single dimensional multiple-choice secretaryproblem, and proved that 1 O(1/ pk) competitive ratio achieved by his algorithm is the best possible forthis problem. We present a (1 O( m log(n)/k)) competitive algorithm for multi-dimensional multiplechoice secretary problem (k mini bi ), which involves recursive pricing similar to [21], however, due tothe multi-dimensional structure of the problem, significantly different techniques are required for analysisthan those usedp in [21]. We also prove that no online algorithm can achieve a competitive ratio betterthan (1 Ω( log(m)/k)) for the multi-dimensional problem. To our knowledge, this is the first resultthat shows a dependence on dimension m, of the best competitive ratio achievable for this problem.In [13], the authors study a one-time pricing approach (as opposed to the recursive or dynamic pricingapproach) for a special case (adwords matching) of our problem. Also, the authors prove a competitiveratio that depends on the optimal value OPT,p rather than the right hand side k mini bi . To be precise,they prove a competitive ratio of (1 O( 3 m2 log(n)/OPT)). Due to the use of a dynamic pricingapproach,p we show that even in terms of OPT, our algorithm has a strictly better competitive ratio of(1 O( m2 log(n)/OPT)) (see Section 5.1 and Appendix F). We would like to emphasize here that (a)a competitive ratio in terms of OPT does not imply a corresponding competitive ratio in terms of k andvice-versa, as we illustrate by examples in Appendix F, there are cases where one may be better than theother, and (b) significantly different techniques are used in this paper to achieve a bound that dependsonly on k, and not on the value of OPT, in particular refer to the analysis in Lemma 3.2 and Lemma3.3, and its implications on the competitive ratio analysis. We believe having a dependence on k ratherthan OPT may be more attractive in many practical settings, since the value of k is known and can bechecked.More recently, and subsequent to the submission of an earlier version of our paper [2], Feldman etal. [15] (independently) obtained bounds for an online packing problem which depend both on the righthand side k and OPT. In addition to requiring lower bounds on both k and OPT, their lower bound onk depends on 3 , thus their result is much weaker than the result in this paper. Their algorithm is basedon a one-time pricing approach.Table 1 provides a comparison of these related results, clearly illustrating that our dynamic pricingalgorithm provides significant improvement over the past results. Here “Condition” refers to the lowerbound condition required on the input to achieve (1 O( )) competitive ratio, and k mini bi .In the operations research and management science community, a dynamic and optimal pricing strategy for various online resource allocation problems has always been an important research topic, some

Agrawal et al.: A Dynamic Near-Optimal Algorithm for Online Linear ProgrammingMathematics of Operations Research xx(x), pp. xxx–xxx, c 200x INFORMSConditionk Kleinberg et al., 2005 [21]This paperfor m 1OPT πmaxDevanur et al., 2009 [13]Feldman et al., 2010 [15]1 2 ,k k m log(n) 3m log(n) 22m log(n) 3and OPTπmax or OPTπmax m log(n) m2 log(n) 25TechniqueTarget problemDynamic pricingk-secretary problemOne-time pricingAdwords problemOne-time pricingonline packingDynamic pricinggeneral online LPTable 1: Comparison of some existing resultsliteratures include [14, 18, 19, 26, 22, 11, 6]. In [19, 18, 6], the arrival process are assumed to be price sensitive. However, as commented in [11], this model can be reduced to a price independent arrival processwith availability control under Poisson arrivals. Our model can be further viewed as a discrete version ofthe availability control model which is also used as an underlying model in [26] and discussed in [11]. Theidea of using a threshold - or “bid” - price is not new. It is initiated in [28, 25] and investigated furtherin [26]. In [26], the authors show that the bid price is asymptotically optimal. However, they assume theknowledge on the arrival process and therefore the price is obtained by “forecasting” the future using thedistribution information rather than “learning” from the past observations as we do in our paper. Theidea of using linear programming to find dual optimal bid price is discussed in [11] where asymptoticoptimality is also achieved. But again, the arrival process is assumed to be known which made theiranalysis relatively simple.Our work improves upon these existing works in various manners. First, we provide a common nearoptimal solution for a wide class of online linear programs which encompasses many special cases ofsecretary problems and resource allocation problems discussed above. Moreover, due to its dynamiclearning capability, our algorithm is distribution free–no knowledge on the input distribution is assumedexcept for the random order of arrival. The techniques proposed in this paper may also be considereda step forward in the threshold price learning kind of approaches. A common element in the techniquesused in existing works on secretary problems [4] (with the exception of [21]), online combinatorial auctionproblems [1], and adwords matching problem [13, 15], has been one-time learning of threshold price(s)from first few (n ) customers, which is then used to determine the decision for remaining customers.However, in practice one would expect some benefit from dynamically updating the prices as moreand more information is revealed. Dynamic pricing has been a topic of wide attention in many of thema

online auctions [4, 10], online keyword matching problems [13, 20, 23, 16], online packing problems [9], and various other online revenue management and resource allocation problems [22, 11, 6]. In all these examples mentioned above, the problem can be formulated as an online linear programming problem1. In

Related Documents:

Average k-shortest path length Load balancing property RRG is near optimal in terms of average k-shortest path length RRG is far from optimal for all other metrics GDBG was found near optimal for all metrics GDBG was used as a simulation benchmark to evaluate RRG Depending on traffic pattern, RRG is not always near optimal

Dec 06, 2018 · Dynamic Strategy, Dynamic Structure A Systematic Approach to Business Architecture “Dynamic Strategy, . Michael Porter dynamic capabilities vs. static capabilities David Teece “Dynamic Strategy, Dynamic Structure .

II. Optimal Control of Constrained-input Systems A. Constrained optimal control and policy iteration In this section, the optimal control problem for affine-in-the-input nonlinear systems with input constraints is formulated and an offline PI algorithm is given for solving the related optimal control problem.

not satisy \Dynamic Programming Principle" (DPP) or \Bellman Optimality Principle". Namely, a sequence of optimization problems with the corresponding optimal controls is called time-consistent, if the optimal strategies obtained when solving the optimal control problem at time sstays optimal when the o

Algorithms and Data Structures Marcin Sydow Desired Properties of a Good Algorithm Any good algorithm should satisfy 2 obvious conditions: 1 compute correct (desired) output (for the given problem) 2 be e ective ( fast ) ad. 1) correctness of algorithm ad. 2)complexity of algorithm Complexity of algorithm measures how fast is the algorithm

Algorithm 19 - Timer Functions 125 Algorithm 20 - Totalization 129 Algorithm 21 - Comparator 133 Algorithm 22 - Sequencer 136 Algorithm 23 - Four Channel Line Segment (Version 1.1 or Later) 152 Algorithm 24 - Eight Channel Calculator (Version 1.1 or Lat

table of contents 1.0 introduction 3 2.0 overview 7 3.0 algorithm description 8 3.1 theoretical description 8 3.1.1 physical basis of the cloud top pressure/temperature/height algorithm 9 3.1.2 physical basis of infrared cloud phase algorithm 14 3.1.3 mathematical application of cloud top pressure/temperature/height algorithm 17 3.1.4 mathematical application of the cloud phase algorithm 24

Resource Planning (CRP) methodology for algorithm designs. The resulting near-optimum algorithm operates within an extensible CRP shell and uses two pluggable scheduling and forwarding heuristics to produce near-optimal performance. In this paper we present the shell architecture, the algorithm, and the simulation

of dynamic programming and optimal control for vector-valued functions. Mathematics Subject Classi cation. 49L20, 90C29, 90C39. Received August 4, 2017. Accepted September 6, 2019. 1. Introduction: dynamic programming and optimal control It is well known that optimization is a key tool in mathemat

In this paper we present the algorithm LPDP (Longest Path by Dynamic Programming) and its parallel version. LPDP makes use of graph partitioning and dynamic pro-gramming. Unlike many other approaches for NP-complete problems, LPDP is not an approximation algorithm – it find

Dynamic Programming and Optimal Control 3rd Edition, Volume II by Dimitri P. Bertsekas Massachusetts Institute of Technology Chapter 6 Approximate Dynamic Programming This is an updated version of the research-oriented Chapter 6 on Approximate Dynamic

compared to the three previous methods. ’ Some previous algorithms achieve optimal mapping for restricted problem domains: Chortle is optimal when the input network is a tree, Chortle-crf and Chortle-d are optimal when the input network is a tree and h’ 5 6, and DAG- Map is optimal when the mapping constraint is monotone, which is true for .

tests of the optimal diet model of optimal foraging theory. Optimal foraging theory (OFT) develops hypotheses about how a species would feed if it acted in the most eco-nomical manner with respect to time and energy expenditure (MacArthur & Pianka, 1966). Hanson (1987) summarized the assumptions underlyin

errors. Following the invention of the Origin-Based Assignment (OBA) algorithm by Bar-Gera (2002), such precision is now possible; see Patriksson (1994) for an overview of these models and solution methods. The findings of the comparison of user-optimal and system-optimal route patterns presented below may be surprising as well as informative.

vector to accommodate uncertainties in the atmospheric flight path3. Pinpoint landing accuracy is defined as a . indirect optimal control algorithm with iteration required to derive the control history8. The third algorithm, originally derived by D’Souza, is a fuel-optimal algorithm which assumes . convexity ensures that a global optimum .

learning process of the adaptive algorithm. This paper proposes a novel neural-fuzzy algorithm to seek the optimal step size for LMS adaptive arrays. The proposed search algorithm consists of three processing units: the learning block, the neu

Why dynamic programming? Lagrangian and optimal control are able to deal with most of the dynamic optimization problems, even for the cases where dynamic programming fails. However, dynamic programming has become widely used because of its appealing characteristics: Recursive feature: ex

20. Write an algorithm and flowchart to find the given no is positive or not. 21. Write an algorithm and flowchart to swap the two nos. 22. Write an algorithm and flowchart to convert temperature in Celsius 23. Write an algorithm and flowchart take digit from user and display 24. Write an algorithm and flowchart enter year from user and check. 25.

Both the sum-product and the max-product algorithm have also another root in cod-ing, viz. the BCJR algorithm [5] and the Viterbi algorithm [10], which both operate on a trellis. Before the invention of turbo coding, the Viterbi algorithm used to be the workhorse of many practical coding schemes. The BCJR algorithm, despite its equally

algorithm (MA), nondominated sorting genetic algorithm II (NSGA-II), and cooperative coevolutionary nondominated sorting genetic algorithm II (CCNSGA-II). To improve the performance in genetic algorithm procedure, a xed-length encoding method is presented based on improved maze algorithm and adaptive region strategy.