Machine Learning Solution Methods for Multistage Stochastic Programming


Machine Learning Solution Methods for Multistage Stochastic Programming

Thesis by Boris Defourny

Systems and Modeling Research Unit
Department of Electrical Engineering and Computer Science
University of Liège, Belgium
2010

Abstract

This thesis investigates the following question: can supervised learning techniques be successfully used for finding better solutions to multistage stochastic programs? A similar question had already been posed in the context of reinforcement learning, and had led to algorithmic and conceptual advances in the field of approximate value function methods over the years (Lagoudakis and Parr, 2003; Ernst, Geurts, and Wehenkel, 2005; Langford and Zadrozny, 2005; Antos, Munos, and Szepesvári, 2008). This thesis identifies several ways to exploit the combination "multistage stochastic programming / supervised learning" for sequential decision making under uncertainty.

Multistage stochastic programming is essentially the extension of stochastic programming (Dantzig, 1955; Beale, 1955) to several recourse stages. After an introduction to multistage stochastic programming and a summary of existing approximation approaches based on scenario trees, this thesis focuses mainly on the use of supervised learning for building decision policies from scenario-tree approximations.

Two ways of exploiting learned policies in the context of the practical issues posed by the multistage stochastic programming framework are explored: the fast evaluation of performance guarantees for a given approximation, and the selection of good scenario trees. The computational efficiency of the approach allows novel investigations relative to the construction of scenario trees, from which new insights, solution approaches, and algorithms are derived. For instance, we generate and select scenario trees with random branching structures for problems over large planning horizons.

Our experiments on the empirical performance of learned policies, compared to gold-standard policies, suggest that the combination of stochastic programming and machine learning techniques could also constitute a method per se for sequential decision making under uncertainty, inasmuch as learned policies are simple to use and come with performance guarantees that can actually be quite good.

Finally, limitations of approaches that build an explicit model to represent an optimal solution mapping are studied in a simple parametric programming setting, and various insights regarding this issue are obtained.
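As a rough illustration of the idea of building a decision policy by supervised learning from a scenario-tree solution, the sketch below regresses stage decisions on the observed histories they respond to, then evaluates the learned policy on fresh scenarios. It is a hypothetical sketch only: the data are synthetic stand-ins, and the thesis's actual feature construction and learning algorithms differ.

```python
# Hedged sketch (not the thesis's algorithm): fit a regression model that maps
# observation histories to decisions, using (history, decision) pairs that, in
# the real setting, would come from an optimized scenario-tree approximation.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for pairs extracted from a solved scenario tree:
# each row pairs an observed history (3 features) with its optimal decision.
histories = rng.uniform(0.0, 10.0, size=(200, 3))
decisions = 0.5 * histories.sum(axis=1)

# The "policy" is just a supervised model fitted on those pairs.
policy = DecisionTreeRegressor(max_depth=4).fit(histories, decisions)

# A learned policy is cheap to evaluate on many out-of-sample scenarios,
# which is what makes Monte Carlo validation of a given tree tractable.
fresh = rng.uniform(0.0, 10.0, size=(1000, 3))
simulated_decisions = policy.predict(fresh)
```

The point of the sketch is the division of labor: the expensive optimization is done once on the tree, while the cheap learned model stands in for the optimizer at simulation time.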


Acknowledgments

Warm thanks are addressed to my family for their love and constant support over the years.

I express my deepest gratitude to my advisor, Louis Wehenkel. In addition to his guidance, to his questioning, and to the many discussions we had together, from technical subjects to scientific research in general, Louis has provided me with an outstanding environment for doing research and communicating it, while demonstrating, with style, his broad culture, his high ethics, and his formidable ability to be present in times of need. This thesis would not have been written without Louis' interest in supervised learning, ensemble methods, and optimal control.

I am very grateful to Damien Ernst. Together, we had many discussions, for instance on approximate dynamic programming and reinforcement learning, the cross-entropy method, the uses of scenario-tree methods, and the ways to communicate and publish results. Damien has repeatedly provided support and encouragement regarding the present work. Collaborating with Louis and Damien has been extremely energizing, productive, and life-enriching.

I wish to express my gratitude to Rodolphe Sepulchre. Rodolphe too deserves credit for setting up an excellent research environment, for instance by organizing stimulating weekly seminars and group lunches, and by offering me the opportunity to participate in activities of the Belgian network DYSCO or in activities hosted by CESAME/INMA (UCL).

Warm thanks to Jean-Louis Lilien for his good advice and support over the years. With hindsight, I owe to Jean-Louis many important choices that I am happy to have made.

I would like to thank Yves Crama for setting up discussions on multistage stochastic programming and models in logistics, and for inviting me to presentations given by his group. I would like to thank Quentin Louveaux for several stimulating discussions. I express my gratitude to Mania Pavella for her interest and her repeated encouragement.

I am grateful to Jacek Gondzio (University of Edinburgh), Rüdiger Schultz (University of Duisburg-Essen), Werner Römisch (Humboldt University), and Alexander Shapiro (Georgia Tech) for stimulating discussions on stochastic programming and expressions of interest in the general orientation of this research. The scientific responsibility for the present work and the views expressed in the thesis rest with us.

I am grateful to Rémi Munos (INRIA Lille) and Olivier Teytaud (INRIA Saclay) for stimulating discussions related to machine learning and sequential decision making.

I am grateful to Roman Belavkin (Middlesex University) for stimulating discussions on cognitive sciences and on finance, and for inviting me to give a seminar in his group.

I am grateful to Jovan Ilic and Marija Ilic (Carnegie Mellon University) for meetings and discussions from which my interest in risk-aware decision making dates back.

My interest in multistage stochastic programming originates from meetings with members of the department OSIRIS (Optimisation, Simulation, Risque et Statistiques) at Electricité de France. I would like to thank especially Yannick Jacquemart, Kengy Barty, Pascal Martinetto, Jean-Sébastien Roy, Cyrille Strugarek, and Gérald Vignal for inspiring discussions on sequential decision making models and current challenges in the electricity industry. The scientific responsibility for the present work and the views expressed in the thesis rest with us.

Warm thanks to my friends, colleagues, and past post-docs from the Systems and Modeling research unit and beyond. I had innumerable scientific and personal discussions with Alain Sarlette and Michel Journée, and during their post-doc time with Emre Tuna and Silvère Bonnabel. I had a wonderful time and discussions with Pierre Geurts, Vincent Auvray, Guy-Bart Stan, Renaud Ronsse, Christophe Germay, Maxime Bonjean, Luca Scardovi, Denis Efimov, Christian Lageman, Alexandre Mauroy, Gilles Meyer, Marie Dumont, Pierre Sacré, Christian Bastin, Guillaume Drion, Anne Collard, Laura Trotta, Bertrand Cornélusse, Raphaël Fonteneau, Florence Belmudes, François Schnitzler, François Van Lishout, Gérard Dethier, Olivier Barnich, Philippe Ries, and Axel Bodart. I extend my acknowledgements to Thierry Pironet and Yasemin Arda (HEC-ULg). Thanks also to Patricia Rousseaux, Thierry Van Cutsem, and Mevludin Glavic. Many thanks to my friends who encouraged me to pursue this work, and in particular to Estelle Derclaye (University of Nottingham).

I am also indebted to the people who helped me when I was abroad for conferences and made my time there even more enjoyable: Marija Prica, Guillaume Obozinski, Janne Kettunen, Aude Piette, Diana Chan, Daniel Bartz, Kazuya Haraguchi.

I had the opportunity to be invited as a coauthor on papers by Sourour Ammar, Philippe Leray (INSA Rouen), and Louis Wehenkel, and on a paper by Bertrand Cornélusse, Gérald Vignal, and Louis Wehenkel, and I wish to thank these authors for these collaborations.

I gratefully acknowledge the financial support of the Belgian network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office. The scientific responsibility rests with us. This work was supported in part by the IST Programme of the European Union, under the PASCAL2 Network of Excellence, IST-2007-216886. This thesis only reflects our views.

Finally, I would like to express my extreme gratitude to the members of my thesis defense committee: Rodolphe Sepulchre (Chair), Louis Wehenkel (Advisor), Yves Crama, Quentin Louveaux, Shie Mannor (Technion), Werner Römisch (Humboldt University), Alexander Shapiro (Georgia Tech), and Olivier Teytaud (INRIA Saclay / Université Paris-Sud).

Contents

Abstract
Acknowledgments

1. Introduction
   1.1 Outline of the Thesis
   1.2 Published Work

2. The Multistage Stochastic Programming Framework
   2.1 Description of the Framework
       2.1.1 From Nominal Plans to Decision Processes
       2.1.2 Incorporating Probabilistic Reasoning
       2.1.3 The Elements of the General Decision Model
       2.1.4 The Tree Representation of Gradually Revealed Scenarios
       2.1.5 Approximating Random Processes with Scenario Trees
       2.1.6 Simple Example of Formulation
   2.2 Comparison to Related Approaches
       2.2.1 The Exogenous Nature of the Random Process
       2.2.2 Comparison to Markov Decision Processes
       2.2.3 The Curse of Dimensionality
   2.3 The Value of Multistage Stochastic Programming
       2.3.1 Reduction to Model Predictive Control
       2.3.2 Reduction to Two-Stage Stochastic Programming
       2.3.3 Reduction to Heuristics Based on Parametric Optimization
   2.4 Practical Scenario-Tree Approaches
       2.4.1 Approximation Methods in Two-Stage Stochastic Programming
       2.4.2 Challenges in the Generation of Scenario Trees
       2.4.3 The Need for Testing Scenario-Tree Approximations
       2.4.4 The Inference of Shrinking-Horizon Decision Policies
       2.4.5 Alternatives to the Scenario Tree Approach
   2.5 Conclusions

3. Solution Averaging in Structured Spaces
   3.1 The Perturb-and-Combine Paradigm
       3.1.1 Maximum Likelihood Estimation
       3.1.2 Bayesian Averaging
       3.1.3 Mixture Models
       3.1.4 Model Selection
       3.1.5 Bayesian Model Averaging
       3.1.6 Ensemble Methods
   3.2 Adaptation to Stochastic Programming
       3.2.1 Principle of the Approach
       3.2.2 Generation of an Ensemble of Incomplete Trees
       3.2.3 Optimization with the Cross-Entropy Method
       3.2.4 Aggregation of First-Stage Decisions
   3.3 Numerical Experiment
       3.3.1 Description of the Test Problem
       3.3.2 Particular Choices
       3.3.3 Simulation Results
   3.4 Conclusions

4. Validation of Solutions and Scenario Tree Generation
   4.1 Motivation
   4.2 Learning and Evaluation of Scenario Tree Based Policies
       4.2.1 Description of the Validation Method
       4.2.2 Complexity Analysis
       4.2.3 Other Validation Strategies
   4.3 Monte Carlo Selection of Scenario Trees
       4.3.1 Description of the Selection Scheme
       4.3.2 Discussion
   4.4 Case Study
       4.4.1 Description of the Problem
       4.4.2 Algorithms for Generating Small Scenario Trees
       4.4.3 Algorithm for Learning Policies
       4.4.4 Solving Programs Approximately by Linear Programming
       4.4.5 Numerical Results
   4.5 Time Inconsistency and Bounded Rationality Limitations
       4.5.1 Time-Consistent Decision Processes
       4.5.2 Limitations of Validations Based on Learned Policies
   4.6 Conclusions

5. Inferring Decisions from Predictive Densities
   5.1 Constrained MAP Repair Procedure
       5.1.1 Assumptions
       5.1.2 Particularizations
   5.2 Gaussian Predictive Densities
       5.2.1 Joint Gaussian Model
       5.2.2 Gaussian Process Model
   5.3 Case Study
       5.3.1 Description of the Test Problem
       5.3.2 Discretization of the Random Process
       5.3.3 Shrinking-Horizon Policies on the Test Sample
       5.3.4 Performances of Learned Policies
   5.4 Conclusions

6. Learning Projections on Random Polyhedra
   6.1 Problem Statement
   6.2 Geometry of Euclidean Projections
   6.3 Properties of Optimal Solutions
   6.4 Classifiers for Sets of Active Constraints
       6.4.1 Description of the Algorithms
       6.4.2 Complexity
   6.5 Numerical Experiments
       6.5.1 Description of the Test Problems
       6.5.2 Description of the Experiments
       6.5.3 Discussion of the Results
   6.6 Conclusions

7. Conclusion
   7.1 Summary of Contributions
   7.2 Perspectives
   7.3 Influences on Research in Machine Learning

A. Elements of Variational Analysis
   A.1 Minimization
   A.2 Sequences
   A.3 Semicontinuity
   A.4 Attainment of a Minimum
   A.5 Epigraph
   A.6 Epi-convergence
   A.7 Convergence in Minimization
   A.8 Pointwise, Continuous and Uniform Convergence
   A.9 Parametric Optimization
   A.10 Convexity
   A.11 Lipschitz Continuity

B. Basic Probability Theory
   B.1 The Probability Space
   B.2 Random Variables
   B.3 Random Sets
   B.4 Random Functions
   B.5 Expectation
   B.6 Distributions

C. Elements of Functional Analysis for Kernel Methods
   C.1 Hilbert Spaces
   C.2 Linear Mappings
   C.3 Reproducing Kernel Hilbert Spaces
   C.4 Positive Definite Kernels

D. Structural Results for Two-Stage Stochastic Programming
   D.1 Problem Statement and Assumptions
   D.2 Structural Properties

Bibliography
Index

Chapter 1
Introduction

Multistage stochastic programming has attracted a lot of interest over the last years as a promising framework for formulating sequential decision making problems under uncertainty. Several potential applications of the framework are often cited:

- Capacity planning: finding the location and size of units or equipment, such as power plants or telecommunication relays.
- Production planning: selecting components to produce, allocating components to machines, managing stocks.
- Transportation and logistics: sending trucks and delivering goods.
- Financial management: balancing a portfolio of assets and liabilities according to market conditions and subject to regulatory constraints.

In these applications, uncertainty may refer to the evolution of the demand for goods or services, temperature and rainfall patterns affecting consumption or production, interest rates affecting the burden of debt, and so on. Under growing environmental stress, resource limitations, and the concentration of populations in cities, many believe that these applications can only gain higher societal impact in the future, and that even better quantitative methods for tackling them are needed, especially methods able to take into account a large number of constraints.

In general, problems where a flexible plan of successive decisions has to be implemented, under uncertainties described by a probabilistic model, can be formulated as a multistage stochastic program (Chapter 2). However, scalable numerical solution algorithms are not always available, so that restrictions to certain classes of programs and then further approximations are needed.

Interestingly, the approximations affect primarily the representation of the un
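To make the kind of formulation meant here concrete, the sketch below solves a classic two-stage newsvendor problem as a linear program: the order quantity is the here-and-now decision, and per-scenario sales are the recourse decisions once demand is revealed. This is a standard textbook example, not taken from the thesis, and all numbers (cost, price, demand scenarios, probabilities) are made up for illustration.

```python
# Two-stage stochastic LP (illustrative): order x units at unit cost c before
# demand is known, then sell y_s = min(x, d_s) units at price p under each
# demand scenario d_s of probability q_s.  Maximize expected profit.
import numpy as np
from scipy.optimize import linprog

c, p = 1.0, 3.0                        # unit cost, unit sale price (assumed)
demands = np.array([4.0, 8.0, 12.0])   # demand scenarios d_s (assumed)
probs = np.array([0.3, 0.4, 0.3])      # scenario probabilities q_s (assumed)
S = len(demands)

# Variables: [x, y_1, ..., y_S].  Maximize -c*x + sum_s q_s*p*y_s;
# linprog minimizes, so we negate the objective.
cost = np.concatenate(([c], -probs * p))

# Recourse feasibility: y_s <= x, written as y_s - x <= 0.
A_ub = np.zeros((S, S + 1))
A_ub[:, 0] = -1.0
A_ub[np.arange(S), np.arange(S) + 1] = 1.0
b_ub = np.zeros(S)

# Bounds: x >= 0, and 0 <= y_s <= d_s (cannot sell more than demand).
bounds = [(0, None)] + [(0, d) for d in demands]

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x_star = res.x[0]            # here-and-now order quantity
expected_profit = -res.fun   # optimal expected profit
```

The same pattern extends to more stages by adding one block of recourse variables and linking constraints per node of a scenario tree, which is exactly where the size of the tree starts to dominate the computation.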

