Data-driven Evolutionary Bayesian Optimization

SAMSI'2021, March 12, 2021
Data-driven Evolutionary Bayesian Optimization
Professor Yaochu Jin, IEEE Fellow
Department of Computer Science, University of Surrey, Guildford, UK
yaochu.jin@surrey.ac.uk

University of Surrey
Research Excellence: 5G Innovation Center, Surrey Space Centre, Artificial Intelligence
Train to/from London: Guildford – Waterloo, duration 37 minutes
From Heathrow Airport: 45 minutes by taxi; from Gatwick Airport: 35 minutes by taxi

Research Interests
Morphogenetic self-organization of swarm robots

Outline
- Introduction to data-driven evolutionary optimization
  – Single- and multi-objective optimization
  – Main approaches to evolutionary multi-objective optimization
  – Data-driven optimization
- Data-driven surrogate-assisted evolutionary optimization
  – Basic surrogate management strategies
  – Evolutionary Bayesian optimization (or Bayesian evolutionary optimization?)
  – Remaining challenges
- Recent advances in evolutionary Bayesian optimization
  – Use of heterogeneous ensembles
  – Use of dropout deep neural networks
  – Knowledge transfer for Bayesian optimization
- Summary and future challenges

Optimization and Problem Formulation
- Optimization is a mathematical discipline concerned with finding the minima and maxima of functions (objectives), subject to constraints
- Objective function
- Decision variables
- Constraints

Single- and Multi-Objective Optimization
- Single-objective optimization (SOO) versus multi-objective optimization (MOO)
- One single optimal solution can be found for SOO in most cases, whereas a finite or infinite number of equally good solutions exist for MOO
- To choose a final solution, user preference is necessary

Mathematical Description of MOO

$$
\begin{aligned}
\text{minimize} \quad & f_m(X), && m = 1, 2, \dots, M;\\
\text{s.t.} \quad & g_j(X) \ge 0, && j = 1, 2, \dots, J;\\
& h_k(X) = 0, && k = 1, 2, \dots, K;\\
& x_i^{L} \le x_i \le x_i^{U}, && i = 1, 2, \dots, n.
\end{aligned}
$$

[Figure: decision space (x1, x2, x3) mapped to the objective space (f1, f2)]

Dominance
For minimisation problems, solution X(1) dominates X(2) if
– X(1) is no worse than X(2) in all objectives: for all $m = 1, 2, \dots, M$, $f_m(X^{(1)}) \le f_m(X^{(2)})$, and
– X(1) is strictly better than X(2) in at least one objective: there exists $m' \in \{1, 2, \dots, M\}$ with $f_{m'}(X^{(1)}) < f_{m'}(X^{(2)})$.
[Figure: minimise f1 and f2; A dominates C and D; B is not dominated by A]
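As a concrete illustration, here is a minimal dominance test implementing the two conditions above for minimisation; the function name `dominates` is my own, not from the slides.

```python
import numpy as np

def dominates(f1, f2):
    """Return True if objective vector f1 Pareto-dominates f2 (minimisation).

    f1 dominates f2 iff f1 is no worse in every objective and
    strictly better in at least one objective.
    """
    f1, f2 = np.asarray(f1), np.asarray(f2)
    return bool(np.all(f1 <= f2) and np.any(f1 < f2))

# Example: A = (1, 2) dominates C = (2, 3), but A does not dominate B = (3, 1).
print(dominates([1, 2], [2, 3]))  # True
print(dominates([1, 2], [3, 1]))  # False
```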

Pareto-Optimal Set and Pareto Front
The set of all Pareto-optimal solutions is called the Pareto set. The image of all Pareto-optimal solutions in the objective space is termed the Pareto front.
[Figure: Pareto set in the decision space (x1, x2, x3) and the corresponding Pareto front in the objective space (f1, f2)]

Shape of Pareto Fronts
- Regular Pareto fronts: concave, non-convex, discontinuous
- Irregular Pareto fronts: degenerate, inverted
[Figure: example Pareto fronts in the (f1, f2, f3) objective space]

Evolutionary Algorithms for Optimization
Evolutionary algorithms and other meta-heuristic search methods are a class of population-based, guided stochastic search heuristics inspired by biological evolution and the swarm behaviour of social animals.
[Figure: the evolutionary cycle of a population]
Y. Jin and B. Sendhoff. A systems approach to evolutionary multi-objective structural optimization and beyond. IEEE Computational Intelligence Magazine, 4(3):62-76, 2009

Evolutionary Algorithms for Optimization
- No need for analytical objective functions
- Strong global search capability
- Less sensitive to uncertainties
- Well suited for multi-criterion optimisation and mixed-integer optimisation
- Easy for parallel implementation / distributed computing
- Computationally intensive

Challenges in Optimization
- Problem formulation
- Large numbers of decision variables, multiple/many objectives, many constraints
- Optimization in the presence of uncertainties
  – Robust optimization
  – Dynamic optimization
  – Robust optimization over time
- Computational complexity
  – No analytic objective functions available, or data only
  – Computationally intensive
  – Experimentally costly
PlatEMO, a comprehensive and handy software platform for evolutionary optimization teaching and research: https://github.com/BIMK/PlatEMO
Y. Tian, R. Cheng, X. Zhang, and Y. Jin. PlatEMO: A MATLAB platform for evolutionary multi-objective optimization. IEEE Computational Intelligence Magazine, 12(4): 73-87, 2017 (2019 IEEE CIM Outstanding Paper Award)

From Multi- to Many-Objective Optimization
- Basic approaches to multi-objective optimization
  – Pareto dominance based approaches
  – Decomposition using weight or reference vectors (cf. a scalarizing function)
  – Performance indicator based approaches
[Figure: a) Pareto dominance based, b) decomposition, and c) performance indicator based approaches]
- New measures need to be taken if the number of objectives is larger than three (so-called many-objective optimization)
  – Introduction of a secondary criterion in addition to dominance comparison
  – Introduction of reference weights / vectors / points (uninformed preferences)
  – Introduction of informed preferences
B. Li, J. Li, K. Tang, and X. Yao. Many-objective evolutionary algorithms: A survey. ACM Computing Surveys, 48:13-35, 2015
Y. Hua, Q. Liu, K. Hao, and Y. Jin. A survey of evolutionary algorithms for multi-objective optimization problems with irregular Pareto fronts. IEEE/CAA Journal of Automatica Sinica, 8(2): 303-318, 2021

Challenges in Quality Evaluation
No analytic fitness function is available:
– Very time-consuming numerical simulations
– Expensive experiments
– Historical production data only

Evolutionary Turbine Design Optimization

Data-Driven Evolutionary Optimization
Active Data Collection
Y. Jin, H. Wang, T. Chugh, D. Guo, K. Miettinen. Data-driven evolutionary optimization: An overview and case studies. IEEE Transactions on Evolutionary Computation, 23(3): 442-458, 2019

Data-Driven Evolutionary Optimization
Y. Jin (editor). Knowledge Incorporation in Evolutionary Computation. Springer, Berlin Heidelberg, 2005

Challenges in Data
- Data resources
  – Directly collected data with the help of numerical simulations or physical experiments
  – Indirectly collected data in daily life or production
  – Availability of new data during optimization (online/offline)
- Big data
  – Huge amount
  – Heterogeneity
  – Noisy
  – Distributed and time-varying
- Small data
  – Very sparse data (data paucity)
  – Little or no new data can be generated during optimization
  – Noisy

Challenges in Machine Learning
- Data clustering, dimension reduction, and sensitivity analysis
- Active learning
  – Infill criteria
  – Model management
- Semi-supervised learning
  – A large amount of unlabelled data (unevaluated candidate solutions)
  – A small amount of labelled data (evaluated candidate solutions)
- Transfer learning
  – Knowledge transfer between similar optimization problems
  – Knowledge transfer between cheap and expensive problems
- Dynamic and incremental learning
- Ensemble learning

Surrogate-Assisted Evolutionary Optimization
Use a meta-model (surrogate) to replace the expensive fitness evaluations, e.g., CFD simulations:
- Choose a surrogate
- Collect data
- Train the surrogate
- Replace the CFD simulation with the surrogate (see the sketch below)
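A minimal sketch of this loop, assuming scikit-learn; the toy `expensive_eval` function and the choice of a neural-network surrogate are illustrative only, not the setup used in the talk.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor  # one possible surrogate; RBF nets, GPs, etc. also apply

# Hypothetical expensive objective standing in for, e.g., a CFD simulation.
def expensive_eval(x):
    return np.sum((x - 0.5) ** 2, axis=-1)

rng = np.random.default_rng(0)
dim = 10

# 1. Collect an initial data set with the expensive evaluator.
X = rng.random((50, dim))
y = expensive_eval(X)

# 2. Train the surrogate on the collected data.
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(X, y)

# 3. Use the cheap surrogate instead of the expensive simulation
#    when ranking candidate solutions inside the evolutionary loop.
candidates = rng.random((1000, dim))
predicted_fitness = surrogate.predict(candidates)
best = candidates[np.argmin(predicted_fitness)]
print("Predicted best candidate:", best)
```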

Which Surrogate Model to Use?
- Deterministic models
  – Polynomials
  – Neural networks
  – Radial-basis-function networks
  – Support vector machines
  – Decision trees
- Stochastic models
  – Gaussian processes (Kriging)
- Multi-surrogates
  – Ensemble surrogates
  – Multi-surrogates (global + local, hierarchical surrogates)
Y. Jin. A comprehensive survey of fitness approximation in evolutionary computation. Soft Computing, 9(1): 3-12, 2005
Y. Jin. Surrogate-assisted evolutionary computation: Recent advances and future challenges. Swarm and Evolutionary Computation, 1(2): 61-70, 2011
R. Allmendinger, M. T. M. Emmerich, J. Hakanen, Y. Jin, and E. Rigoni. Surrogate-assisted multicriteria optimization: Complexities, prospective solutions, and business case. Journal of Multi-Criteria Decision Analysis, 14(1/2): 5-25, 2017

What to Approximate?
- Single-objective optimization
  – Objective function
  – Rank
  – Constraints (classification)
- Multi-objective optimization
  – Objective functions
  – Dominance relationship (classification)
  – Performance indicator, e.g., hypervolume
  – Scalarizing functions: weighted sum, augmented Tchebycheff aggregation, Euclidean distance to the nearest vector of the Pareto front (two of these are sketched below)
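For illustration, here is a minimal sketch of two of the scalarizations listed above, the weighted sum and the augmented Tchebycheff aggregation; the weights, ideal point and the value of `rho` are illustrative, not values used in the talk.

```python
import numpy as np

def weighted_sum(f, w):
    """Weighted-sum scalarization of an objective vector f with weights w."""
    return float(np.dot(w, f))

def augmented_tchebycheff(f, w, z_star, rho=1e-3):
    """Augmented Tchebycheff scalarization with ideal point z_star:
    g(x) = max_i w_i |f_i - z_i*| + rho * sum_i w_i |f_i - z_i*|.
    """
    diff = np.asarray(w) * np.abs(np.asarray(f) - np.asarray(z_star))
    return float(np.max(diff) + rho * np.sum(diff))

# Example with two objectives, weights (0.5, 0.5) and the ideal point at the origin.
print(weighted_sum([1.0, 2.0], [0.5, 0.5]))                   # 1.5
print(augmented_tchebycheff([1.0, 2.0], [0.5, 0.5], [0, 0]))  # 1.0015
```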

Model Management
- Using surrogates only risks convergence to a false optimum
- Accurate approximation is not the only target in surrogate modelling

Online Model Management
- Surrogate management strategies
  – Population-based
  – Generation-based
  – Individual-based
  – Trust-region methods
  – Acquisition functions (infill criteria)
- Only very limited new data are affordable, so which solutions should be evaluated using expensive simulations/experiments?
  – Potentially good solutions (as estimated by the surrogate)
  – Solutions whose estimated quality has a large amount of uncertainty: the related decision space is less explored, and such evaluations are most effective for model improvement
- How to measure uncertainty?
  – Distance to training data
  – Gaussian processes

Bayesian Optimization – Main Steps
1. Choose some prior measure over the space of possible objectives [offline training data]
2. Combine the prior and the likelihood to get a posterior measure over the objective given some observations [predict the objective values of new candidate solutions]
3. Use the posterior to determine where to take the next evaluation according to an acquisition function (a second optimization problem has to be solved!) [model management / evolution control]
4. Augment the data [evaluation of the selected solution(s) using the expensive objective function]
5. Iterate between 2 and 4 until the evaluation budget is exhausted
[Figure: example iterations 3 and 4 of the loop]
B. Bischl, J. Richter, J. Bossek, D. Horn, J. Thomas, M. Lang. mlrMBO: A modular framework for model-based optimization of expensive black-box functions. https://arxiv.org/abs/1703.03373
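The five steps can be condensed into a short loop. This is a minimal sketch assuming a scikit-learn Gaussian process, the LCB criterion from the next slide, and a random inner search (the slides note that an evolutionary algorithm may be used instead); the one-dimensional `expensive_objective` is purely hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Hypothetical expensive 1-D objective used only for illustration.
def expensive_objective(x):
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(1)
X = rng.uniform(0, 3, size=(5, 1))            # step 1: prior + initial (offline) data
y = expensive_objective(X).ravel()

for it in range(10):                          # step 5: iterate until the budget is spent
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)  # step 2: posterior
    cand = rng.uniform(0, 3, size=(2000, 1))  # inner search; an EA could be used here instead
    mu, sigma = gp.predict(cand, return_std=True)
    lcb = mu - 2.0 * sigma                    # step 3: acquisition function (LCB, omega = 2)
    x_next = cand[np.argmin(lcb)]
    y_next = expensive_objective(x_next)      # step 4: expensive evaluation, augment the data
    X = np.vstack([X, x_next[None, :]])
    y = np.append(y, y_next)

print("Best found:", X[np.argmin(y)], y.min())
```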

Acquisition Functions
- Popular infill criteria
  – Lower confidence bound (LCB): $\hat{f}(x) - \omega \hat{\sigma}(x)$, with $\omega \ge 0$
  – Probability of improvement (PI)
  – Expected improvement (EI)
  – Thompson sampling
  – Entropy search
- Balance between exploitation and exploration
M. Emmerich, K. C. Giannakoglou, B. Naujoks. Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels. IEEE Transactions on Evolutionary Computation, 10(4): 421-439, 2006
B. Shahriari, K. Swersky, Z. Wang, R. P. Adams and N. de Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1): 148-175, 2016
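For reference, a minimal sketch of the three closed-form criteria above for a minimization problem, given the surrogate's predicted mean `mu` and standard deviation `sigma`; the function names are my own, not taken from the cited papers.

```python
import numpy as np
from scipy.stats import norm

def lcb(mu, sigma, omega=2.0):
    """Lower confidence bound (minimization): smaller values are more promising."""
    return mu - omega * sigma

def probability_of_improvement(mu, sigma, f_best):
    """Probability that a candidate improves on the best observed value f_best."""
    return norm.cdf((f_best - mu) / np.maximum(sigma, 1e-12))

def expected_improvement(mu, sigma, f_best):
    """Expected improvement over f_best (minimization)."""
    sigma = np.maximum(sigma, 1e-12)
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# Example: two candidates with equal predicted mean but different uncertainty;
# the more uncertain one receives the larger EI (exploration bonus).
print(expected_improvement(np.array([1.0, 1.0]), np.array([0.1, 0.5]), f_best=1.0))
```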

Evolutionary BO or Bayesian EO?
[Flow charts of the two frameworks]
- Evolutionary Bayesian optimization: sample data D(0) from the objective function F(X); build a Gaussian process model of F(X, t); maximize the acquisition function G(X, t) with genetic operators applied to a population P(t); evaluate the selected solution with the expensive objective, F(X*(t)), and augment the data, D(t+1) = D(t) ∪ D(X*(t))
- Bayesian evolutionary optimization: initialize a population P(0) and sample data D(0) from the objective function; apply genetic operators to the parent population P(t) to create offspring O(t); build a Gaussian process model of F(X, t); find the individuals that maximize the acquisition function G(X, t); evaluate them with the expensive objective, F(X*(t)), and augment the data, D(t+1) = D(t) ∪ D(X*(t))
S. Qin, C. Sun, Y. Jin and G. Zhang. Bayesian approaches to surrogate-assisted evolutionary multi-objective optimization: A comparative study. IEEE Symposium Series on Computational Intelligence, Xiamen, China, December 2019

Evolutionary BO versus Bayesian EO
Evolutionary Bayesian optimization optimizes the acquisition function, while Bayesian evolutionary optimization optimizes the approximated objective function.
[Figure: real objective function, approximated objective function and acquisition function at the 9th and 15th iterations]

Bayesian Optimization - Challenges
- Computational complexity of the GP
  – Complexity is cubic, O(N^3), in the number of training data
  – No incremental learning capability
  – Optimization of the hyper-parameters incurs additional complexity
- Solving the acquisition function: the acquisition function may be even more challenging to optimize than the original problem!
  – Only an approximate optimal solution can be achieved using traditional optimization methods
  – Evolutionary algorithms can be used instead
- Not scalable to multi- and many-objective optimization
  – A large number of GPs is required if an MOP is decomposed into multiple SOPs
  – The computational cost becomes prohibitive for many-objective optimization if a performance indicator is used

Acquisition Functions for MOPs
How to extend acquisition functions to MOPs?
- Pareto approaches
  – Optimize a multi-objective optimization problem (M acquisition functions)
  – Directly select the non-dominated solutions for sampling
  – Cluster the non-dominated solutions and then select a subset
- Decomposition approaches: all ideas for decomposing MOPs can be used
- Use a scalar performance indicator
B. Shahriari, K. Swersky, Z. Wang, R. P. Adams and N. de Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1): 148-175, 2016

Recent Advances: Heterogeneous Ensembles
- Motivation
  – Ensemble surrogates are able to estimate the uncertainty
  – Capable of incremental learning and more scalable to the increase in training data
- Challenges
  – The base learners of the ensemble must be both accurate and diverse
  – How to ensure diversity without reducing accuracy?
D. Guo, Y. Jin, J. Ding, and T. Chai. Heterogeneous ensemble based infill criterion for evolutionary multi-objective optimization of expensive problems. IEEE Transactions on Cybernetics, 49(3): 1012-1025, 2019
A minimal sketch of the general idea follows.
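This sketch illustrates ensemble-based uncertainty estimation in general, not the HeE-MOEA method of the cited paper: the predictions of heterogeneous base learners are averaged to estimate the fitness, and their spread serves as the uncertainty. All data and model choices here are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.random((60, 5))
y = np.sum(X ** 2, axis=1)               # cheap stand-in for expensively evaluated data

# Heterogeneous base learners: different model types promote diversity.
base_learners = [
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000, random_state=0),
    SVR(kernel="rbf"),
    DecisionTreeRegressor(max_depth=5, random_state=0),
]
for model in base_learners:
    model.fit(X, y)

# Ensemble prediction: the mean is the fitness estimate, the spread is the
# uncertainty, which can then be plugged into an acquisition function such as LCB.
X_new = rng.random((3, 5))
preds = np.stack([m.predict(X_new) for m in base_learners])
mu, sigma = preds.mean(axis=0), preds.std(axis=0)
print("estimate:", mu, "uncertainty:", sigma)
```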

Accurate and Diverse Ensembles
- Manipulation of data
  – Boosting
  – Bagging
  – AdaBoost
- Use of different learning algorithms
- Explicitly encouraging diversity in outputs
- Explicit diversity in model structure
  – Different structures
  – Different inputs
  – Different models
S. Gu, R. Cheng and Y. Jin. Multi-objective ensemble generation. WIREs Data Mining and Knowledge Discovery, 5(5): 234-245, 2015

Accurate and Diverse Ensembles
- Filter-based selection performed before optimization: minimizing the correlation between features and maximizing the correlation between features and the outputs using a PSO-based search
- PCA-based feature extraction performed in each iteration
S. Gu and Y. Jin. Multi-train: A semi-supervised heterogeneous ensemble classifier. Neurocomputing, 249: 202-211, 2017

Use of Ensembles to Replace Gaussian Processes
[Table: comparative results in terms of IGD]

Use of Ensembles to Replace Gaussian Processes
[Figure: mean training time of the surrogates in HeE-MOEA-LCB (red solid lines with circles) and GP-MOEA-LCB (blue dashed lines) versus the number of decision variables on (a) DTLZ2, (b) DTLZ3, (c) WFG2 and (d) WFG3]

Recent Advances: Dropout Neural Networks
- Dropout neural networks were originally proposed for alleviating over-fitting
- We adapt the dropout network for estimating the uncertainty in predictions
D. Guo, X. Wang, K. Gao, Y. Jin, J. Ding, and T. Chai. Evolutionary optimization of high-dimensional multi- and many-objective expensive problems assisted by a dropout neural network. IEEE Transactions on Systems, Man and Cybernetics: Systems, 2020

Recent Advances: Dropout Neural Networks
- Train a deep NN using dropout both during training and at prediction time
- The elements of d_I (for the input layer) and d_R (for the first hidden layer) are sampled from Bernoulli distributions with parameters p_I and p_R, respectively
- Prediction and uncertainty estimation: repeat the stochastic forward pass several times; the sample mean of the outputs gives the prediction and their spread the uncertainty (a sketch is given below)
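A minimal sketch of this kind of Monte-Carlo-dropout uncertainty estimation, assuming PyTorch; the layer sizes, dropout rates and input data are illustrative and not those used in the cited work.

```python
import torch
import torch.nn as nn

# A small MLP with dropout after the input and after the first hidden layer,
# mirroring the d_I / d_R Bernoulli masks above (the p values are assumptions).
model = nn.Sequential(
    nn.Dropout(p=0.05),          # input-layer dropout, parameter p_I
    nn.Linear(10, 64), nn.ReLU(),
    nn.Dropout(p=0.1),           # first-hidden-layer dropout, parameter p_R
    nn.Linear(64, 1),
)

x = torch.randn(4, 10)           # four candidate solutions (hypothetical data)

# Keep dropout active at prediction time (model.train()) and average over
# T stochastic forward passes: the sample mean is the prediction and the
# sample standard deviation is the uncertainty estimate.
model.train()
T = 100
with torch.no_grad():
    samples = torch.stack([model(x) for _ in range(T)])
mu, sigma = samples.mean(dim=0), samples.std(dim=0)
print(mu.squeeze(), sigma.squeeze())
```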

Recent Advances: Dropout Neural Networks
[Figure: comparative results on (a) DTLZ4, (b) WFG1, (c) WFG5 and (d) WFG9, and computational complexity]

Recent Advances: Crude Oil Distillation

Knowledge Transfer in EBO
- Data are expensive. Can we use data from previously solved problems, or from some "cheap" problems?
- Challenges in knowledge transfer
  – Find appropriate source tasks (same domain, transferable knowledge)
  – Avoid negative transfer
- Transfer learning with domain adaptation
  – Balanced distribution adaptation based knowledge transfer: adapts the importance of the marginal and conditional distributions according to the similarity of the data
S. J. Pan and Q. Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10): 1345-1359, 2010

Knowledge Transfer in EBO
H. Li, Y. Jin and T. Chai. Evolutionary multi-objective Bayesian optimization based on online transfer learning. IEEE Transactions on Cybernetics, 2021

Knowledge Transfer in EBO

Summary and Future Work
- Data-driven evolutionary optimization targets expensive black-box problems
- Acquisition functions provide principled surrogate management techniques in data-driven evolutionary optimization
- Advanced machine learning techniques (ensemble learning, dropout deep neural networks and transfer learning) are very helpful in addressing the limitations of Bayesian optimization
- Future work
  – Surrogate-assisted high-dimensional optimization, e.g., from 100 to 1000 decision variables, and surrogate-assisted multi-scenario optimization
  – Data-driven constrained combinatorial optimization
  – Offline and distributed data-driven optimization
  – Surrogate-assisted preference-driven optimization

Acknowledgements
Dr Handing Wang, Dr Chaoli Sun, Dr Tinkle Chugh, Dan Guo, Cuie Yang, Dr Dudy Lim, Prof Yew Soon Ong, Dr Markus Olhofer, Dr Bernhard Sendhoff, Dr Jan Jansen, Prof Kaisa Miettinen
Finnish Funding Agency for Innovation

Editor-in-Chief: Yaochu Jin
Impact factor (2019): 3.791
- Complexity
  – Complex evolutionary and adaptive systems
  – Emergent properties and behaviour in complex systems
  – Self-organizing collective systems
  – Biological and social inspirations in problem solving
  – Systems science and engineering
- Intelligent Data Analytics
  – Data mining and knowledge discovery
  – Machine learning
  – Pattern recognition
  – Big data analytics and data science
  – Data-driven problem solving
- Computational Simulation
  – Knowledge-based systems
  – Agent-based systems
  – Uncertainty modelling
  – Decision support systems
  – Brain-like computing
  – Ubiquitous computing
  – Computational visualization and interaction

Many thanks for your attention!Finnish Funding Agency for Innovation

Bayesian Evolutionary Optimization
S. Qin, C. Sun, Y. Jin and G. Zhang. Bayesian approaches to surrogate-assisted evolutionary multi-objective optimization: A comparative study. IEEE Symposium Series on Computational Intelligence, Xiamen, China, December 2019
