Nonlinear Network Structures For Optimal Control


Nonlinear Network Structures for Optimal Control
Cheng Tao & Frank L. Lewis
Advanced Controls & Sensors Group
Automation & Robotics Research Institute (ARRI)
The University of Texas at Arlington
Slide 1

Neural Network Solution for Fixed-Final Time Optimal Control of Nonlinear Systems
Cheng Tao, Frank L. Lewis
Automation & Robotics Research Institute (ARRI)
The University of Texas at Arlington
ECC 07 Kos

Neural Network Robot Controller
Universal Approximation Property
Feedback linearization
[Block diagram: nonlinear inner loop with f(x), feedforward loop, PD tracking loop with gains [Λ I] and Kv, robust control term v(t), robot system; signals q_d, e, r, τ, q]
Problem: the controller is nonlinear in the NN weights, so standard proof techniques do not work.
- Easy to implement with a few more lines of code
- Learning feature allows for on-line updates to NN memory as dynamics change
- Handles unmodelled dynamics, disturbances, and actuator problems such as friction
- NN universal basis property means no regression matrix is needed
- Nonlinear controller allows faster and more precise motion
Slide 3

Sponsored by
NSF - Paul Werbos
ARO - Randy Zachery
4 US Patents
Slide 4

Problem: the controller is nonlinear in the NN weights, so standard proof techniques do not work.
New book by Jay Farrell and Marios Polycarpou: Adaptive Approximation Based Control
Slide 5

Optimality in Biological Systems
Cell Homeostasis
The individual cell is a complex feedback control system. It pumps ions across the cell membrane to maintain homeostasis, and has only limited energy to do so.
Permeability control of the cell membrane
Cellular /index.html
Slide 6

ARRI Research Roadmap in Neural Networks

3. Approximate Dynamic Programming (2006)
- Nearly Optimal Control
- Based on recursive equation for the optimal value
- Usually known system dynamics (except Q learning)
- The goal: unknown dynamics
- On-line tuning
- Optimal Adaptive Control
Extend adaptive control to yield OPTIMAL controllers. No canonical form needed.

2. Neural Network Solution of Optimal Design Equations (2002-2006)
- Nearly Optimal Control
- Based on HJ Optimal Design Equations
- Known system dynamics
- Preliminary off-line tuning
Nearly optimal solution of controls design equations. No canonical form needed.

1. Neural Networks for Feedback Control (1995-2002)
- Extended adaptive control to NLIP systems
- Based on FB Control Approach
- Unknown system dynamics
- On-line tuning
- No regression matrix
NN: FB lin., sing. pert., backstepping, force control, dynamic inversion, etc.
Slide 7

Objective and Significance
- Provide a tool to solve finite-horizon continuous-time optimal control problems for nonlinear systems.
- Continuous-time finite-horizon optimal control problems appear in applications that use model predictive control (receding horizon control).
Slide 8

Outline:
1. Fixed-Final Time Optimal Control of Nonlinear Systems Using Neural Network HJB Approach
2. Neural Network Solution for Finite-Final Time H-Infinity State Feedback Control
3. Neural Network Solution for Fixed-Final Time Constrained Optimal Control
This research was supported by NSF grant ECS-0140490 and ARO grant DAAD 19-02-1-0366.
Slide 9

Review of Related Work and Motivation
Approximate HJB solutions
- Munos et al. [65] (gradient descent approaches)
- Kim, Lewis and Dawson [47] (NNs)
- Huang and Lin [44] (Taylor series expansion)
Constrained-input optimization
- Sussmann, Sontag and Yang [84]
- Bernstein [15]
- Dolphus [33]
- Abu-Khalaf, M. [1] (infinite horizon)
NN applications to optimal control
- Miller [63] (NNs for control)
- Parisini and Zoppoli [70] (infinite horizon)
Unconstrained policy iteration with finite-time horizon
- Beard [11]
Slide 10

Background on Fixed-Final-Time HJB Optimal Control
Nonlinear dynamical system
$$\dot{x} = f(x) + g(x)u(t) \tag{1}$$
where $x \in \Re^n$, $f(x) \in \Re^n$, $g(x) \in \Re^{n \times m}$ and the input $u(t) \in \Re^m$.
It is desired to find the control $u$ that minimizes a generalized nonquadratic functional
$$V(x(t_0), t_0) = \phi(x(t_f), t_f) + \int_{t_0}^{t_f} \left[ Q(x) + W(u) \right] dt \tag{2}$$
with $Q(x)$, $W(u)$ positive definite on $\Omega$.
Slide 11

Background on Fixed-Final-Time HJB Optimal Control
An infinitesimal equivalent to (2) is
$$-\frac{\partial V(x,t)}{\partial t} = L + \left( \frac{\partial V(x,t)}{\partial x} \right)^T \left( f(x) + g(x)u(t) \right) \tag{3}$$
where $L = Q(x) + W(u)$. This is a time-varying partial differential equation with $V(x,t)$ the cost function for any given $u(t)$, and it is solved backwards in time from $t = t_f$.
By setting $t_0 = t_f$ in (2), its boundary condition is
$$V(x(t_f), t_f) = \phi(x(t_f), t_f) \tag{4}$$
Slide 12

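Background on Fixed-Final-Time HJB Optimal Control
According to Bellman's optimality principle, the optimal cost is given by
$$-\frac{\partial V^*(x,t)}{\partial t} = \min_{u(t)} \left( L + \left( \frac{\partial V^*(x,t)}{\partial x} \right)^T \left( f(x) + g(x)u(x) \right) \right) \tag{5}$$
which yields the optimal control
$$u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x,t)}{\partial x} \tag{6}$$
where $V^*(x,t)$ is the optimal value function.
Substituting (6) into (5) yields the well-known time-varying Hamilton-Jacobi-Bellman (HJB) equation
$$\frac{\partial V^*(x,t)}{\partial t} + \left( \frac{\partial V^*(x,t)}{\partial x} \right)^T f(x) + Q(x) - \frac{1}{4} \left( \frac{\partial V^*(x,t)}{\partial x} \right)^T g(x) R^{-1} g^T(x) \frac{\partial V^*(x,t)}{\partial x} = 0 \tag{7}$$
Slide 13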
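As a worked step, the minimization in (5) can be carried out explicitly. This sketch assumes the quadratic control penalty $W(u) = u^T R u$ with $R$ symmetric positive definite; that choice is suggested by the $R^{-1}$ in (6) but is not stated on the slide. $H$ denotes the bracketed expression in (5).

```latex
\begin{aligned}
H(x,u,t) &= Q(x) + u^T R u
  + \left(\frac{\partial V^*}{\partial x}\right)^T \bigl(f(x) + g(x)u\bigr), \\
\frac{\partial H}{\partial u} &= 2Ru + g^T(x)\frac{\partial V^*}{\partial x} = 0
  \quad\Longrightarrow\quad
  u^*(x) = -\tfrac{1}{2} R^{-1} g^T(x)\frac{\partial V^*}{\partial x}, \\
H(x,u^*,t) &= Q(x) + \left(\frac{\partial V^*}{\partial x}\right)^T f(x)
  - \tfrac{1}{4}\left(\frac{\partial V^*}{\partial x}\right)^T
    g(x) R^{-1} g^T(x) \frac{\partial V^*}{\partial x}.
\end{aligned}
```

Setting $-\partial V^*/\partial t = H(x,u^*,t)$ and moving all terms to one side gives the form of the HJB equation shown in (7).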

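Background on Fixed-Final-Time HJB Optimal Control
Then (5) becomes
$$HJB(V^*(x,t)) \equiv \frac{\partial V^*(x,t)}{\partial t} + \left( \frac{\partial V^*(x,t)}{\partial x} \right)^T f(x) + Q(x) - \frac{1}{4} \left( \frac{\partial V^*(x,t)}{\partial x} \right)^T g(x) R^{-1} g^T(x) \frac{\partial V^*(x,t)}{\partial x} = 0 \tag{8}$$
If this HJB equation can be solved for the value function $V^*(x,t)$, then the optimal control is
$$u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x,t)}{\partial x}$$
Slide 14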

Nonlinear Fixed-Final-Time HJB Solution by NN Least-Squares Approximation
NN Approximation of the Cost Function $V(x,t)$
In Sandberg [78], it is shown that NNs with time-varying weights can be used to uniformly approximate continuous time-varying functions.
$V(x,t)$ is approximated for $t \in [t_0, t_f]$ on a compact set $\Omega \subset \Re^n$ by
$$V_L(x,t) = \sum_{j=1}^{L} w_j(t) \sigma_j(x) = w_L^T(t) \sigma_L(x) \tag{9}$$
The NN weights are $w_j(t)$ and $L$ is the number of hidden-layer neurons.
$\sigma_L(x) = [\sigma_1(x)\ \sigma_2(x)\ \ldots\ \sigma_L(x)]^T$ is the vector of activation functions.
$w_L(t) = [w_1(t)\ w_2(t)\ \ldots\ w_L(t)]^T$ is the vector of NN weights.
Slide 15
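A minimal sketch of how (9) is evaluated may help. The basis `sigma` and the weight trajectory `weights` below are hypothetical choices made only for illustration; the slides do not specify them at this point.

```python
def sigma(x):
    """Hypothetical polynomial basis for a 2-state system: [x1^2, x1*x2, x2^2]."""
    x1, x2 = x
    return [x1 * x1, x1 * x2, x2 * x2]


def weights(t, t_f=1.0):
    """Hypothetical time-varying weights w_L(t), chosen only for illustration."""
    s = t_f - t  # time to go
    return [1.0 + s, 0.5 * s, 2.0]


def value_approx(x, t):
    """V_L(x, t) = w_L(t)^T sigma_L(x), as in (9)."""
    return sum(w * s for w, s in zip(weights(t), sigma(x)))


print(value_approx((1.0, 2.0), 0.0))
print(value_approx((1.0, 2.0), 1.0))
```

The state dependence lives entirely in `sigma` and the time dependence entirely in `weights`, which is what later allows the HJB partial differential equation to be reduced to an ordinary differential equation in the weights.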

Nonlinear Fixed-Final-Time HJB Solution by NN Least-Squares Approximation
Note:
The set $\sigma_j(x)$ is selected to be independent. Then, without loss of generality, the functions can be assumed to be orthonormal, i.e. select equivalent basis functions to $\sigma_j(x)$ that are also orthonormal. The orthonormality of the set $\{\sigma_j(x)\}_1^\infty$ on $\Omega$ implies that if a function $\psi(x,t) \in L_2(\Omega)$ then
$$\psi(x,t) = \sum_{j=1}^{\infty} \langle \psi(x,t), \sigma_j(x) \rangle_\Omega \, \sigma_j(x)$$
where
$$\langle f, g \rangle_\Omega = \int_\Omega f g \, dx$$
is the inner product.
Slide 16
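The expansion above can be checked numerically on a small case. The sketch below assumes $\Omega = [-1, 1]$ and uses the first three orthonormal Legendre polynomials as the basis; both choices are illustrative and not taken from the slides. The inner product is approximated with a composite Simpson rule.

```python
import math


def inner(f, g, a=-1.0, b=1.0, n=1000):
    """<f, g>_Omega: integral of f*g over [a, b] by composite Simpson (n even)."""
    h = (b - a) / n
    total = f(a) * g(a) + f(b) * g(b)
    for k in range(1, n):
        x = a + k * h
        total += (4 if k % 2 else 2) * f(x) * g(x)
    return total * h / 3.0


# Orthonormal Legendre polynomials on [-1, 1] (assumed basis, degrees 0..2).
BASIS = [
    lambda x: math.sqrt(0.5),
    lambda x: math.sqrt(1.5) * x,
    lambda x: math.sqrt(5.0 / 8.0) * (3.0 * x * x - 1.0),
]


def project(psi):
    """Coefficients <psi, sigma_j>_Omega for each basis function."""
    return [inner(psi, s) for s in BASIS]


def reconstruct(coeffs, x):
    """Sum of coefficient times basis function, evaluated at x."""
    return sum(c * s(x) for c, s in zip(coeffs, BASIS))


coeffs = project(lambda x: x * x)
print(coeffs, reconstruct(coeffs, 0.3))
```

Because $x^2$ lies in the span of these three functions, the truncated expansion reproduces it up to quadrature error; a function outside the span would leave a residual that shrinks as more basis functions are added.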

Nonlinear Fixed-Final-Time HJB Solution by NN Least-Squares Approximation
Note that
$$\frac{\partial V_L(x,t)}{\partial x} = \left( \frac{\partial \sigma_L(x)}{\partial x} \right)^T w_L(t) \equiv \nabla\sigma_L^T(x) \, w_L(t) \tag{10}$$
where $\nabla\sigma_L(x)$ is the Jacobian $\partial \sigma_L(x) / \partial x$, and that
$$\frac{\partial V_L(x,t)}{\partial t} = \dot{w}_L^T(t) \, \sigma_L(x) \tag{11}$$
Therefore approximating $V(x,t)$ by $V_L(x,t)$ uniformly in $t$ in the HJB equation (8) results in
$$\dot{w}_L^T(t) \sigma_L(x) + w_L^T(t) \nabla\sigma_L(x) f(x) - \frac{1}{4} w_L^T(t) \nabla\sigma_L(x) g(x) R^{-1} g^T(x) \nabla\sigma_L^T(x) w_L(t) + Q(x) = e_L(x,t) \tag{12}$$
Slide 17

Nonlinear Fixed-Final-Time HJB Solution by NN Least-Squares Approximation
or
$$HJB\left( V_L(x,t) = \sum_{j=1}^{L} w_j(t) \sigma_j(x) \right) = e_L(x,t) \tag{13}$$
where $e_L(x,t)$ is a residual equation error. The corresponding optimal control input is
$$u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V_L(x,t)}{\partial x} = -\frac{1}{2} R^{-1} g^T(x) \nabla\sigma_L^T(x) w_L(t) \tag{14}$$
To find the least-squares solution for $w_L(t)$, the method of weighted residuals is used:
$$\left\langle \frac{\partial e_L(x,t)}{\partial \dot{w}_L(t)}, e_L(x,t) \right\rangle_\Omega = 0$$
Slide 18
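The control law (14) needs only the basis Jacobian and the current weights. The sketch below assumes a single input, a scalar weight `r` standing in for $R$, and a hypothetical 2-state basis $[x_1^2, x_1 x_2, x_2^2]$; none of these specifics come from the slides.

```python
def grad_sigma(x):
    """Jacobian of the hypothetical basis [x1^2, x1*x2, x2^2]; row j is d(sigma_j)/dx."""
    x1, x2 = x
    return [[2.0 * x1, 0.0], [x2, x1], [0.0, 2.0 * x2]]


def nn_control(x, w, g, r):
    """u = -1/2 * r^{-1} * g(x)^T * grad_sigma(x)^T * w for one input, scalar r > 0."""
    jac = grad_sigma(x)
    n = len(x)
    # dV/dx = grad_sigma^T w, the NN gradient from (10)
    dv_dx = [sum(jac[j][i] * w[j] for j in range(len(w))) for i in range(n)]
    gx = g(x)
    return -0.5 / r * sum(gx[i] * dv_dx[i] for i in range(n))


# Example: input enters the second state only, V_L = x1^2 + 2*x2^2.
u = nn_control((1.0, 2.0), [1.0, 0.0, 2.0], lambda x: [0.0, 1.0], 1.0)
print(u)
```

With time-varying weights the same function is simply called with `w` evaluated at the current time, which is what makes the resulting feedback law time-varying.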

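Nonlinear Fixed-Final-Time HJB Solution by NN Least-Squares Approximation
$$\dot{w}_L(t) = -\langle \sigma_L(x), \sigma_L(x) \rangle_\Omega^{-1} \, \langle \nabla\sigma_L(x) f(x), \sigma_L \ [\ldots]$$
[...]ptimal Control
When (22) is used, (5) becomes
$$-\frac{\partial V^*(x,t)}{\partial t} = \min_{u(t)} \left( Q(x) + 2 \int_0^{u} \phi^{-T}(v) R \, dv + \left( \frac{\partial V^*(x,t)}{\partial x} \right)^T \left( f(x) + g(x)u(x) \right) \right)$$
Minimizing the Hamiltonian of the optimal control problem with regard to $u$ gives
$$g^T(x) \frac{\partial V^*(x,t)}{\partial x} + 2 R \, \phi^{-1}(u^*) = 0$$
so
$$u^*(x) = -\phi\left( \frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x,t)}{\partial x} \right), \quad u \in U \subset \Re^m \tag{23}$$
Slide 53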
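A scalar sketch of the saturated control (23) and its matching nonquadratic penalty follows. It assumes $\phi(z) = A \tanh(z/A)$ with bound $A$, a common choice for bounded inputs; the definition of $\phi$ (equation (22)) is not present in this transcription, so this is an assumption. The closed form used for the penalty is the integral of $2 r A \, \mathrm{atanh}(v/A)$ from 0 to $u$.

```python
import math


def phi(z, bound):
    """Assumed saturation function: phi(z) = bound * tanh(z / bound)."""
    return bound * math.tanh(z / bound)


def constrained_control(g_dv, r, bound):
    """Scalar form of (23): u* = -phi(0.5 * g^T dV/dx / r), so |u*| <= bound."""
    return -phi(0.5 * g_dv / r, bound)


def control_penalty(u, r, bound):
    """W(u) = 2 * integral_0^u phi^{-1}(v) * r dv in closed form, valid for |u| < bound."""
    a = bound
    return 2.0 * r * a * (u * math.atanh(u / a)
                          + 0.5 * a * math.log(1.0 - (u / a) ** 2))


print(constrained_control(0.2, 1.0, 5.0))   # small gradient: close to the unconstrained law
print(constrained_control(1e6, 1.0, 5.0))   # large gradient: saturates at the bound
print(control_penalty(0.1, 1.0, 5.0))       # close to r*u^2 for small u
```

For small inputs the penalty behaves like the quadratic $r u^2$ and the control matches the unconstrained law (6); as $|u|$ approaches the bound the penalty grows without limit, which is what keeps the minimizing control inside the constraint set.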

Neural Network Solution for Fixed-Final Time Constrained Optimal Control
HJB equation
$$HJB(V^*(x,t)) \equiv \frac{\partial V^*(x,t)}{\partial t} + \left( \frac{\partial V^*(x,t)}{\partial x} \right)^T \left( f(x) - g(x) \, \phi\left( \frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x,t)}{\partial x} \right) \right) + 2 \int_0^{u^*} \phi^{-T}(v) R \, dv + Q(x) = 0 \tag{24}$$
If this HJB equation can be solved for the value function $V^*(x,t)$, then (23) gives the optimal constrained control.
Slide 54

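Neural Network Solution for Fixed-Final Time Constrained Optimal Control
So that
$$\begin{aligned}
\dot{w}_L(t) = {} & -\langle \sigma_L(x), \sigma_L(x) \rangle_\Omega^{-1} \, \langle \nabla\sigma_L(x) f(x), \sigma_L(x) \rangle_\Omega \, w_L(t) \\
& - \langle \sigma_L(x), \sigma_L(x) \rangle_\Omega^{-1} \left\langle 2 \int_0^{u} \phi^{-T}(v) R \, dv, \ \sigma_L(x) \right\rangle_\Omega \\
& + \langle \sigma_L(x), \sigma_L(x) \rangle_\Omega^{-1} \left\langle w_L^T(t) \nabla\sigma_L(x) g(x) \, \phi\left( \frac{1}{2} R^{-1} g^T(x) \nabla\sigma_L^T(x) w_L(t) \right), \ \sigma_L(x) \right\rangle_\Omega \\
& - \langle \sigma_L(x), \sigma_L(x) \rangle_\Omega^{-1} \, \langle Q(x), \sigma_L(x) \rangle_\Omega
\end{aligned} \tag{25}$$
Slide 55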

Neural Network Solution for Fixed-Final Time Constrained Optimal Control
Optimal Algorithm Based on NN Approximation
(12) can be converted to
$$A^T A \, \dot{w}_L(t) + A^T B \, w_L(t) + A^T C - A^T D \, w_L(t) + A^T E = 0 \tag{26}$$
then
$$\dot{w}_L(t) = -(A^T A)^{-1} A^T B \, w_L(t) - (A^T A)^{-1} A^T C + (A^T A)^{-1} A^T D \, w_L(t) - (A^T A)^{-1} A^T E \tag{27}$$
This is a nonlinear ODE that can easily be integrated backwards using the final condition $w_L(t_f)$ to find the least-squares optimal NN weights.
Slide 56
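The backward integration can be sketched on the simplest possible case. Everything specific here is an assumption for illustration: a scalar unconstrained system $\dot{x} = a x + b u$ with $Q = q x^2$, $W = r u^2$, a single basis function $\sigma(x) = x^2$, a uniform mesh on $[-1, 1]$, and a hand-written RK4 integrator. In this case the least-squares weight equation reduces to the scalar Riccati equation, so the long-horizon weight can be compared with the known steady-state solution.

```python
import math


def weight_rate(w, mesh, a, b, q, r):
    """Least-squares w_dot from the HJB residual sampled on mesh points.

    Residual at x: w_dot*x^2 + w*(2x)*(a*x) - 0.25*w^2*(2x)^2*b^2/r + q*x^2.
    """
    sig = [x * x for x in mesh]  # sigma(x_i), the single-column "A" matrix
    rest = [w * 2.0 * x * a * x - 0.25 * w * w * (2.0 * x) ** 2 * b * b / r + q * x * x
            for x in mesh]
    # Normal equations: (A^T A) w_dot = -A^T rest
    return -sum(s * v for s, v in zip(sig, rest)) / sum(s * s for s in sig)


def solve_backward(w_final, t0, tf, steps, **params):
    """RK4 integration of w(t) from t_f back to t_0 using a negative step."""
    mesh = [-1.0 + 0.1 * k for k in range(21)]
    h = (t0 - tf) / steps  # negative: integrate backwards in time
    w = w_final
    for _ in range(steps):
        k1 = weight_rate(w, mesh, **params)
        k2 = weight_rate(w + 0.5 * h * k1, mesh, **params)
        k3 = weight_rate(w + 0.5 * h * k2, mesh, **params)
        k4 = weight_rate(w + h * k3, mesh, **params)
        w += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return w


# Zero terminal weight, horizon of 10 time units.
w0 = solve_backward(0.0, 0.0, 10.0, 1000, a=-1.0, b=1.0, q=1.0, r=1.0)
print(w0, math.sqrt(2.0) - 1.0)  # second value: steady-state Riccati solution
```

For a vector of weights the same structure applies, with the scalar division replaced by solving the normal equations with $A^T A$ at each step.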

Neural Network Solution for Fixed-Final Time Constrained Optimal Control
Numerical Examples
a) Linear System
$$\begin{aligned}
\dot{x}_1 &= 2x_1 + x_2 + x_3 \\
\dot{x}_2 &= x_1 - x_2 + u_2 \\
\dot{x}_3 &= x_3 + u_1
\end{aligned} \qquad |u_1| \le 5, \quad |u_2| \le 20 \tag{28}$$
To find a nearly optimal time-varying controller, the following smooth function is used to approximate the value function of the system
$$V(x_1, x_2, x_3) = w_1 x_1^2 + w_2 x_2^2 + w_3 x_3^2 + w_4 x_1 x_2 + w_5 x_1 x_3 + w_6 x_2 x_3 \tag{29}$$
Slide 57

[Figures: plots versus time]
Fig. 15 Constrained Linear System Weights
Fig. 16 State Trajectory of Linear System with Bounds
Fig. 17 Optimal NN Control Law with Bounds (control inputs u1, u2)
Slide 58

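Neural Network Solution for Fixed-Final Time Constrained Optimal Control
b) Nonlinear Chained System
$$\begin{aligned}
\dot{x}_1 &= u_1 \\
\dot{x}_2 &= u_2 \\
\dot{x}_3 &= x_1 u_2
\end{aligned} \qquad |u_1| \le 1, \quad |u_2| \le 2 \tag{30}$$
Selecting the smooth approximating function
$$\begin{aligned}
V(x_1, x_2, x_3) = {} & w_1 x_1^2 + w_2 x_2^2 + w_3 x_3^2 + w_4 x_1 x_2 + w_5 x_1 x_3 + w_6 x_2 x_3 \\
& + w_7 x_1^4 + w_8 x_2^4 + w_9 x_3^4 + w_{10} x_1^2 x_2^2 + w_{11} x_1^2 x_3^2 + w_{12} x_2^2 x_3^2 \\
& + w_{13} x_1^2 x_2 x_3 + w_{14} x_1 x_2^2 x_3 + w_{15} x_1 x_2 x_3^2 + w_{16} x_1^3 x_2 + w_{17} x_1^3 x_3 \\
& + w_{18} x_1 x_2^3 + w_{19} x_1 x_3^3 + w_{20} x_2 x_3^3 + w_{21} x_2^3 x_3
\end{aligned} \tag{31}$$
Slide 59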
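The 21 weights in (31) match the number of monomials of total degree 2 and 4 in three variables (6 + 15). Assuming the basis is exactly that full set, which the slide does not state in words, it can be generated rather than typed by hand. The ordering produced below is not the ordering of $w_1, \ldots, w_{21}$ on the slide.

```python
from itertools import combinations_with_replacement


def even_monomials(n_states=3, degrees=(2, 4)):
    """Exponent tuples (e1, ..., en) of all monomials with the given total degrees."""
    terms = []
    for d in degrees:
        for combo in combinations_with_replacement(range(n_states), d):
            terms.append(tuple(combo.count(i) for i in range(n_states)))
    return terms


def sigma(x, terms):
    """Evaluate each monomial x1^e1 * x2^e2 * ... at the state x."""
    values = []
    for exps in terms:
        v = 1.0
        for xi, e in zip(x, exps):
            v *= xi ** e
        values.append(v)
    return values


terms = even_monomials()
print(len(terms))
print(sigma((1.0, 2.0, 3.0), terms)[:6])
```

Generating the exponents once also makes it straightforward to form the basis Jacobian term by term, since the derivative of each monomial follows directly from its exponent tuple.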

Neural Network Solution for Fixed-Final Time Constrained Optimal Control
[Figures: plots versus time]
Fig. 18 Nonlinear System Weights (NN weights)
Fig. 19 State Trajectory of Nonlinear System (states x1, x2, x3)
Fig. 20 Optimal NN Constrained Control Law (control inputs u1, u2; panel title "Optimal Control with Constraints")
Slide 60

C) Simulation: Benchmark Problem
[Figures: plots versus time in seconds. Panel titles: "Nearly Optimal Controller State Trajectories" (r, theta and rdot, thetadot), "r θ State Trajectories", "u(t) Control Input", "Nearly Optimal Controller with Constraints" (control), "Nearly Optimal Controller Cost with Constraints". Recoverable captions: Fig. 22, Fig. 23, Fig. 24 Disturbance Attenuation]
Slide 61

Overview of the Method
- Neural networks are used to approximately solve the finite-horizon optimal state feedback control problem.
- The method is based on solving a related Hamilton-Jacobi equation of the corresponding finite-horizon problem.
- The problem is transformed into solving an ODE backwards in time.
- The neural network approximation converges uniformly to the value function, and the resulting controller provides closed-loop stability.
- The result is a nearly exact feedback controller with time-varying coefficients.
- No policy iteration is needed.
Slide 62

Slide 63
