Integral Concurrent Learning: Adaptive Control With Parameter Convergence Using Finite Excitation


Received: 24 April 2018 | Revised: 20 September 2018 | Accepted: 24 September 2018
DOI: 10.1002/acs.2945

SPECIAL ISSUE ARTICLE

Integral concurrent learning: Adaptive control with parameter convergence using finite excitation

Anup Parikh(1), Rushikesh Kamalapurkar(2), Warren E. Dixon(3)

(1) Robotics Research & Development, Sandia National Laboratories, Albuquerque, New Mexico
(2) School of Mechanical and Aerospace Engineering, Oklahoma State University, Stillwater, Oklahoma
(3) Department of Mechanical and Aerospace Engineering, University of Florida, Gainesville, Florida

Correspondence: Anup Parikh, Robotics Research & Development, Sandia National Laboratories, Albuquerque, NM 87123. Email: aparikh@sandia.gov

Funding information: NEEC, Grant/Award Number: N00174-18-1-0003; AFOSR, Grant/Award Number: FA9550-18-1-0109; Office of Naval Research, Grant/Award Number: N00014-13-1-0151; NSF, Grant/Award Numbers: 1509516 and 1762829

Summary
Concurrent learning (CL) is a recently developed adaptive update scheme that can be used to guarantee parameter convergence without requiring persistent excitation. However, this technique requires knowledge of state derivatives, which are usually not directly sensed and must therefore be estimated. A novel integral CL method is developed in this paper that removes the need to estimate state derivatives while maintaining parameter convergence properties. Data recorded online is exploited in the adaptive update law, and numerical integration is used to circumvent the need for state derivatives. The novel adaptive update law results in negative definite parameter error terms in the Lyapunov analysis, provided an online-verifiable finite excitation condition is satisfied. A Monte Carlo simulation illustrates improved robustness to noise compared to the traditional derivative formulation.
The result is also extended to Euler-Lagrange systems, and simulations on a two-link planar robot demonstrate the improved performance compared to gradient-based adaptation laws.

KEYWORDS
adaptive control, nonlinear systems, system identification, uncertain systems

1 INTRODUCTION

Adaptive control methods provide a technique to achieve a control objective despite uncertainties in the system model. Adaptive estimates are developed through insights from a Lyapunov-based analysis as a means to yield a desired objective. Although a regulation or tracking objective can be achieved with this scheme, it is well known that the parameter estimates may not approach the true parameters using a least-squares or gradient-based online update law without persistent excitation (PE).1-3 However, the PE condition cannot be guaranteed a priori for nonlinear systems and is difficult to check online, in general.

Motivated by the desire to learn the true parameters, or at least to gain the increased robustness and improved transient performance that parameter convergence provides (see the works of Duarte and Narendra,4 Krstić et al,5 and Chowdhary and Johnson6), a new adaptive update scheme known as concurrent learning (CL) was recently developed in the pioneering works of Chowdhary et al.6-8 The principal idea of CL is to use recorded input and output data of the system dynamics to apply batch-like updates to the parameter estimate dynamics. These updates yield a negative definite parameter estimation error term in the stability analysis, which allows parameter convergence to be established provided a finite excitation condition is satisfied. The finite excitation condition is an alternative to PE and only requires excitation for a finite amount of time. Furthermore, the condition can be checked online by verifying the positivity of the minimum singular value of a function of the regressor matrix, as opposed to PE, which cannot be
Int J Adapt Control Signal Process. © 2018 John Wiley & Sons, Ltd.

verified online, in general, for nonlinear systems. However, all current CL methods require that the output data include the state derivatives, which may not be available for all systems. Since the naive approach of finite differencing the state measurements leads to noise amplification, and since only past recorded data, as opposed to real-time data, is needed for CL, techniques such as online state derivative estimation or smoothing have been employed, eg, in the works of Mühlegg et al9 and Kamalapurkar et al.10 However, these methods typically require tuning parameters, such as an observer gain and switching threshold in the case of the online derivative estimator, and basis, basis order, covariance, and time window in the case of smoothing, to produce satisfactory results.

In this note, we reformulate the derivative-based CL method (DCL) in terms of an integral, removing the need to estimate state derivatives. Other methods, such as composite adaptive control, also use integration-based terms to improve parameter convergence (see, eg, the works of Slotine and Li,11 Volyanskyy et al,12 and Pan et al13); however, they still require PE to ensure exponential convergence. Recently, results in other works14-18 have shown convergence using an interval or finite excitation condition, although they either require measurements of state derivatives (see, eg, the work of Pan and Yu15), require determining the analytical Jacobian of the regressor (see, eg, the work of Pan et al14), or are developed in a model reference adaptive control context,16,17,19-21 which essentially assumes that desired trajectories are generated from an LTI system and may rely on a matching condition, rather than the general nonlinear systems considered here without such assumptions.
Some results22-27 have theoretical analogs to those presented here, and some use filtering techniques to avoid data storage requirements, although we show how the parameter estimation is performed alongside control development in a Lyapunov analysis. In our method, the only additional tuning parameter beyond what is needed for gradient-based adaptive control designs is the time window of integration, which is analogous to the smoothing buffer window that is already required for smoothing-based techniques. Despite the reformulation, the stability results still hold (ie, parameter convergence), and Monte Carlo simulation results suggest greater robustness to noise compared to DCL implementations. The technique is also applied to Euler-Lagrange (EL) systems to demonstrate the use of integral CL (ICL) for systems with unmatched uncertainties, similar to the very recent results in the work of Pan and Yu18 that use interval excitation (IE). Compared to some other similar approaches applied to EL systems, our approach selectively collects data rather than incorporating the entire history so as to consider the most informative data for parameter estimation, in contrast to, eg, the work of Roy et al.28 Furthermore, our approach can learn from multiple exciting intervals, some of which may not be sufficiently exciting (ie, do not have a nonzero minimum eigenvalue) but, in aggregate, are sufficiently rich, in contrast to, eg, the work of Pan and Yu.18 To develop the robustness that single-interval IE methods natively provide, ICL can be integrated with a purging technique.10

2 CONTROL OBJECTIVE

To illustrate the ICL method, consider an example dynamic system modeled as

    ẋ(t) = f(x(t), t) + u(t),    (1)

where t ∈ [0, ∞), x : [0, ∞) → ℝ^n are the measurable states, u : [0, ∞) → ℝ^n is the control input, and f : ℝ^n × [0, ∞) → ℝ^n represents the locally Lipschitz drift dynamics, with some unknown parameters.
In the following development, as is typical in adaptive control, f is assumed to be linearly parameterized in the unknown parameters, ie,

    f(x, t) = Y(x, t)θ,    (2)

where Y : ℝ^n × [0, ∞) → ℝ^{n×m} is a regressor matrix and θ ∈ ℝ^m represents the constant, unknown system parameters. To quantify the state tracking and parameter estimation objectives of the adaptive control problem, the tracking error and parameter estimate error are defined as

    e(t) ≜ x(t) − x_d(t),    (3)

    θ̃(t) ≜ θ − θ̂(t),    (4)

where x_d : [0, ∞) → ℝ^n is a known, continuously differentiable desired trajectory and θ̂ : [0, ∞) → ℝ^m is the parameter estimate. In the following, functional arguments will be omitted for notational brevity, eg, x(t) will be denoted as x, unless necessary for clarity.

To achieve the control objective, the following controller is commonly used:

    u(t) = ẋ_d − Y(x, t)θ̂ − Ke,    (5)

where K ∈ ℝ^{n×n} is a positive definite constant control gain. Taking the time derivative of (3) and substituting (1), (2), and (5) yields the closed-loop error dynamics

    ė = Y(x, t)θ + ẋ_d − Y(x, t)θ̂ − Ke − ẋ_d = Y(x, t)θ̃ − Ke.    (6)

The parameter estimation error dynamics are determined by taking the time derivative of (4), yielding

    θ̃̇ = −θ̂̇(t).    (7)

An ICL-based update law for the parameter estimate is designed as

    θ̂̇(t) = Γ Y(x, t)ᵀ e + k_CL Γ Σ_{i=1}^{N} 𝒴_iᵀ ( x(t_i) − x(t_i − Δt) − 𝒰_i − 𝒴_i θ̂ ),    (8)

where k_CL ∈ ℝ and Γ ∈ ℝ^{m×m} are constant, positive definite control gains, N ∈ ℤ is a positive constant that satisfies N ≥ ⌈m/n⌉, t_i ∈ [0, t] are time points between the initial time and the current time, 𝒴_i ≜ 𝒴(t_i), 𝒰_i ≜ 𝒰(t_i), and

    𝒴(t) ≜ ∫_{max{t−Δt, 0}}^{t} Y(x(τ), τ) dτ,    (9)

    𝒰(t) ≜ ∫_{max{t−Δt, 0}}^{t} u(τ) dτ.    (10)

Furthermore, 0_{n×m} denotes an n × m matrix of zeros, and Δt ∈ ℝ is a positive constant denoting the size of the window of integration. The CL term (ie, the second term) in (8) represents saved data. The principal idea behind this design is to utilize recorded input-output data generated by the dynamics to further improve the parameter estimate. See the work of Chowdhary7 for a discussion on how to choose data points to record. In short, the data points should be selected to maximize the minimum eigenvalue of Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i since the minimum eigenvalue bounds the rate of convergence of the parameter estimation errors, as shown in the subsequent stability analysis. To calculate 𝒴(t) and 𝒰(t), one would store the values of Y and u over the interval [t − Δt, t], which would require ⌈mnhbΔt⌉ and ⌈nhbΔt⌉ bytes, respectively, where h is the control loop rate in cycles per second and b is the number of bytes per value (eg, 8 bytes per double precision floating point number).
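As an illustration of the storage and quadrature just described, the windowed integrals in (9) and (10) can be approximated from buffered samples of Y and u. The following sketch is illustrative (the class and method names are not from the paper) and assumes a fixed sample period and a full buffer:

```python
import numpy as np
from collections import deque

class WindowIntegrator:
    """Buffers samples of Y(x(t), t) and u(t) over the window [t - dt_window, t]
    and approximates the integrals in (9) and (10) by the trapezoidal rule.
    Assumes a fixed sample period and at least two buffered samples."""

    def __init__(self, dt_window, dt_sample):
        n_samples = int(round(dt_window / dt_sample)) + 1
        self.dt = dt_sample
        self.Y_buf = deque(maxlen=n_samples)  # regressor samples
        self.u_buf = deque(maxlen=n_samples)  # control samples

    def push(self, Y, u):
        self.Y_buf.append(np.asarray(Y, dtype=float))
        self.u_buf.append(np.asarray(u, dtype=float))

    def integrals(self):
        Y = np.array(self.Y_buf)
        u = np.array(self.u_buf)
        # trapezoid rule: dt * (half the endpoints + the interior samples)
        script_Y = 0.5 * self.dt * (Y[0] + Y[-1]) + self.dt * Y[1:-1].sum(axis=0)
        script_U = 0.5 * self.dt * (u[0] + u[-1]) + self.dt * u[1:-1].sum(axis=0)
        return script_Y, script_U
```

For example, at a 1 kHz loop rate with Δt = 0.2 s, the buffer holds 201 regressor and 201 control samples, consistent with the storage estimate above.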
Often, these storage requirements are easily satisfied even by modern embedded systems with somewhat limited memory.

The ICL-based adaptive update law in (8) differs from traditional DCL update laws given in, eg, the works of Chowdhary et al.6-8 Specifically, the state derivative, control, and regressor terms, ie, ẋ, u, and Y, respectively, used in the aforementioned works6-8 are replaced with the integrals of those terms over the time window [t − Δt, t]. Substituting (2) into (1) and integrating yields

    ∫_{t−Δt}^{t} ẋ(τ) dτ = ∫_{t−Δt}^{t} Y(x(τ), τ)θ dτ + ∫_{t−Δt}^{t} u(τ) dτ,  ∀t ≥ Δt.

Using the Fundamental Theorem of Calculus and the definitions in (9) and (10),

    x(t) − x(t − Δt) = 𝒴(t)θ + 𝒰(t),    (11)

∀t ≥ Δt, where the fact that θ is a constant was used to pull it outside the integral. Rearranging (11) and substituting into (8) yields

    θ̂̇(t) = Γ Y(x, t)ᵀ e + k_CL Γ Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i θ̃,  ∀t ≥ Δt.    (12)

Note that (9) and (10) are piecewise continuous in time, the CL term in (8) is piecewise constant in time, and the simplified adaptive update law (12) is piecewise continuous in time. Hence, the right-hand side of (7) is piecewise continuous in time.

3 STABILITY ANALYSIS

To facilitate the following analysis, let η : [0, ∞) → ℝ^{n+m} represent a composite vector of the system states and parameter estimation errors, defined as η(t) ≜ [eᵀ θ̃ᵀ]ᵀ. In addition, let λ_min{·} and λ_max{·} represent the minimum and maximum eigenvalues of {·}, respectively.
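Before proceeding to the analysis, the closed loop (1), (2), (5) with the ICL update law (8) can be sketched in simulation. The following self-contained example uses a scalar system with regressor Y(x) = x and true θ = −2; all gains, the excitation signal, and the crude rectangle-rule quadrature are illustrative choices, not the paper's:

```python
import numpy as np

dt = 1e-3                        # integration/sampling step
Dt = 0.2                         # window length Δt
theta = -2.0                     # true parameter (unknown to the controller)
theta_hat = 0.0                  # parameter estimate
Gamma, k_CL, K = 1.0, 1.0, 5.0   # illustrative gains
win = int(round(Dt / dt))        # samples per window
max_stack = 20                   # history stack size N

x = 1.0
x_hist, Y_run, u_run = [x], [], []
stack = []   # recorded triples (Y_i, x(t_i) - x(t_i - Dt), U_i)

for k in range(10000):           # simulate 10 s
    t = k * dt
    xd, xd_dot = np.sin(t), np.cos(t)    # exciting desired trajectory
    e = x - xd                            # tracking error (3)
    Y = x                                 # scalar regressor, f(x) = Y(x) * theta
    u = xd_dot - Y * theta_hat - K * e    # controller (5)

    Y_run.append(Y)
    u_run.append(u)
    if k >= win and k % win == 0 and len(stack) < max_stack:
        # record one window: crude left-rectangle quadrature for (9), (10)
        sY = dt * sum(Y_run[-win:])
        sU = dt * sum(u_run[-win:])
        dx = x - x_hist[-win]
        stack.append((sY, dx, sU))

    # ICL update law (8): gradient term plus history-stack term
    cl = sum(sYi * (dxi - sUi - sYi * theta_hat) for sYi, dxi, sUi in stack)
    theta_hat += dt * (Gamma * Y * e + k_CL * Gamma * cl)

    x += dt * (Y * theta + u)    # plant (1)-(2), forward Euler step
    x_hist.append(x)

print(theta_hat)   # theta_hat should approach the true value theta = -2
```

Note that the history stack makes the parameter error directly observable to the update law; with the gradient term alone, θ̂ would only be driven through the tracking error.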

In the following stability analysis, time is partitioned into two phases. During the initial phase, insufficient data has been collected to satisfy a richness condition on the history stack. In Theorem 1, it is shown that the controller and adaptive update law are still sufficient for the system to remain bounded for all time despite the lack of data. After a finite period of time, the system transitions to the second phase, where the history stack is sufficiently rich and the controller and adaptive update law are shown, in Theorem 2, to yield exponential convergence. To guarantee that the transition to the second phase happens in finite time, and therefore that the overall system trajectories are ultimately bounded, we require the history stack to be sufficiently rich after a finite period of time, as specified in the following assumption.

Assumption 1. The system is sufficiently excited over a finite duration of time. Specifically, ∃λ > 0, ∃T ≥ Δt such that ∀t ≥ T, λ_min{ Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i } ≥ λ.

The condition in Assumption 1 requires that the system be sufficiently excited, although it is weaker than the typical PE condition since excitation is only needed for a finite period of time. Specifically, PE requires

    ∫_{t}^{t+Δt} Yᵀ(x(τ), τ) Y(x(τ), τ) dτ ≥ αI > 0,  ∀t ≥ 0,    (13)

whereas Assumption 1 only requires the system trajectories to be exciting up to time T (at which point Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i is full rank), after which the exciting data recorded during t ∈ [0, T] is exploited for all t ≥ T. Another benefit of the development in this paper is that the excitation condition is measurable (ie, λ_min{ Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i } can be calculated), whereas in PE, Δt is unknown, and hence, an uncountable number of integrals would need to be calculated at each of the uncountable number of time points, t, in order to verify PE.
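Because Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i is built from stored data, the excitation metric can be computed directly. The sketch below is a hypothetical variant of the cited singular value maximization idea (function names are illustrative): new data is always accepted while the stack is underfull, and once it is full, a new point is swapped in only if it strictly increases the minimum eigenvalue, so the metric is nondecreasing.

```python
import numpy as np

def min_eig(stack):
    """lambda_min of sum(Y_i^T Y_i) over the current history stack (0 if empty)."""
    if not stack:
        return 0.0
    S = sum(Y.T @ Y for Y in stack)
    return float(np.linalg.eigvalsh(S).min())

def try_add(stack, Y_new, max_size):
    """Add Y_new while the stack is underfull (adding a PSD term can never
    decrease lambda_min of the sum); once full, swap it into the slot that
    most increases lambda_min, if any such slot exists."""
    if len(stack) < max_size:
        stack.append(Y_new)
        return True
    best_lam, best_idx = min_eig(stack), None
    for i in range(len(stack)):
        lam = min_eig(stack[:i] + [Y_new] + stack[i + 1:])
        if lam > best_lam:
            best_lam, best_idx = lam, i
    if best_idx is None:
        return False
    stack[best_idx] = Y_new
    return True
```

Once `min_eig` exceeds the user-selected threshold λ, the finite excitation condition holds for all subsequent time, because recorded data is only ever replaced by more informative data.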
Assumption 1 is verified online by continually acquiring data (using, eg, the singular value maximization algorithm in the work of Chowdhary7 to ensure that the minimum eigenvalue of Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i is always increasing) until λ_min{ Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i } has reached a user-selectable threshold. The threshold value is directly related to the exponential convergence rate of the system, as shown in the subsequent analysis. Since numerical integration may result in truncation errors (eg, fourth-order Runge-Kutta methods have O(h⁵) local truncation errors), the threshold should also be selected sufficiently large to ensure that the excitation condition is satisfied beyond the bounds of integration uncertainty, to mitigate misidentification due to noise and truncation errors.

To encourage excitation of the system, a perturbation signal can be added to the desired trajectory. Notably, this perturbation signal (which distracts from the original state trajectory objective encoded in the original desired trajectory) would only need to be added to the system for a finite time, ie, until sufficient data has been collected to learn the parameters. In other words, during implementation, the system only needs excitation initially, and then the original desired trajectory can be tracked. In contrast, adaptive methods relying on PE might require perturbations for all time to ensure parameter estimate convergence, and hence, the original state trajectory objective may never be achieved.

Theorem 1. For the system defined in (1) and (7), the controller and adaptive update law defined in (5) and (8) ensure bounded tracking and parameter estimation errors.

Proof.
Let V : ℝ^{n+m} → ℝ be a candidate Lyapunov function defined as

    V(η) ≜ ½ eᵀe + ½ θ̃ᵀ Γ⁻¹ θ̃.    (14)

Taking the derivative of V along the trajectories of (1), substituting the closed-loop error dynamics in (6) and the equivalent adaptive update law in (12), noting that Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i is positive semidefinite, and simplifying yields

    V̇ ≤ −eᵀKe,

which implies the system states remain bounded via theorem 8.4 in the work of Khalil.29 Furthermore, since V̇ ≤ 0, V(η(t)) ≤ V(η(0)), and therefore,

    ‖η(t)‖ ≤ √(β_2/β_1) ‖η(0)‖,

where β_1 ≜ ½ min{1, λ_min{Γ⁻¹}} and β_2 ≜ ½ max{1, λ_max{Γ⁻¹}}.

Theorem 2. Under Assumption 1, the controller and adaptive update law defined in (5) and (8) ensure globally exponential tracking for the system defined in (1) and (7) in the sense that

    ‖η(t)‖ ≤ √(β_2/β_1) exp(λ_1 T) ‖η(0)‖ exp(−λ_1 t),  ∀t ∈ [0, ∞).    (15)

Proof. Let V : ℝ^{n+m} → ℝ be a candidate Lyapunov function defined as

    V(η) ≜ ½ eᵀe + ½ θ̃ᵀ Γ⁻¹ θ̃.

Taking the derivative of V along the trajectories of (1) during t ∈ [T, ∞), substituting the closed-loop error dynamics in (6) and the equivalent adaptive update law in (12), and simplifying yields

    V̇ ≤ −eᵀKe − k_CL θ̃ᵀ ( Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i ) θ̃,  ∀t ∈ [T, ∞).

From Assumption 1, λ_min{ Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i } ≥ λ > 0, ∀t ∈ [T, ∞), which implies that Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i is positive definite, and therefore, V̇ is upper bounded by a negative definite function of η. Invoking theorem 4.10 in the work of Khalil,29 e and θ̃ are globally exponentially stable, ie, ∀t ∈ [T, ∞),

    ‖η(t)‖ ≤ √(β_2/β_1) ‖η(T)‖ exp(−λ_1 (t − T)),

where λ_1 ≜ (1/(2β_2)) min{λ_min{K}, k_CL λ}. The composite state vector can be further upper bounded using the results of Theorem 1, yielding (15).

Remark 1. Using an appropriate data selection algorithm (eg, the singular value maximization algorithm in the work of Chowdhary7) ensures the minimum eigenvalue of Σ_{i=1}^{N} 𝒴_iᵀ 𝒴_i is always increasing, and therefore, the Lyapunov function (14) is a common Lyapunov function30 as data is continuously added to the history stack.

4 EXTENSION TO EL SYSTEMS

The ICL technique can also be applied to systems with unmatched uncertainties. In this section, the ICL method is applied to EL systems.

4.1 Control development

Consider EL dynamics of the form in chapter 2.3 in the work of Lewis et al31 and in chapter 9.3 in the work of Spong and Vidyasagar32:

    M(q(t)) q̈(t) + V_m(q(t), q̇(t)) q̇(t) + F_d q̇(t) + G(q(t)) = τ(t),    (16)

where q(t), q̇(t), q̈(t) ∈ ℝ^n represent position, velocity, and acceleration vectors, respectively, M : ℝ^n → ℝ^{n×n} represents the inertia matrix, V_m : ℝ^n × ℝ^n → ℝ^{n×n} represents centripetal-Coriolis effects, F_d ∈ ℝ^{n×n} represents frictional effects, G : ℝ^n → ℝ^n represents gravitational effects, and τ(t) ∈ ℝ^n denotes the control input.
The system in (16) is assumed to have the following properties (see chapter 2.3 in the work of Lewis et al31), which hold for a large class of physical systems.

Property 1. The system in (16) can be linearly parameterized, ie, the left-hand side of (16) can be rewritten as

    Y_1(q, q̇, q̈) θ = M(q) q̈ + V_m(q, q̇) q̇ + F_d q̇ + G(q),    (17)

where Y_1 : ℝ^n × ℝ^n × ℝ^n → ℝ^{n×m} denotes the regression matrix and θ ∈ ℝ^m is a vector of uncertain parameters.

Property 2. The inertia matrix is symmetric and positive definite and satisfies the following inequalities:

    m_1 ‖ξ‖² ≤ ξᵀ M(q) ξ ≤ m_2 ‖ξ‖²,  ∀ξ ∈ ℝ^n,

where m_1 and m_2 are known positive scalar constants and ‖·‖ represents the Euclidean norm.

Property 3. The inertia and centripetal-Coriolis matrices satisfy the following skew-symmetry relation:

    ξᵀ ( ½ Ṁ(q) − V_m(q, q̇) ) ξ = 0,  ∀ξ ∈ ℝ^n,

where Ṁ(q) is the time derivative of the inertia matrix. Equivalently, ½ Ṁ(q) − V_m(q, q̇) is skew-symmetric.
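For concreteness, these properties can be checked numerically on the standard two-link planar robot model (the class of systems simulated in this paper). The inertia parameters p1, p2, p3 and the test point below are illustrative values, not the paper's:

```python
import numpy as np

p1, p2, p3 = 2.0, 0.5, 0.3  # illustrative inertia parameters

def M(q):
    """Inertia matrix of a two-link planar arm (depends only on q2)."""
    c2 = np.cos(q[1])
    return np.array([[p1 + 2.0 * p3 * c2, p2 + p3 * c2],
                     [p2 + p3 * c2,       p2]])

def Vm(q, qd):
    """Centripetal-Coriolis matrix of the two-link planar arm."""
    s2 = np.sin(q[1])
    return p3 * s2 * np.array([[-qd[1], -(qd[0] + qd[1])],
                               [ qd[0],  0.0]])

def Mdot(q, qd):
    """Chain rule: M depends only on q2, so dM/dt = (dM/dq2) * q2dot."""
    s2 = np.sin(q[1])
    dM_dq2 = np.array([[-2.0 * p3 * s2, -p3 * s2],
                       [-p3 * s2,        0.0]])
    return dM_dq2 * qd[1]

q = np.array([0.3, 1.1])    # arbitrary test configuration
qd = np.array([0.7, -0.4])  # arbitrary test velocity

# Property 2: M is symmetric and positive definite
assert np.allclose(M(q), M(q).T)
assert np.all(np.linalg.eigvalsh(M(q)) > 0)

# Property 3: (1/2)Mdot - Vm is skew-symmetric
S = 0.5 * Mdot(q, qd) - Vm(q, qd)
assert np.allclose(S, -S.T)
```

The skew-symmetry check is what allows the ξᵀ(½Ṁ − V_m)ξ term to vanish in the subsequent Lyapunov analysis.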

To quantify the tracking objective, the position tracking error, e(t) ∈ ℝ^n, and the filtered tracking error, r(t) ∈ ℝ^n, are defined as

    e ≜ q_d − q,    (18)

    r ≜ ė + αe,    (19)

where q_d(t) ∈ ℝ^n represents the desired trajectory, whose first and second time derivatives exist and are continuous (ie, q_d(t) ∈ C²). To quantify the parameter identification objective, the parameter estimation error, θ̃(t) ∈ ℝ^m, is again defined as

    θ̃(t) ≜ θ − θ̂(t),    (20)

where θ̂(t) ∈ ℝ^m represents the parameter estimate.

Taking the time derivative of (19), premultiplying by M(q), substituting from (16), and adding and subtracting V_m(q, q̇)r result in the following open-loop error dynamics:

    M(q) ṙ = Y_2(q, q̇, q_d, q̇_d, q̈_d) θ − V_m(q, q̇) r − τ,    (21)

where Y_2 : ℝ^n × ℝ^n × ℝ^n × ℝ^n × ℝ^n → ℝ^{n×m} is defined based on the relation

    Y_2(q, q̇, q_d, q̇_d, q̈_d) θ ≜ M(q) q̈_d + V_m(q, q̇)(q̇_d + αe) + F_d q̇ + G(q) + αM(q) ė.    (22)

To achieve the tracking objective, the controller is designed as

    τ ≜ Y_2 θ̂ + e + k_1 r,    (23)

where k_1 ∈ ℝ is a positive constant. To circumvent the need for q̈(t), the update law can be formulated in terms of an integral, as

    θ̂̇ ≜ Γ Y_2ᵀ r + k_2 Γ Σ_{i=1}^{N} 𝒴_iᵀ ( 𝒰_i − 𝒴_i θ̂(t) ),    (24)

where 𝒴_i ≜ 𝒴(t_i), 𝒰_i ≜ 𝒰(t_i), and 𝒴 : [0, ∞) → ℝ^{n×m} and 𝒰 : [0, ∞) → ℝ^n are defined as

    𝒰(t_i) ≜ ∫_{max{t_i−Δt, 0}}^{t_i} τ(σ) dσ,

    𝒴(t_i) ≜ Y_3(q(t), q̇(t), q(t − Δt),

