Safe and Efficient Model-free Adaptive Control via Bayesian Optimization

Christopher König¹, Matteo Turchetta², John Lygeros³, Alisa Rupenyan¹,³, Andreas Krause²

Abstract— Adaptive control approaches yield high-performance controllers when a precise system model or suitable parametrizations of the controller are available. Existing data-driven approaches for adaptive control mostly augment standard model-based methods with additional information about uncertainties in the dynamics or about disturbances. In this work, we propose a purely data-driven, model-free approach for adaptive control. Tuning low-level controllers based solely on system data raises concerns about the underlying algorithm's safety and computational performance. Thus, our approach builds on GoOSE, an algorithm for safe and sample-efficient Bayesian optimization. We introduce several computational and algorithmic modifications to GoOSE that enable its practical use on a rotational motion system. We numerically demonstrate, for several types of disturbances, that our approach is sample efficient, outperforms constrained Bayesian optimization in terms of safety, and achieves the performance optima computed by grid evaluation. We further demonstrate the proposed adaptive control approach experimentally on a rotational motion system.

I. INTRODUCTION

Adaptive control approaches are a desirable alternative to robust controllers in high-performance applications that deal with disturbances and uncertainties in the plant dynamics. Learning uncertainties in the dynamics and adapting have been explored with classical control mechanisms such as Model Reference Adaptive Control (MRAC) [1], [2]. Gaussian processes (GPs) have also been used to model the output of a nonlinear system in a dual controller [3], while coupling the states and inputs of the system in the covariance function of the GP model.
Learning the dynamics in an L1-adaptive control approach has been demonstrated in [4], [5]. Instead of modeling or learning the dynamics, the system can be represented by its performance, directly measured from data. Then the low-level controller parameters can be optimized to fulfill the desired performance criteria. This has been demonstrated for motion systems in [6]–[10]. Such model-free approaches, however, have not been brought to continuous adaptive control, largely because of the difficulty of continuously maintaining stability and safety in the presence of disturbances and system uncertainties, and because of the associated computational complexity. Recently, a sample-efficient extension for safe exploration in Bayesian optimization has been proposed [11]. In this paper, we further optimize this algorithm to develop a model-free adaptive control method for motion systems.

This project has been funded by the Swiss Innovation Agency (Innosuisse), grant Nr. 46716, and by the Swiss National Science Foundation under NCCR Automation. ¹Inspire AG, Zurich, Switzerland. ²Learning & Adaptive Systems group, ETH Zurich, Switzerland. ³Automatic Control Laboratory, ETH Zurich, Switzerland. The authors contributed equally.

Contribution. In this work, we make the following contributions: (1) we extend the GoOSE algorithm for policy search to adaptive control problems, that is, to problems where constant tuning is required due to changes in environmental conditions; (2) we reduce GoOSE's complexity so that it can be effectively used for policy optimization beyond simulations; (3) we show the effectiveness of our approach in extensive evaluations on a real and a simulated rotational axis drive, a crucial component in many industrial machines. A table of symbols can be found in Section IX.

II. RELATED WORK

Bayesian optimization.
Bayesian optimization (BO) [12] denotes a class of sample-efficient, black-box optimization algorithms that have been used to address a wide range of problems; see [13] for a review. In particular, BO has been successful in learning high-performance controllers for a variety of systems. For instance, [14] learns the parameters of a discrete-event controller for a bipedal robot with BO, while [15] trades off real-world and simulated control experiments via BO. In [16], variational autoencoders are combined with BO to learn to control a hexapod, while [17] uses multi-objective BO to learn robust controllers for a pendulum.

Safety-aware BO. Optimization under unknown constraints naturally models the problem of learning in safety-critical conditions, where a priori unknown safety constraints must not be violated. In [18] and [19], safety-aware variants of standard BO algorithms are presented. In [20]–[22], BO is used as a subroutine to solve the unconstrained optimization of the augmented Lagrangian of the original problem. While these methods return feasible solutions, they may perform unsafe evaluations. In contrast, the SafeOpt algorithm [23] guarantees safety at all times. It has been used to safely tune a quadrotor controller for position tracking [6], [24]. In [7], it has been integrated with particle swarm optimization (PSO) to learn high-dimensional controllers. Unfortunately, SafeOpt may not be sample efficient due to its exploration strategy. To address this, many solutions have been proposed. For example, [25] does not actively expand the safe set, which may compromise the optimality of the algorithm but works well for the application considered. Alternatively, StageOpt [26] first expands the safe set and subsequently optimizes over it. Unfortunately, it cannot provide good solutions if stopped prematurely.
GoOSE [11] addresses this problem by using a separate optimization oracle and expanding the safe set in a goal-oriented fashion, only when necessary to evaluate the inputs suggested by the oracle.

III. SYSTEM AND PROBLEM STATEMENT

In this section, we present the system of interest, its control scheme, and the mathematical model we use for our numerical evaluations. Finally, we introduce the safety-critical adaptive control problem we aim to solve.

A. System and controller

The system of interest is a rotational axis drive, a positioning mechanism driven by a synchronous 3-phase permanent-magnet AC motor equipped with encoders for position and speed tracking (see Fig. 1). Such systems are routinely used as components in the semiconductor industry, in biomedical engineering, and in photonics and solar technologies.

Fig. 1: Rotational axis drive (left) and its parameters (right):

    Param.   Value     Unit
    m        0.0191    kgm²
    b        30.08     kgm²/s
    c1       1.78e-3   Nm
    c2       0.0295    Nm/rad
    c3       0.372     rad
    c4       8.99e-3   Nm
    c5       0.11      rad

We model the system as a combination of linear and nonlinear blocks, where the linear block is modeled as a damped single-mass system, following [9]:

    G(s) : [Gp(s), Gv(s)] = [ 1/(ms² + bs), 1/(ms + b) ],    (1)

where Gp(s) and Gv(s) are the transfer functions from torque to angular position and from torque to angular velocity, respectively, m is the moment of inertia, and b is the rotational damping coefficient due to friction. The values of m and b are obtained via least-squares fitting and shown in Fig. 1.

Next, we introduce the model for the nonlinear part of the dynamics fc, which we subtract from the total torque signal, see Fig. 2. The nonlinear cogging effects due to interactions between the permanent magnets of the rotor and the stator slots [27] are modelled using a truncated Fourier expansion as

    fc(p) = c1 + c2 p + Σ_{k=1}^{n} c_{2k+2} sin(2kπ p / c3 + c_{2k+3}),

where p is the position, c1 is the average thrust torque, c2 is the gradient of the curve, c3 is the largest dominant period described by the angular distance of a pair of magnets, n is the number of modelled frequencies, and c_{2k+2} and c_{2k+3} are respectively the amplitudes and the phase shifts of the sinusoidal functions, for k = 1, 2, …, n. The parameters c := [c1, …, c_{2n+3}] are estimated by least-squares error minimization between the modelled cogging torque signal and the torque signal measured at constant velocity, so as to cancel the effects of the linear dynamics. The estimates of the parameters are shown in Fig. 1. To model the noise of the system, zero-mean white noise with 6.09e-3 Nm variance is added to the torque input signal of the plant.

The system is controlled by the three-level cascade controller shown in Fig. 2. The outermost loop controls the position with a P-controller Cp(s) = Kp, and the middle loop controls the velocity with a PI-controller Cv(s) = Kv(1 + 1/(Ti s)). The innermost loop controls the current of the rotational drive; it is well tuned and treated as a part of the plant G(s). Feedforward structures are used to accelerate the response of the system. Their gains are well tuned and not modified during the re-tuning procedure. However, in our experiments, we perturb them to demonstrate that our method can adapt to new operating conditions by adjusting the tunable controller parameters.

Fig. 2: Scheme of the proposed BO-based adaptive control (cascade controller Cp(s), Cv(s) with feedforward gain Kvff, plant G(s) with cogging compensation fc(p), and an adaptive agent receiving performance metrics, task parameters, and safe inputs).

B. Adaptive control approach

In this section, we present the adaptive control problem we aim to solve. In particular, our goal is to tune the parameters of the cascade controller introduced in Section III-A to maximize the tracking accuracy of the system, following [9]. Let X denote the space of admissible controller parameters, x = [Kp, Kv, Ti], and let f : X → R be the objective measuring the corresponding tracking accuracy. In particular, we define f as the position tracking error averaged over the trajectory induced by the controller, f(x) = (1/N) Σ_{i=1}^{N} |p_err,i(x)|, where p_err,i is the deviation from the reference position at sampling time i. Crucially, f does not admit a closed-form expression even when the system dynamics are known.
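As a concrete illustration, the truncated Fourier cogging model above can be sketched in a few lines. This is a minimal sketch assuming n = 1, so that c4 and c5 are the single amplitude and phase shift; the function and variable names are ours, not from the paper:

```python
import numpy as np

# Fitted parameters from Fig. 1 (units as in the table).
c = [1.78e-3,   # c1: average thrust torque [Nm]
     0.0295,    # c2: gradient of the curve [Nm/rad]
     0.372,     # c3: largest dominant period [rad]
     8.99e-3,   # c4: amplitude of the k = 1 harmonic [Nm]
     0.11]      # c5: phase shift of the k = 1 harmonic [rad]

def cogging_torque(p, c, n=1):
    """f_c(p) = c1 + c2*p + sum_k c_{2k+2} * sin(2*k*pi*p/c3 + c_{2k+3})."""
    fc = c[0] + c[1] * p
    for k in range(1, n + 1):
        amp, phase = c[2 * k + 1], c[2 * k + 2]  # c_{2k+2}, c_{2k+3} (0-based list)
        fc += amp * np.sin(2 * k * np.pi * p / c[2] + phase)
    return fc
```

In the control scheme of Fig. 2, this signal is subtracted from the commanded torque to compensate the cogging disturbance.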
However, for a given controller x ∈ X, the corresponding tracking accuracy f(x) can be obtained experimentally. Notice that f can be extended to include many performance metrics, to minimize oscillations, or to reduce settling time, as in [8], [9].

In practice, we cannot experiment with arbitrary controllers while optimizing f, due to safety and performance concerns. Thus, we introduce two constraints that must be satisfied at all times. The first one is a safety constraint q1(x), defined as the maximum of the fast Fourier transform (FFT) of the torque measurement in a fixed frequency window. The second one is a tracking performance constraint q2(x) = ||p_err(x)||_∞, defined as the infinity norm of the position tracking error. Finally, in reality, the system may be subject to sudden or slow disturbances, such as a change of load or a drift in the dynamics due to component wear. Therefore, our goal is to automatically tune the controller parameters of our system to maximize tracking accuracy for varying operating conditions that we cannot control, while never violating the safety and quality constraints along the way.

IV. BACKGROUND

Gaussian processes. A Gaussian process (GP) [28] is a distribution over the space of functions commonly used in non-parametric Bayesian regression. It is fully described by a mean function μ : X → R, which, w.l.o.g., we set to zero for all inputs, μ(x) = 0, ∀x ∈ X, and a kernel function

k : X × X → R. Given the data set D = {(x_i, y_i)}_{i=1}^{t}, where y_i = f(x_i) + ε_i and ε_i ~ N(0, σ²) is zero-mean i.i.d. Gaussian noise, the posterior belief over the function f has the following mean, variance, and covariance:

    μ_t(x) = k_t(x)ᵀ (K_t + σ² I)⁻¹ y_t,    (2)
    σ_t²(x) = k_t(x, x),    (3)
    k_t(x, x') = k(x, x') − k_t(x)ᵀ (K_t + σ² I)⁻¹ k_t(x'),    (4)

where k_t(x) = (k(x_1, x), …, k(x_t, x)), K_t is the positive definite kernel matrix [k(x, x')]_{x,x' ∈ D_t}, and I ∈ R^{t×t} denotes the identity matrix. In the following, the superscripts f and q denote GPs on the objective and on the constraints.

Multi-task BO. In multi-task BO, the objective depends on the extended input (x, τ) ∈ X × T, where x is the variable we optimize over and τ is a task parameter set by the environment that influences the objective. To cope with this new dimension, multi-task BO adopts kernels of the form k_multi((x, τ), (x', τ')) = k_τ(τ, τ') ⊗ k(x, x'), where ⊗ denotes the Kronecker product. This kernel decouples the correlations in objective values along the input dimensions, captured by k, from those across tasks, captured by k_τ [29].

GoOSE. GoOSE extends any standard BO algorithm to provide high-probability safety guarantees in the presence of a priori unknown safety constraints. It builds a Bayesian model of the constraints from noisy evaluations based on GP regression. It uses this model to build estimates of two sets: the pessimistic safe set, which contains inputs that are safe, i.e., satisfy the constraints, with high probability, and the optimistic safe set, which contains inputs that could potentially be safe. At each round, GoOSE communicates the optimistic safe set to the BO algorithm, which returns the input it would evaluate within this set, denoted as x*. If x* is also in the pessimistic safe set, GoOSE evaluates the corresponding objective.
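Eqs. (2)–(4) translate directly into code. The sketch below is our own minimal implementation, with a squared exponential kernel chosen purely as an example (the paper uses an ARD squared exponential kernel for its experiments):

```python
import numpy as np

def sq_exp_kernel(A, B, ell=1.0):
    """Squared exponential kernel matrix: k(a, b) = exp(-|a - b|^2 / (2 ell^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

def gp_posterior(X, y, Xs, sigma=0.1):
    """Posterior mean and variance at test inputs Xs under a zero-mean prior."""
    K = sq_exp_kernel(X, X) + sigma ** 2 * np.eye(len(X))        # K_t + sigma^2 I
    Ks = sq_exp_kernel(X, Xs)                                    # columns are k_t(x*)
    mu = Ks.T @ np.linalg.solve(K, y)                            # Eq. (2)
    cov = sq_exp_kernel(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks)  # Eq. (4)
    return mu, np.diag(cov)                                      # Eq. (3)
```

With small observation noise, the posterior mean interpolates the training targets, which is a quick sanity check on the implementation.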
Otherwise, it evaluates the constraints at a sequence of provably safe inputs, chosen based on a heuristic priority function, that allow us to conclude that x* either satisfies or violates the constraints with high probability. In the first case, the corresponding objective value is observed. In the second case, x* is removed from the optimistic safe set and the BO algorithm is queried for a new suggestion. Compared to [23], [26], GoOSE achieves a higher sample efficiency [11], while compared to [18]–[22], it guarantees safety at all times with high probability, under regularity assumptions.

GoOSE assumptions. To infer constraint and objective values of inputs before evaluating them, GoOSE assumes these functions belong to a class of well-behaved functions, i.e., functions with a bounded norm in some reproducing kernel Hilbert space (RKHS) [30]. Based on this assumption, we can build well-calibrated confidence intervals over them. Here, we present these intervals for the safety constraint q (the construction for f is analogous). Let μ_t^q(x) and σ_t^q(x) denote the posterior mean and standard deviation of our belief over q(x), computed according to Eqs. (2) and (3). We recursively define a monotonically increasing lower bound and a monotonically decreasing upper bound for q(x):

    l_t^q(x) = max( l_{t−1}^q(x), μ_{t−1}^q(x) − β_{t−1}^q σ_{t−1}^q(x) ),
    u_t^q(x) = min( u_{t−1}^q(x), μ_{t−1}^q(x) + β_{t−1}^q σ_{t−1}^q(x) ).

The authors of [31], [32] show that, for functions with bounded RKHS norm, an appropriate choice of β_t^q implies that l_t^q(x) ≤ q(x) ≤ u_t^q(x) for all t ≥ 1 and x ∈ X. However, in practice, a constant value of 3 is sufficient to achieve safety [6], [11], [24]. To start collecting data safely, GoOSE requires knowledge of a set of inputs that are known a priori to be safe, denoted as S0. In our problem, it is easy to identify such a set by designing conservative controllers for simplistic first-principle models of the system under control.

V. GOOSE FOR ADAPTIVE CONTROL

In adaptive control, quickly finding a safe, locally optimal controller in response to modified external conditions is crucial. In its original formulation, GoOSE is not suitable for this problem for several reasons: (i) it assumes knowledge of the Lipschitz constant of the constraint, which is unknown in practice; (ii) it relies on a fine discretization of the domain, which is prohibitive in large domains; (iii) it explicitly computes the optimistic safe set, which is expensive; (iv) it does not account for external changes it cannot control. In this section, we address these problems step by step and present the resulting algorithm.

Algorithm 1: GoOSE for adaptive control

    Input: Safe seed S0, f ~ GP(μ^f, k^f; θ^f), q ~ GP(μ^q, k^q; θ^q), τ ← τ0;
     1  Grid resolution: Δx s.t. k^q(x, x + Δx)(σ_η^q)⁻² ≥ 0.95
     2  while machine is running do
     3      S_t ← {x ∈ X : u_t^q(x, τ) ≤ κ};
     4      L_t ← {x ∈ S_t : ∃z ∉ S_t with d(x, z) ≤ Δx};
     5      W_t ← {x ∈ L_t : u_t^q(x, τ) − l_t^q(x, τ) > ε};
     6      x* ← PSO(S_t, W_t);
     7      if f(x†(τ)) − l_t^f(x*, τ) > tol then
     8          if u_t^q(x*, τ) ≤ κ then evaluate f(x*), q(x*); update τ;
     9          else
    10              while ∃x ∈ W_t s.t. g_t(x, x*) ≠ 0 do
    11                  x_w ← arg min_{x ∈ W_t} d(x*, x) s.t. g_t(x, x*) ≠ 0;
    12                  evaluate f(x_w), q(x_w); update τ, S_t, L_t, W_t;
    13      else
    14          set system to x†(τ); update τ

Task parameter. In general, the dynamics of a controlled system may vary due to external changes. For example, our rotational axis drive may be subject to different loads, or the system components may wear with extended use. As the dynamics change, so do the optimal controllers. In this case, we must adapt to new regimes imposed by the environment. To this end, we extend GoOSE to the multi-task setting presented in Section IV by introducing a task parameter τ that captures the exogenous conditions that influence the system's dynamics, and by using the kernel introduced in Section IV. To guarantee safety, we assume that the initial safe seed S0 contains at least one safe controller for each possible task.
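Returning to the confidence bounds defined above: over a grid of candidate controllers, the recursion is a pair of elementwise max/min updates. A minimal sketch (our own, using the constant β = 3 mentioned in Section IV):

```python
import numpy as np

BETA = 3.0  # constant beta, sufficient in practice according to [6], [11], [24]

def update_bounds(l_prev, u_prev, mu, sigma, beta=BETA):
    """One step of the recursion: intersect the previous interval with the new
    GP confidence interval, so l_t is non-decreasing and u_t is non-increasing."""
    l = np.maximum(l_prev, mu - beta * sigma)
    u = np.minimum(u_prev, mu + beta * sigma)
    return l, u
```

The pessimistic safe set on L. 3 of Algorithm 1 is then simply {x : u_t^q(x, τ) ≤ κ}, evaluated on the grid.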

                    GoOSE           CBO
    It. to conv.    35.5            18
    f(x†_grid)      1.25            1.25
    (Kp†, Kv†, Ti†) (50, 0.10, 1)   (50, 0.09, 1)
    # Violations    0               5
    Max. violation  (-, -)          (6587, 0.128)

Fig. 3: Comparison of GoOSE and CBO for 10 runs of the stationary control problem. On the left, we see the cost for each run and its mean and standard deviation. The center and right figures show the constraint values sampled by each method. GoOSE reaches the same solution as CBO (table), albeit more slowly (left). However, CBO heavily violates the constraints (center-right).

Algorithm 2: PSO

     1  Input: Safe set S_t, boundary W_t, # particles m;
     2  p_i ~ U(S_t), v_i = ±Δx, i = 1, …, m;
     3  for j = 0 to J do
     4      for i = 1 to m do
     5          cond ← u_t^q(p_i^j, τ) ≤ κ ∨ ∃x ∈ W_t : g_t(x, p_i^j) ≠ 0;
     6          cond ← cond ∧ l_t^f(p_i^j, τ) < z_i^{j−1};
     7          if cond then z_i^j ← l_t^f(p_i^j, τ) else z_i^j ← z_i^{j−1};
     8      z^j ← arg min_{i=1,…,m} z_i^j;
     9      r_1, r_2 ~ U([0, 2]);
    10      v_i^{j+1} ← α^j v_i^j + r_1 (z_i^j − p_i^j) + r_2 (z^j − p_i^j), ∀i;
    11      p_i^{j+1} ← p_i^j + v_i^j, ∀i;
    12  return z^J

Lipschitz constant. GoOSE uses the Lipschitz constant of the safety constraint, L_q, which is not known in practice, to compute the pessimistic safe set and an optimistic upper bound on constraint values. For the pessimistic safe set, we adopt the solution suggested in [6] and use the upper bound of the confidence interval to compute it (see L. 3 of Algorithm 1). While pessimism is crucial for safety, optimism is necessary for exploration. To this end, GoOSE computes an optimistic upper bound for the constraint value at input z, based on the lower bound of the confidence interval at another input x, as l_t^q(x, τ) + L_q d(x, z), where d(·, ·) is the metric that defines the Lipschitz continuity of q.
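A stripped-down executable sketch of the PSO routine in Algorithm 2 is shown below. It is written under our own simplifying assumptions: a box domain, a constant inertia weight, and generic `lcb` / `is_safe` callables standing in for the objective lower bound l_t^f and the safe-set membership check of L. 5; none of these names come from the paper:

```python
import numpy as np

def safe_pso(lcb, is_safe, lo, hi, m=30, n_iter=50, alpha=0.7, seed=0):
    """PSO restricted to a safe region: a particle's best is only updated
    when it passes the safety check (cf. L. 5-7 of Algorithm 2)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    p = rng.uniform(lo, hi, size=(m, len(lo)))   # positions
    v = rng.uniform(-0.1, 0.1, size=p.shape)     # velocities
    best_p, best_val = p.copy(), np.full(m, np.inf)
    for _ in range(n_iter):
        for i in range(m):
            if is_safe(p[i]) and lcb(p[i]) < best_val[i]:
                best_val[i], best_p[i] = lcb(p[i]), p[i].copy()
        g = best_p[np.argmin(best_val)]          # overall best position
        r1, r2 = rng.uniform(0.0, 2.0, size=2)
        v = alpha * v + r1 * (best_p - p) + r2 * (g - p)
        p = np.clip(p + v, lo, hi)               # stay inside the box
    return best_p[np.argmin(best_val)]
```

On a toy one-dimensional problem with a trivially safe domain, the swarm concentrates around the minimizer of the lower bound, which is the behavior Algorithm 2 relies on.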
Here, we approximate this bound with l_t^q(x, τ) + ||μ_t^∇(x, τ)|| d(x, z), where μ_t^∇(x, τ) is the mean of the posterior belief over the gradient of the constraint induced by our belief over the constraint, which, due to the properties of GPs, is also a GP. This is a local version of the approximation proposed in [33]. Based on this approximation, we want to determine whether an optimistic observation of the constraint at controller x for the current task τ would allow us to classify a controller z as safe, despite the uncertainty due to noisy observations of the constraint. To this end, we introduce the optimistic, noisy expansion operator

    g_t(x, z) = I( l_t^q(x, τ) + ||μ_t^∇(x, τ)|| d(x, z) ≤ κ ),

where I is the indicator function and κ is the upper limit of the constraint. For a safe x, g_t(x, z) = 1 indicates that: (i) z can plausibly be safe, and (ii) evaluating the constraint at x could include z in the safe set.

Optimization and optimistic safe set. Normally, GoOSE explicitly computes the optimistic safe set and uses GP-LCB [32] to determine the next input x* at which to evaluate the objective. In other words, GoOSE solves x_t* = arg min_{x ∈ S_t^o} μ_{t−1}^f(x) − β_t^f σ_{t−1}^f(x), where S_t^o is the optimistic safe set at iteration t. However, this requires a fine discretization of the domain X to represent S_t^o as a finite set of points in X, which does not scale to large domains. Moreover, the recursive computation of S_t^o is expensive and not well suited to the fast responses required by adaptive control. Similarly to [7], we rely here on particle swarm optimization (PSO) [34] to solve this optimization problem, which checks that the particles belong to the one-step optimistic safe set as the optimization progresses and thus avoids computing it explicitly. We initialize m particles positioned uniformly at random within the discretized pessimistic safe set with grid resolution Δx, with velocity Δx with random sign (L. 2 of Algorithm 2) and fitness equal to the lower bound of the objective l_t^f(·).
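Stepping back to the expansion operator defined above, it reduces to a simple indicator. In this sketch (our own illustration), `l_q` and `grad_norm` stand for l_t^q(x, τ) and ||μ_t^∇(x, τ)||, which in practice would come from the constraint GP and its derivative:

```python
import numpy as np

def g_t(l_q, grad_norm, x, z, kappa):
    """Optimistic noisy expansion operator: returns 1 if an observation at x
    could certify z as safe, i.e. l_q + ||grad mean|| * d(x, z) <= kappa."""
    d = np.linalg.norm(np.asarray(x, float) - np.asarray(z, float))
    return int(l_q + grad_norm * d <= kappa)
```

Intuitively, the gradient-mean norm plays the role of the unknown Lipschitz constant L_q, but only locally around x.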
If a particle belongs to the optimistic safe set (L. 5) and its fitness improves (L. 6), we update its best position. This step lets the particles diffuse into the optimistic safe set without computing it explicitly. Subsequently, we update the particles' positions (L. 11) and velocities (L. 10) based on the particles' best positions z_i^j and the overall best position z^j, which is updated in L. 8.

Algorithm. We are ready to present our variant of GoOSE for adaptive control. We start by computing the pessimistic safe set S_t (L. 3 of Algorithm 1) on a grid with lengthscale-dependent resolution Δx, its boundary L_t (L. 4), and the uncertain points on its boundary W_t (L. 5), which are used to determine whether controllers belong to the optimistic safe set. Based on these, a new suggestion x* is computed (L. 6). If its lower bound is close to the best observation for the current task, x†(τ) = arg min_{(x', τ', f(x')) ∈ D : τ' = τ} f(x'), we stop (L. 7). Otherwise, if the suggestion is safe, we evaluate it and possibly update the task parameter (L. 8). Finally, if we are not sure it is safe, we evaluate the expanders x_w in increasing order of distance from the suggestion x*, until either x* is in the pessimistic safe set and can be evaluated, or there are no expanders for x* and a new query to PSO is made (L. 11 and 12). During this inner loop, the task parameter is constantly updated.

VI. NUMERICAL RESULTS

We first apply Algorithm 1 to tune the controller in Fig. 2, simulating the system model in Section III in stationary conditions. Later, we use our method for adaptive control under instantaneous and slowly varying changes of the plant.

Fig. 4: Cost (left), safety constraint (center), and performance constraint (right) for the simulated adaptive control experiments with a sudden change of the moment of inertia m (upper) and a slow change of the rotational damping b (lower). The thin lines show the values for each experiment. The thick one shows the best cost found for the current task. The tables show the mean values over 10 repetitions of these experiments. GoOSE quickly finds optimal solutions and is able to adapt to both kinds of disturbances.

In Section IX, we present an ablation study that investigates the impact of the task parameter on these problems.

The optimization ranges of the controller parameters are set to Kp ∈ [5, 50], Kv ∈ [0.01, 0.11] for the position and velocity gains, and Ti ∈ [1, 10] for the time constant. For each task, GoOSE returns the controller corresponding to the best observation, x†(τ) = (Kp†, Kv†, Ti†). The cost f(x) is given in [deg · 10⁻³] units. We denote the 'true' optimal controller computed via grid search as x†_grid. Due to noise, not even this controller can achieve zero tracking error; see the table in Fig. 3.
We use a zero-mean prior and a squared exponential kernel with automatic relevance determination, with length scales [l_Kp, l_Kv, l_Ti, l_m, l_Kff, l_b̃] = [30, 0.03, 3, 0.5, 0.3, 5] for each dimension, identical for f, q1, and q2, both in the numerical simulations and in the experiments on the system. The likelihood variance is adjusted separately for each GP model.

Control for stationary conditions. In this section, we compare GoOSE to CBO for controller tuning [10] in terms of the cost and of the safety and performance constraints introduced in Section III, when tuning the plant under stationary conditions. We benchmark these methods against exhaustive grid-based evaluation using a grid with 5 × 11 × 10 points. For each algorithm, we run 10 independent experiments, which vary due to the noise injected in the simulation. Each experiment starts from the safe controller [Kp, Kv, Ti] = [15, 0.05, 3]. The table in Fig. 3 shows the median number of iterations needed to minimize the cost. Fig. 3 (left) shows the convergence of both algorithms for each repetition. While both algorithms converge to the optimum, GoOSE prevents constraint violations at all iterations (see Fig. 3). In contrast, CBO violates the constraints in 27.8% of the iterations to convergence, reaching far into the unsafe range beyond the safety limit of acceptable vibrations in the system (Fig. 3, center), showing that additional safety-related measures would be required. While the constraint violations incurred by CBO can be limited in stationary conditions by restricting the optimization range, this is not possible for adaptive control.

Adaptive control for instantaneous changes. We show how our method adapts to an instantaneous change in the load of the system. We modify the moment of inertia m and estimate it from the system data. We inform the algorithm of the operating conditions through a task parameter,

    τ_m = log10( (1/N) Σ_{i=1}^{N} |FFT(v − v_ff)|_i ),

which is calculated from the velocity measurement v, using the feed-forward signal v_ff from position to velocity.
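The data-driven task parameter τ_m can be computed directly from logged velocity data. A minimal sketch, following our own reading of the formula above (function name ours):

```python
import numpy as np

def task_parameter(v, v_ff):
    """tau_m = log10 of the mean FFT magnitude of the velocity residual v - v_ff."""
    resid = np.asarray(v, float) - np.asarray(v_ff, float)
    return float(np.log10(np.mean(np.abs(np.fft.fft(resid)))))
```

Task values within a small band of the current τ_m are then treated as the same task configuration, as described next.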
This data-driven task parameter exploits the differences in the levels of noise in the velocity signal corresponding to different values of the moment of inertia m. As the velocity also depends on the controller parameters, configurations within a range of ±0.15 of the current τ_m value are treated as the same task configuration. The algorithm is initialized with [Kp, Kv, Ti] = [15, 0.05, 3] as a safe seed for all tasks. The moment of inertia of the plant m is switched every 100 iterations. The table in Fig. 4 summarizes 10 repetitions of the experiment with a stopping criterion set to tol = 0.002. Fig. 4 (top left) shows the sampled cost and the task parameters for one run, and the running minimum for each task. We clearly see that GoOSE quickly finds a high-performing controller for the initial operating conditions (m = 0.0191 kgm²). Subsequently, it adapts Kp and Kv in a few iterations to the new regime (m = 0.0382 kgm²), due to the capability of the GP model to generalize across tasks (Fig. 4, top table). Finally, when the system goes back to the initial regime, GoOSE immediately finds a high-performing controller and manages to marginally improve over it. In particular, the best cost f(x†(τ_m)) found for each condition coincides with the optimum obtained by grid evaluation f(x†_grid(τ_m)). Moreover, Fig. 4 (top center-left and right) shows that the constraints are never violated.

Adaptive control for gradual changes. We show how our method adapts to slow changes in the dynamics. In particular, we let the rotational damping coefficient increase linearly with time, b(t) = b0 (1 + t/1000), where b0 = 30.08 kgm²/s. In this case, time is the task parameter and, therefore, we use the temporal kernel k(t, t') = (1 − ε)^(|t−t'|/2) [35] as the task kernel, with ε = 0.0001. This kernel increases the uncertainty of already evaluated samples over time. To incorporate the drift of the dynamics, we evaluate the best sample x† in Algorithm 1 with respect to the stopping criterion threshold tol = 0.001 in a moving window of the last 30 iterations. Learning is slower, due to the increased uncertainty of old data points. On average, the stopping criterion (L. 7 in Algorithm 1) is fulfilled in 74 out of 300 iterations. The optimum increases (Fig. 4, bottom left and table), corresponding to a change in the controller parameters. In the second half of the experiment, the cost decreases and reaches the optimum more often.

Fig. 5: Cost (left), safety constraint (center), performance constraint (right), and summary (table) for the real-world adaptive control experiments with a sudden change of the feed-forward gain Kff (upper) and of the rotational damping b̃ (lower). The thin lines show the values for each experiment. The thick one shows the best cost found for the current task. GoOSE quickly finds optimal controllers for all the regimes (left) without violating the constraints (center-right), even though the location of the optimum keeps changing (table).
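The temporal task kernel used above is one line of code. The sketch below (our own) shows how the correlation between two observations decays with the number of iterations between them, with the forgetting factor ε = 0.0001:

```python
def temporal_kernel(t1, t2, eps=1e-4):
    """Forgetting kernel k(t, t') = (1 - eps)^(|t - t'| / 2) from [35]."""
    return (1.0 - eps) ** (abs(t1 - t2) / 2.0)
```

Multiplying this with the spatial kernel, as in the multi-task construction of Section IV, inflates the posterior uncertainty of old data points, which is exactly what allows the optimizer to track the drifting damping coefficient.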
