Topic #22

16.31 Feedback Control

Deterministic LQR
• Optimal control and the Riccati equation
• Lagrange multipliers
• The Hamiltonian matrix and the symmetric root locus

Factoids: for symmetric R (verified numerically below),
$$\frac{\partial\,(u^T R u)}{\partial u} = 2u^T R, \qquad \frac{\partial\,(R u)}{\partial u} = R$$

Copyright 2001 by Jonathan How.
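These gradient identities are easy to spot-check numerically. Below is a minimal Python sketch (not from the notes; the matrix size, random seed, and tolerance are arbitrary choices) that compares a central finite-difference gradient of uᵀRu against 2uᵀR for a random symmetric R.

```python
import numpy as np

# Numerical spot-check of d(u'Ru)/du = 2u'R for symmetric R,
# using central finite differences (sizes/tolerances arbitrary).
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
R = M + M.T                       # make R symmetric
u = rng.standard_normal(n)

f = lambda v: v @ R @ v           # scalar u'Ru
eps = 1e-6
I = np.eye(n)
grad_fd = np.array([(f(u + eps * I[i]) - f(u - eps * I[i])) / (2 * eps)
                    for i in range(n)])

# 2u'R is a row vector; as a column it equals 2Ru since R = R'.
assert np.allclose(grad_fd, 2 * R @ u, atol=1e-5)
print("d(u'Ru)/du = 2u'R verified numerically")
```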
Fall 2001 16.31 22—1

Linear Quadratic Regulator (LQR)

• We have seen the solutions to the LQR problem using the symmetric root locus, which defines the location of the closed-loop poles.
— Linear full-state feedback control.
— Would like to demonstrate from first principles that this is the optimal form of the control.

• Deterministic Linear Quadratic Regulator

Plant:
$$\dot{x}(t) = A(t)x(t) + B_u(t)u(t), \quad x(t_0) = x_0$$
$$z(t) = C_z(t)x(t)$$

Cost:
$$J_{LQR} = \int_{t_0}^{t_f} \left[\, z^T(t) R_{zz}(t) z(t) + u^T(t) R_{uu}(t) u(t) \,\right] dt + x^T(t_f) P_{t_f} x(t_f)$$

— Where Ptf ≥ 0, Rzz(t) > 0, and Ruu(t) > 0.
— Define Rxx = CzᵀRzzCz ≥ 0.
— A(t) is a continuous function of time.
— Bu(t), Cz(t), Rzz(t), Ruu(t) are piecewise continuous functions of time, and all are bounded.

• Problem Statement: Find the input u(t) ∀t ∈ [t0, tf] to minimize JLQR (the problem data and assumptions are collected in the sketch below).
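For the time-invariant case used in the rest of the notes, the problem data and the sign assumptions above can be collected and checked directly. A minimal sketch; the matrix values are placeholders chosen to match the example on page 22—7, not part of the problem statement itself.

```python
import numpy as np

# Placeholder problem data for the time-invariant LQR setup:
#   xdot = A x + Bu u,   z = Cz x
#   J = integral of (z'Rzz z + u'Ruu u) dt + x(tf)' Ptf x(tf)
A   = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu  = np.array([[0.0], [1.0]])
Cz  = np.array([[1.0, 0.0]])
Rzz = np.array([[1.0]])
Ruu = np.array([[1.0]])
Ptf = np.zeros((2, 2))

Rxx = Cz.T @ Rzz @ Cz             # Rxx = Cz' Rzz Cz

# Standing assumptions: Rzz > 0, Ruu > 0, Rxx >= 0, Ptf >= 0
assert np.all(np.linalg.eigvalsh(Rzz) > 0)
assert np.all(np.linalg.eigvalsh(Ruu) > 0)
assert np.all(np.linalg.eigvalsh(Rxx) >= -1e-12)
assert np.all(np.linalg.eigvalsh(Ptf) >= -1e-12)
```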
Fall 2001 16.31 22—2

• Note that this is the most general form of the LQR problem — we rarely need this level of generality, and often suppress the time dependence of the matrices.
— Aircraft landing problem.

• To optimize the cost, we follow the procedure of augmenting the constraints in the problem (the system dynamics) to the cost (integrand) to form the Hamiltonian:
$$H = \frac{1}{2}\left( x^T(t) R_{xx}\, x(t) + u^T(t) R_{uu}\, u(t) \right) + \lambda^T(t)\left( A x(t) + B_u u(t) \right)$$
— λ(t) ∈ ℝⁿˣ¹ is called the adjoint variable or costate.
— It is the Lagrange multiplier in the problem.

• From Stengel (p. 427), the necessary and sufficient conditions for optimality (checked symbolically in the sketch below) are that:
1. $\dot{\lambda}(t) = -\left(\partial H / \partial x\right)^T = -R_{xx}\, x(t) - A^T \lambda(t)$
2. $\lambda(t_f) = P_{t_f} x(t_f)$
3. $\partial H / \partial u = 0 \;\Rightarrow\; R_{uu} u + B_u^T \lambda(t) = 0$, so $u_{opt} = -R_{uu}^{-1} B_u^T \lambda(t)$
4. $\partial^2 H / \partial u^2 \ge 0$ (need to check that $R_{uu} > 0$)
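Conditions 3 and 4 can be verified symbolically for a small instance. A minimal sympy sketch, assuming a hypothetical 2-state, 1-input system with diagonal Rxx (all symbol names are illustrative only, not from the notes):

```python
import sympy as sp

# Symbolic check of conditions 3 and 4 for a 2-state, 1-input case.
x1, x2, l1, l2, u1 = sp.symbols('x1 x2 l1 l2 u1')
q1, q2 = sp.symbols('q1 q2')
r = sp.symbols('r', positive=True)
a = sp.symbols('a11 a12 a21 a22')
b1, b2 = sp.symbols('b1 b2')

x, lam, u = sp.Matrix([x1, x2]), sp.Matrix([l1, l2]), sp.Matrix([u1])
A = sp.Matrix(2, 2, list(a))
Bu = sp.Matrix([b1, b2])
Rxx, Ruu = sp.diag(q1, q2), sp.Matrix([[r]])

# H = 1/2 (x'Rxx x + u'Ruu u) + lam'(A x + Bu u)
H = (sp.Rational(1, 2) * (x.T * Rxx * x + u.T * Ruu * u)
     + lam.T * (A * x + Bu * u))[0]

u_opt = sp.solve(sp.diff(H, u1), u1)[0]       # stationary point in u
expected = (-Ruu.inv() * Bu.T * lam)[0]       # -Ruu^-1 Bu' lam
assert sp.simplify(u_opt - expected) == 0     # condition 3
assert sp.diff(H, u1, 2) == r                 # condition 4: d2H/du2 = Ruu > 0
```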
Fall 2001 16.31 Optimization-1

• This control design problem is a constrained optimization, with the constraints being the dynamics of the system.

• The standard way of handling the constraints in an optimization is to add them to the cost using a Lagrange multiplier.
— Results in an unconstrained optimization.

• Example: minimize f(x, y) = x² + y² subject to the constraint that c(x, y) = x + y + 2 = 0.

[Figure 1: Optimization results, plotted in the (x, y) plane.]

• Clearly the unconstrained minimum is at x = y = 0.
Fall 2001 16.31 Optimization-2

• To find the constrained minimum, form the augmented cost function
$$L \triangleq f(x, y) + \lambda\, c(x, y) = x^2 + y^2 + \lambda(x + y + 2)$$
— Where λ is the Lagrange multiplier.

• Note that if the constraint is satisfied, then L ≡ f.

• The solution approach without constraints is to find the stationary point of f(x, y) (∂f/∂x = ∂f/∂y = 0).
— With constraints we find the stationary points of L:
$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial \lambda} = 0$$
which gives
$$\frac{\partial L}{\partial x} = 2x + \lambda = 0, \qquad \frac{\partial L}{\partial y} = 2y + \lambda = 0, \qquad \frac{\partial L}{\partial \lambda} = x + y + 2 = 0$$

• This gives 3 equations in 3 unknowns; solve to find x* = y* = −1 (a numerical version is sketched below).

• The key point here is that, due to the constraint, the selections of x and y during the minimization are not independent.
— The Lagrange multiplier captures this dependency.

• The LQR optimization follows the same path as this, but it is complicated by the fact that the cost involves an integration over time.
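Since the stationarity conditions above are linear in (x, y, λ), they can be solved as a single linear system. A minimal numerical sketch:

```python
import numpy as np

# Stationary-point conditions of L = x^2 + y^2 + lam*(x + y + 2):
#   2x + lam = 0,  2y + lam = 0,  x + y + 2 = 0
# Linear in (x, y, lam), so solve M v = b directly.
M = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, -2.0])
x, y, lam = np.linalg.solve(M, b)
print(x, y, lam)   # -1.0 -1.0 2.0
```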
Fall 2001 16.31 22—3

• Note that we now have:
$$\dot{x}(t) = A x(t) + B_u u_{opt}(t) = A x(t) - B_u R_{uu}^{-1} B_u^T \lambda(t), \quad x(t_0) = x_0$$

• So combine with the equation for the adjoint variable,
$$\dot{\lambda}(t) = -R_{xx}\, x(t) - A^T \lambda(t) = -C_z^T R_{zz} C_z\, x(t) - A^T \lambda(t),$$
to get:
$$\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix} \begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}$$
which of course is the Hamiltonian matrix again (assembled numerically in the sketch below).

• Note that the dynamics of x(t) and λ(t) are coupled, but x(t) is known initially and λ(t) is known at the terminal time, since λ(tf) = Ptf x(tf).
— This is a two-point boundary value problem that is very hard to solve in general.

• However, in this case, we can introduce a new matrix variable P(t) and show that:
1. λ(t) = P(t)x(t)
2. It is relatively easy to find P(t).
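Numerically, the 2n × 2n Hamiltonian matrix is just a block matrix assembled from the problem data. A minimal sketch, reusing the placeholder matrices from earlier (the helper name `hamiltonian` is my own):

```python
import numpy as np
from numpy.linalg import inv

def hamiltonian(A, Bu, Cz, Rzz, Ruu):
    """Assemble the 2n x 2n Hamiltonian matrix
    [[A, -Bu Ruu^-1 Bu'], [-Cz' Rzz Cz, -A']]."""
    top = np.hstack([A, -Bu @ inv(Ruu) @ Bu.T])
    bot = np.hstack([-Cz.T @ Rzz @ Cz, -A.T])
    return np.vstack([top, bot])

A   = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu  = np.array([[0.0], [1.0]])
Cz  = np.array([[1.0, 0.0]])
Rzz = np.array([[1.0]])
Ruu = np.array([[1.0]])
H = hamiltonian(A, Bu, Cz, Rzz, Ruu)
# Hamiltonian eigenvalues are symmetric about the imaginary axis
# (± pairs) -- the structure behind the symmetric root locus.
print(np.linalg.eigvals(H))
```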
Fall 2001 16.31 22—4

• How to proceed?

1. For the 2n system
$$\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix} \begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}$$
define a transition matrix
$$F(t_1, t_0) = \begin{bmatrix} F_{11}(t_1, t_0) & F_{12}(t_1, t_0) \\ F_{21}(t_1, t_0) & F_{22}(t_1, t_0) \end{bmatrix}$$
and use this to relate x(t) to x(tf) and λ(tf):
$$\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = \begin{bmatrix} F_{11}(t, t_f) & F_{12}(t, t_f) \\ F_{21}(t, t_f) & F_{22}(t, t_f) \end{bmatrix} \begin{bmatrix} x(t_f) \\ \lambda(t_f) \end{bmatrix}$$
so
$$x(t) = F_{11}(t, t_f)\, x(t_f) + F_{12}(t, t_f)\, \lambda(t_f) = \left[ F_{11}(t, t_f) + F_{12}(t, t_f) P_{t_f} \right] x(t_f)$$

2. Now find λ(t) in terms of x(tf):
$$\lambda(t) = \left[ F_{21}(t, t_f) + F_{22}(t, t_f) P_{t_f} \right] x(t_f)$$

3. Eliminate x(tf) to get (evaluated numerically in the sketch below):
$$\lambda(t) = \left[ F_{21}(t, t_f) + F_{22}(t, t_f) P_{t_f} \right] \left[ F_{11}(t, t_f) + F_{12}(t, t_f) P_{t_f} \right]^{-1} x(t) \triangleq P(t)\, x(t)$$
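For the time-invariant case, F(t, tf) is the matrix exponential e^{H(t−tf)} of the Hamiltonian matrix, so step 3 can be evaluated directly. A minimal sketch (the helper name is my own; note this route is numerically fragile over long horizons, since the exponential grows along the unstable eigendirections of H, which is why the backward Riccati sweep on the next page is preferred in practice):

```python
import numpy as np
from scipy.linalg import expm

def P_of_t(H, Ptf, t, tf, n):
    """P(t) = [F21 + F22 Ptf][F11 + F12 Ptf]^-1 with
    F(t, tf) = expm(H (t - tf)) for the LTI Hamiltonian H."""
    F = expm(H * (t - tf))
    F11, F12 = F[:n, :n], F[:n, n:]
    F21, F22 = F[n:, :n], F[n:, n:]
    return (F21 + F22 @ Ptf) @ np.linalg.inv(F11 + F12 @ Ptf)

# Example: with the H assembled by hamiltonian() above,
# Ptf = 0 and tf = 10, evaluate P at t = 5:
# P5 = P_of_t(H, np.zeros((2, 2)), 5.0, 10.0, 2)
```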
Fall 2001 16.31 22—5

4. Now, since λ(t) = P(t)x(t), differentiating gives
$$\dot{\lambda}(t) = \dot{P}(t)x(t) + P(t)\dot{x}(t)$$
so
$$\begin{aligned}
-C_z^T R_{zz} C_z\, x(t) - A^T \lambda(t) &= \dot{P}(t)x(t) + P(t)\dot{x}(t) \\
\Rightarrow\quad -\dot{P}(t)x(t) &= C_z^T R_{zz} C_z\, x(t) + A^T \lambda(t) + P(t)\dot{x}(t) \\
&= C_z^T R_{zz} C_z\, x(t) + A^T \lambda(t) + P(t)\left(A x(t) - B_u R_{uu}^{-1} B_u^T \lambda(t)\right) \\
&= \left(C_z^T R_{zz} C_z + P(t)A\right) x(t) + \left(A^T - P(t) B_u R_{uu}^{-1} B_u^T\right) \lambda(t) \\
&= \left(C_z^T R_{zz} C_z + P(t)A\right) x(t) + \left(A^T - P(t) B_u R_{uu}^{-1} B_u^T\right) P(t)\, x(t) \\
&= \left[A^T P(t) + P(t)A + C_z^T R_{zz} C_z - P(t) B_u R_{uu}^{-1} B_u^T P(t)\right] x(t)
\end{aligned}$$

• This must be true for arbitrary x(t), so P(t) must satisfy
$$-\dot{P}(t) = A^T P(t) + P(t)A + C_z^T R_{zz} C_z - P(t) B_u R_{uu}^{-1} B_u^T P(t)$$
— Which is a matrix differential Riccati equation.

• The optimal value of P(t) is found by solving this equation backwards in time from tf, with P(tf) = Ptf; a numerical backward sweep is sketched below.
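The backward sweep is straightforward numerically: integrate the Riccati equation from tf down to t0, flattening P into a vector for the ODE solver. A minimal sketch with the placeholder data from before (solver tolerances are arbitrary choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

def riccati_rhs(t, p_flat, A, Bu, Rxx, Ruu_inv):
    """Pdot = -(A'P + PA + Rxx - P Bu Ruu^-1 Bu' P)."""
    n = A.shape[0]
    P = p_flat.reshape(n, n)
    Pdot = -(A.T @ P + P @ A + Rxx - P @ Bu @ Ruu_inv @ Bu.T @ P)
    return Pdot.ravel()

A   = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu  = np.array([[0.0], [1.0]])
Rxx = np.array([[1.0, 0.0], [0.0, 0.0]])   # Cz'RzzCz with Cz = [1 0], q = 1
Ruu_inv = np.array([[1.0]])                # r = 1
Ptf = np.zeros((2, 2))
tf, t0 = 10.0, 0.0

# Integrate backwards: solve_ivp accepts a decreasing time span.
sol = solve_ivp(riccati_rhs, (tf, t0), Ptf.ravel(),
                args=(A, Bu, Rxx, Ruu_inv),
                rtol=1e-8, atol=1e-10, dense_output=True)
P0 = sol.y[:, -1].reshape(2, 2)            # P(t0)
print(P0)
```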
Fall 2001 16.31 22—6

• The control gains are then
$$u_{opt} = -R_{uu}^{-1} B_u^T \lambda(t) = -R_{uu}^{-1} B_u^T P(t)\, x(t) = -K(t)\, x(t)$$
— Where $K(t) \triangleq R_{uu}^{-1} B_u^T P(t)$

• Note that x(t) and λ(t) together define the closed-loop dynamics for the system (and its adjoint), but we can eliminate λ(t) from the solution by introducing P(t), which solves a Riccati equation.

• The optimal control input is in fact a linear full-state feedback control.

• Note that normally we are interested in problems with t0 = 0 and tf = ∞, in which case we can just use the steady-state value of P that solves (assuming (A, Bu) is stabilizable)
$$A^T P + PA + C_z^T R_{zz} C_z - P B_u R_{uu}^{-1} B_u^T P = 0$$
which is the algebraic Riccati equation (ARE); a numerical solve is sketched below.
— If we use the steady-state value of P, then K is constant.
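For the infinite-horizon case, scipy provides a direct ARE solver, so the steady-state P and the constant gain K take only a few lines. A minimal sketch with the same placeholder data; the printed gain matches the Klqr = [1 0.73] quoted in Figure 2 on the next page.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A   = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu  = np.array([[0.0], [1.0]])
Rxx = np.array([[1.0, 0.0], [0.0, 0.0]])   # Cz' Rzz Cz
Ruu = np.array([[1.0]])

# Steady-state P from A'P + PA + Rxx - P Bu Ruu^-1 Bu' P = 0
P = solve_continuous_are(A, Bu, Rxx, Ruu)
K = np.linalg.inv(Ruu) @ Bu.T @ P          # constant gain K = Ruu^-1 Bu' P
print(K)                                   # approx [[1.0, 0.732]]
# Closed-loop poles of A - Bu K lie in the open left half plane:
print(np.linalg.eigvals(A - Bu @ K))
```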
Fall 2001 16.31 22—7

• Example: simple system with t0 = 0 and tf = 10 sec.
$$\dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u$$
$$J = x^T(10) \begin{bmatrix} 0 & 0 \\ 0 & h \end{bmatrix} x(10) + \int_0^{10} \left[\, x^T(t) \begin{bmatrix} q & 0 \\ 0 & 0 \end{bmatrix} x(t) + r\, u^2(t) \right] dt$$

• Compute gains using both the time-varying P(t) and the steady-state value.

• Find the state solution for x(0) = [1 1]ᵀ using both sets of gains (a numerical version is sketched below).

[Figure 2: Set q = 1, r = 1, h = 10, Klqr = [1 0.73]. Four panels vs. time (sec): the dynamic gains K1(t), K2(t), the static gains K1, K2, and the states x1, x2 under each set of gains.]
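The comparison in Figure 2 can be reproduced by sweeping the differential Riccati equation backwards from P(10) = diag(0, h) and then simulating the closed loop from x(0) = [1 1]ᵀ with both the time-varying and the static gains. A minimal sketch that reuses the `riccati_rhs` helper defined above (plotting omitted):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

A  = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu = np.array([[0.0], [1.0]])
q, r, h = 1.0, 1.0, 10.0
Rxx = np.diag([q, 0.0])
Ptf = np.diag([0.0, h])
t0, tf = 0.0, 10.0

# Backward Riccati sweep; dense_output lets us evaluate P at any t.
Psol = solve_ivp(riccati_rhs, (tf, t0), Ptf.ravel(),
                 args=(A, Bu, Rxx, np.array([[1.0 / r]])),
                 rtol=1e-8, atol=1e-10, dense_output=True)

def K_dyn(t):
    P = Psol.sol(t).reshape(2, 2)
    return (Bu.T @ P) / r          # K(t) = r^-1 Bu' P(t)

K_static = Bu.T @ solve_continuous_are(A, Bu, Rxx, np.array([[r]])) / r

# Closed-loop simulations from x(0) = [1, 1]'.
x0 = np.array([1.0, 1.0])
dyn = solve_ivp(lambda t, x: (A - Bu @ K_dyn(t)) @ x, (t0, tf), x0,
                max_step=0.05)
sta = solve_ivp(lambda t, x: (A - Bu @ K_static) @ x, (t0, tf), x0,
                max_step=0.05)
```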