Hence the HJB equation can be written as

$$\frac{\partial}{\partial t} V + \nabla V f - \frac{1}{2} \nabla V g g' \nabla V' + \ell = 0 \tag{11}$$

with optimal feedback controller

$$u^*(t, x) = -g(x)' \nabla V(t, x)'. \tag{12}$$

This means that the optimal control $u^*(\cdot) \in \mathcal{U}$ is given by

$$u^*(t) = u^*(t, x^*(t)), \quad t_0 \le t \le t_1.$$

Of course, this makes sense only when $V$ is sufficiently smooth. The equation (11) is sometimes referred to as a nonlinear Riccati equation.

LQR. Take $U = \mathbf{R}^m$, $f(x, u) = Ax + Bu$, $L(x, u) = \frac{1}{2}|x|^2 + \frac{1}{2}|u|^2$, $\psi(x) = \frac{1}{2} x' \Psi x$. As a trial solution of (11) we use $\tilde{V}(t, x) = \frac{1}{2} x' P(t) x$, where $P(t) \ge 0$ (symmetric) is to be determined. Now

$$\frac{\partial}{\partial t} \tilde{V}(t, x) = \frac{1}{2} x' \dot{P}(t) x, \quad \text{and} \quad \nabla \tilde{V}(t, x) = x' P(t).$$

Plugging these into (11) gives

$$\frac{1}{2} x' \dot{P}(t) x + x' P(t) A x - \frac{1}{2} x' P(t) B B' P(t) x + \frac{1}{2} x' x = 0.$$

Since this holds for all $x \in \mathbf{R}^n$, and $x' P(t) A x = \frac{1}{2} x' (A' P(t) + P(t) A) x$, we must have

$$\dot{P}(t) + A' P(t) + P(t) A - P(t) B B' P(t) + I = 0. \tag{13}$$

At time $t = t_1$ we have $\tilde{V}(t_1, x) = \frac{1}{2} x' \Psi x$, and so

$$P(t_1) = \Psi. \tag{14}$$

Therefore if there exists a $C^1$ solution $P(t)$ to the Riccati differential equation (13) on $[t_0, t_1]$ with terminal condition (14), we obtain a smooth solution $\frac{1}{2} x' P(t) x$ to (7), (8), and as argued above the value function for the LQR problem is given by

$$V(t, x) = \frac{1}{2} x' P(t) x. \tag{15}$$

The optimal feedback controller is given by

$$u^*(t, x) = -B' P(t) x. \tag{16}$$

This gives the optimal control $u^*(\cdot) \in \mathcal{U}$:

$$u^*(t) = -B' P(t) x^*(t), \quad t_0 \le t \le t_1. \tag{17}$$
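As a numerical illustration of (13), (14) and (16), the following is a minimal sketch rather than a definitive implementation. It assumes NumPy is available and integrates the Riccati differential equation backward from the terminal condition using a hand-written classical Runge-Kutta (RK4) scheme. The function names `riccati_rhs`, `solve_riccati` and `feedback`, the step count, and the double-integrator matrices are hypothetical choices made for this example and do not come from the notes.

```python
import numpy as np


def riccati_rhs(P, A, B):
    """Equation (13) solved for dP/dt: -(A'P + PA - PBB'P + I)."""
    n = A.shape[0]
    return -(A.T @ P + P @ A - P @ B @ B.T @ P + np.eye(n))


def solve_riccati(A, B, Psi, t0, t1, steps=2000):
    """Integrate (13) backward in time from P(t1) = Psi using RK4.

    Returns an increasing time grid and the matching list of P(t),
    so the first entry approximates P(t0) and the last equals Psi.
    """
    h = (t1 - t0) / steps
    P = np.array(Psi, dtype=float)
    Ps = [P]
    for _ in range(steps):
        # One RK4 step from t to t - h (step size -h; the ODE is autonomous).
        k1 = riccati_rhs(P, A, B)
        k2 = riccati_rhs(P - 0.5 * h * k1, A, B)
        k3 = riccati_rhs(P - 0.5 * h * k2, A, B)
        k4 = riccati_rhs(P - h * k3, A, B)
        P = P - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        P = 0.5 * (P + P.T)  # remove round-off asymmetry
        Ps.append(P)
    ts = np.linspace(t0, t1, steps + 1)
    return ts, Ps[::-1]


def feedback(P, B, x):
    """Optimal feedback (16): u*(t, x) = -B' P(t) x."""
    return -B.T @ P @ x


# Hypothetical example: double integrator on [0, 5] with zero terminal cost.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
ts, Ps = solve_riccati(A, B, np.zeros((2, 2)), t0=0.0, t1=5.0)
u0 = feedback(Ps[0], B, np.array([1.0, 0.0]))  # control at t0 for x = (1, 0)
```

A quick sanity check is the scalar case $A = 0$, $B = 1$, $\Psi = 0$, where (13) reduces to $\dot{p} = p^2 - 1$ with $p(t_1) = 0$ and has the closed-form solution $p(t) = \tanh(t_1 - t)$. The sketch should reproduce this to high accuracy. A library ODE solver could replace the RK4 loop; it is written out here only to keep the example self-contained.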