Combining these displays one is formally led to (7). A proof of (7) when $V$ is sufficiently smooth requires a careful derivation of two inequalities which combine to give (7). Below we will prove that $V$ is a viscosity solution of (7); in fact, the unique one satisfying the terminal condition (8).

Verification. Let $\tilde V(t,x)$ be a $C^1$ solution of (7), (8). Let $u(\cdot) \in U_{t_0,t_1}$ be any control.
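The inequality used in the next display can be seen pointwise. The following is a sketch assuming (7) is the HJB equation in the standard form $\partial_t \tilde V + \inf_{v\in U}\{\nabla \tilde V \cdot f(x,v) + L(x,v)\} = 0$:

```latex
% For any fixed control value u \in U, the infimum over v is a lower bound:
\[
0 = \frac{\partial \tilde V}{\partial t}(t,x)
  + \inf_{v\in U}\Bigl\{\nabla \tilde V(t,x)\cdot f(x,v) + L(x,v)\Bigr\}
  \le \frac{\partial \tilde V}{\partial t}(t,x)
  + \nabla \tilde V(t,x)\cdot f(x,u) + L(x,u),
\]
% and rearranging gives
\[
\frac{\partial \tilde V}{\partial t}(t,x)
  + \nabla \tilde V(t,x)\cdot f(x,u) \ge -L(x,u),
\]
% which, evaluated along a trajectory x(t) driven by u(t), is the inequality below.
```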
Then using (7),
$$\frac{d}{dt}\tilde V(t,x(t)) = \frac{\partial}{\partial t}\tilde V(t,x(t)) + \nabla \tilde V(t,x(t))\,\dot x(t) = \frac{\partial}{\partial t}\tilde V(t,x(t)) + \nabla \tilde V(t,x(t))\,f(x(t),u(t)) \ge -L(x(t),u(t)).$$
Integrating, we get
$$\tilde V(t_1,x(t_1)) - \tilde V(t_0,x_0) \ge -\int_{t_0}^{t_1} L(x(t),u(t))\,dt,$$
or
$$\tilde V(t_0,x_0) \le \int_{t_0}^{t_1} L(x(t),u(t))\,dt + \tilde V(t_1,x(t_1)) = \int_{t_0}^{t_1} L(x(t),u(t))\,dt + \psi(x(t_1)),$$
using (8). This shows that $\tilde V(t_0,x_0) \le V(t_0,x_0)$ ($V$ is the value function defined by (5)).

Now this same calculation for the control $u(\cdot) = u^*(\cdot) \in U_{t_0,t_1}$ satisfying
$$u^*(t) \in \operatorname*{argmin}_{v\in U}\Bigl\{\nabla_x \tilde V(t,x^*(t)) \cdot f(x^*(t),v) + L(x^*(t),v)\Bigr\}, \qquad (10)$$
for $t \in [t_0,t_1]$, where $x^*(\cdot)$ is the corresponding state trajectory, gives
$$\tilde V(t_0,x_0) = \int_{t_0}^{t_1} L(x^*(t),u^*(t))\,dt + \psi(x^*(t_1)),$$
showing that in fact $u^*$ is optimal and $\tilde V(t_0,x_0) = V(t_0,x_0)$. Indeed we have $\tilde V = V$ in $[t_0,t_1] \times \mathbf{R}^n$ by this argument, and so we have shown that any smooth solution to (7), (8) must equal the value function; this is a uniqueness result. Unfortunately, in general there may be no such smooth solutions.

Optimal feedback. The above calculations suggest how one might obtain an optimal feedback controller. To simplify a bit, suppose that $U = \mathbf{R}^m$, $f(x,u) = f(x) + g(x)u$, and $L(x,u) = \ell(x) + \frac{1}{2}|u|^2$. Then evaluating the infimum in (9) gives $u^* = -g(x)'\lambda'$ and
$$H(x,\lambda) = \lambda f(x) - \tfrac{1}{2}\lambda g(x)g(x)'\lambda' + \ell(x).$$
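The formulas for $u^*$ and $H$ can be checked numerically. The sketch below (not from the notes; the values of $x$, $f(x)$, $g(x)$, and $\ell(x)$ are arbitrary illustrative data) verifies that $u^* = -g(x)'\lambda'$ minimizes $\lambda\,(f(x)+g(x)u) + \ell(x) + \frac{1}{2}|u|^2$ over $u$, and that the minimum value equals $\lambda f(x) - \frac{1}{2}\lambda g(x)g(x)'\lambda' + \ell(x)$:

```python
import numpy as np

# Illustrative dimensions and sample data (hypothetical, used only to check the algebra).
rng = np.random.default_rng(0)
n, m = 3, 2
lam = rng.standard_normal(n)       # the row covector lambda, stored as a 1-D array
fx = rng.standard_normal(n)        # f(x): an arbitrary drift value at some point x
gx = rng.standard_normal((n, m))   # g(x): the input matrix at x
ell = 1.5                          # ell(x): an arbitrary running-cost value at x

def cost(u):
    """The quantity being minimized over u: lambda (f(x) + g(x)u) + ell(x) + |u|^2 / 2."""
    return lam @ (fx + gx @ u) + ell + 0.5 * (u @ u)

# Candidate minimizer from the notes: u* = -g(x)' lambda'.
u_star = -gx.T @ lam

# Claimed infimum: H(x, lambda) = lambda f(x) - (1/2) lambda g(x) g(x)' lambda' + ell(x).
H = lam @ fx - 0.5 * (lam @ gx @ gx.T @ lam) + ell

# The candidate attains the claimed value ...
assert np.isclose(cost(u_star), H)

# ... and it is a minimizer: the cost is a convex quadratic in u,
# so any perturbation can only increase it.
for _ in range(100):
    v = u_star + 0.1 * rng.standard_normal(m)
    assert cost(v) >= cost(u_star) - 1e-12
```

Setting the gradient in $u$ of $\lambda g(x)u + \frac{1}{2}|u|^2$ to zero gives $g(x)'\lambda' + u = 0$, which is exactly the $u^*$ checked above; substituting back yields the $-\frac{1}{2}\lambda g(x)g(x)'\lambda'$ term in $H$.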