Q Function


Define a new function $Q$, very similar to $V^*$:

$$Q(s,a) = r(s,a) + \gamma V^*(\delta(s,a))$$

If the agent learns $Q$, it can choose the optimal action even without knowing $\delta$!

$$\pi^*(s) = \arg\max_a \left[ r(s,a) + \gamma V^*(\delta(s,a)) \right]$$

$$\pi^*(s) = \arg\max_a Q(s,a)$$

$Q$ is the evaluation function the agent will learn.
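The relationship between Q, the optimal value function, and the greedy policy can be checked on a toy problem. The sketch below is only an illustration under stated assumptions: the three-state chain world, the reward of 100 for reaching the goal, the discount of 0.9, and every function name are hypothetical and do not come from the slides. Value iteration is used here purely to obtain the optimal values for the demonstration; it is not presented as the learning procedure the slides go on to describe.

```python
# Hypothetical deterministic chain world: states 0, 1, 2, where 2 is an absorbing goal.
GAMMA = 0.9
STATES = [0, 1, 2]
ACTIONS = ["left", "right"]


def delta(s, a):
    """Deterministic transition function delta(s, a) (assumed known only for this demo)."""
    if s == 2:  # the goal is absorbing
        return 2
    return min(s + 1, 2) if a == "right" else max(s - 1, 0)


def reward(s, a):
    """Immediate reward r(s, a): 100 for entering the goal, 0 otherwise."""
    return 100.0 if s != 2 and delta(s, a) == 2 else 0.0


def value_iteration(n_sweeps=100):
    """Approximate the optimal value function by repeated Bellman backups."""
    v = {s: 0.0 for s in STATES}
    for _ in range(n_sweeps):
        v = {
            s: max(reward(s, a) + GAMMA * v[delta(s, a)] for a in ACTIONS)
            for s in STATES
        }
    return v


V_STAR = value_iteration()

# Q(s, a) = r(s, a) + gamma * V*(delta(s, a)), stored as a table.
Q_TABLE = {
    (s, a): reward(s, a) + GAMMA * V_STAR[delta(s, a)]
    for s in STATES
    for a in ACTIONS
}


def greedy_from_model(s):
    """pi*(s) computed from r, delta and V*: needs the model."""
    return max(ACTIONS, key=lambda a: reward(s, a) + GAMMA * V_STAR[delta(s, a)])


def greedy_from_q(s):
    """pi*(s) computed from the Q table alone: no call to reward() or delta()."""
    return max(ACTIONS, key=lambda a: Q_TABLE[(s, a)])


if __name__ == "__main__":
    for s in STATES:
        print(s, greedy_from_model(s), greedy_from_q(s))
```

Once the Q table exists, `greedy_from_q` picks actions by table lookup only, which is the point of the slide: an agent that has learned Q does not need r or the transition function to act optimally.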


Download: "Machine Learning" presentation slides (15)