Training Rule to Learn Q


Note that Q and $V^*$ are closely related:

$$V^*(s) = \max_{a'} Q(s, a')$$

which allows us to write Q recursively as

$$Q(s_t, a_t) = r(s_t, a_t) + \gamma V^*(\delta(s_t, a_t)) = r(s_t, a_t) + \gamma \max_{a'} Q(s_{t+1}, a')$$

Nice! Let $\hat{Q}$ denote the learner's current approximation to Q. Consider the training rule

$$\hat{Q}(s, a) \leftarrow r + \gamma \max_{a'} \hat{Q}(s', a')$$

where $s'$ is the state resulting from applying action a in state s.
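The training rule above can be tried out on a toy problem. The following is a minimal sketch, not the slide author's code. It assumes a deterministic world and applies the discount factor to the max term, as in the recursive definition of Q. The four-state chain, the reward of 100 for entering the goal, the value 0.9 for the discount factor, and the names `delta`, `reward`, and `learn_q` are all hypothetical choices made for illustration.

```python
import random

GAMMA = 0.9           # discount factor (assumed value)
N_STATES = 4          # hypothetical chain world: states 0..3
GOAL = N_STATES - 1   # state 3 is an absorbing goal state
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def delta(s, a):
    """Deterministic transition: the state reached by taking action a in state s."""
    if s == GOAL:
        return GOAL
    return max(s - 1, 0) if a == 0 else s + 1


def reward(s, a):
    """Immediate reward r(s, a): 100 for stepping into the goal, 0 otherwise."""
    return 100.0 if s != GOAL and delta(s, a) == GOAL else 0.0


def learn_q(episodes=200, seed=0):
    """Learn a table of Q estimates using the training rule from the slide."""
    rng = random.Random(seed)
    # Initialise every table entry to zero.
    q_hat = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(GOAL)          # start in a random non-goal state
        while s != GOAL:
            a = rng.choice(ACTIONS)      # explore with uniformly random actions
            r = reward(s, a)
            s_next = delta(s, a)
            # Training rule: Q_hat(s, a) <- r + gamma * max_a' Q_hat(s', a')
            q_hat[(s, a)] = r + GAMMA * max(q_hat[(s_next, b)] for b in ACTIONS)
            s = s_next
    return q_hat


if __name__ == "__main__":
    for (s, a), value in sorted(learn_q().items()):
        print(f"Q_hat(s={s}, a={a}) = {value:.1f}")
```

In this chain the estimates for moving right settle at roughly 100, 90, and 81 for states 2, 1, and 0, each one the reward plus 0.9 times the best estimate in the successor state. The update has no learning rate because the world is assumed to be deterministic; a stochastic world would need an averaged update instead.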


Source: 《机器学习》 (Machine Learning) presentation slides (15)