

What to learn

We might try to have the agent learn the evaluation function $V^{\pi^*}$ (which we write as $V^*$).

It could then do a lookahead search to choose the best action from any state $s$, because

$$\pi^*(s) = \operatorname*{argmax}_{a} \left[ r(s, a) + \gamma V^*(\delta(s, a)) \right]$$

A problem:

· This works well if the agent knows $\delta : S \times A \to S$ and $r : S \times A \to \mathbb{R}$.
· But when it doesn't, it can't choose actions this way.
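The one-step lookahead above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the slide: the four-state chain world, the reward of 100 for entering the goal, the discount factor 0.9, and the names `delta`, `reward`, `value_iteration`, and `greedy_action`. The slide does not say how $V^*$ is obtained; value iteration is used here only as one convenient way to compute it. Note that both functions call `delta` and `reward` directly, which is exactly the knowledge the slide says the agent may lack.

```python
GAMMA = 0.9                 # discount factor gamma (assumed value)
STATES = [0, 1, 2, 3]       # hypothetical chain world; state 3 is an absorbing goal
ACTIONS = ["left", "right"]
GOAL = 3


def delta(s, a):
    """Deterministic transition function delta: S x A -> S (assumed known)."""
    if s == GOAL:
        return GOAL  # the goal is absorbing
    return max(s - 1, 0) if a == "left" else s + 1


def reward(s, a):
    """Immediate reward r: S x A -> R; 100 for entering the goal, otherwise 0."""
    return 100.0 if s != GOAL and delta(s, a) == GOAL else 0.0


def value_iteration(sweeps=50):
    """Approximate V* by repeated Bellman backups (needs delta and reward)."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        v = {
            s: max(reward(s, a) + GAMMA * v[delta(s, a)] for a in ACTIONS)
            for s in STATES
        }
    return v


def greedy_action(s, v):
    """One-step lookahead: argmax over a of r(s, a) + gamma * V*(delta(s, a))."""
    return max(ACTIONS, key=lambda a: reward(s, a) + GAMMA * v[delta(s, a)])


if __name__ == "__main__":
    v_star = value_iteration()
    for s in STATES:
        print(s, round(v_star[s], 2), greedy_action(s, v_star))
```

If `delta` or `reward` were unavailable, neither the backup in `value_iteration` nor the lookahead in `greedy_action` could be evaluated, which is the problem stated in the two bullets.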


Download: "Machine Learning" presentation slides (15)