Markov Decision Processes

Assume
● finite set of states S
● set of actions A
● at each discrete time, agent observes state st ∈ S and chooses action at ∈ A
● then receives immediate reward rt, and the state changes to st+1
● Markov assumption: st+1 = δ(st, at) and rt = r(st, at)
  - i.e., rt and st+1 depend only on the current state and action
  - functions δ and r may be nondeterministic
  - functions δ and r not necessarily known to agent
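The interaction loop above can be sketched in code. This is a minimal illustration, not from the source: a hypothetical 4-state chain MDP with a nondeterministic (slippery) transition function δ and a reward function r, and an agent that observes st, picks at, then receives rt and st+1. The state space, slip probability, and reward scheme are all invented for the example.

```python
import random

class ChainMDP:
    """Toy MDP: states 0..3 on a line, actions move left or right."""

    def __init__(self):
        self.states = [0, 1, 2, 3]        # finite set of states S
        self.actions = ["left", "right"]  # set of actions A

    def delta(self, s, a):
        """Transition function δ(s, a); nondeterministic:
        with probability 0.1 the agent slips the other way."""
        move = -1 if a == "left" else 1
        if random.random() < 0.1:
            move = -move                  # slip
        return min(max(s + move, 0), 3)   # clamp to valid states

    def reward(self, s, a):
        """Reward function r(s, a): reward 1 for stepping toward state 3."""
        return 1 if s == 2 and a == "right" else 0

# Agent's loop: observe s_t, choose a_t, receive r_t and s_{t+1}.
# The agent only calls delta/reward as a black box; it need not know them.
random.seed(0)
mdp = ChainMDP()
s = 0
trajectory = []
for t in range(5):
    a = random.choice(mdp.actions)        # random policy, just for illustration
    r = mdp.reward(s, a)
    s_next = mdp.delta(s, a)
    trajectory.append((s, a, r, s_next))
    s = s_next
```

Note that rt and st+1 here depend only on (st, at), which is exactly the Markov assumption: no earlier history is consulted.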