Related documents

Massachusetts Institute of Technology: Principles of Autonomy and Decision Making (English edition), Planning to Maximize Reward: Markov Decision Processes

How might a mouse search a maze for cheese?
· State space search?
· As a constraint satisfaction problem?
· Goal-directed planning?
· As a rule or production system?
What is missing?

Ideas in this lecture
· The objective is to accumulate rewards rather than to reach goal states.
· The task is to generate policies for how to act in all situations, rather than a plan for a single starting situation.
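The two ideas above can be illustrated with a small sketch. It is not taken from the lecture: the four-cell corridor, the reward of 1 for reaching the cheese, the discount factor of 0.9, and the names `step`, `value_iteration`, and `greedy_policy` are all assumptions made for illustration. The sketch uses value iteration, a standard technique for Markov decision processes that the excerpt itself does not name, to show how maximizing accumulated reward yields an action for every cell rather than a single path from one start.

```python
# Hypothetical toy maze (an assumption, not from the lecture):
# a corridor of cells 0..3 with the cheese in cell 3.
GAMMA = 0.9                      # assumed discount factor
N_CELLS = 4
CHEESE = 3                       # terminal cell
ACTIONS = {"left": -1, "right": +1}


def step(state, action):
    """Deterministic move; walls keep the mouse in place."""
    nxt = min(max(state + ACTIONS[action], 0), N_CELLS - 1)
    reward = 1.0 if nxt == CHEESE else 0.0   # reward for reaching the cheese
    return nxt, reward


def value_iteration(sweeps=50):
    """Estimate the best achievable discounted reward from each cell."""
    values = [0.0] * N_CELLS
    for _ in range(sweeps):
        new_values = values[:]
        for s in range(N_CELLS):
            if s == CHEESE:
                continue         # terminal cell keeps value 0
            outcomes = [step(s, a) for a in ACTIONS]
            new_values[s] = max(r + GAMMA * values[n] for n, r in outcomes)
        values = new_values
    return values


def greedy_policy(values):
    """Pick, for every non-terminal cell, the action with the best backup."""
    policy = {}
    for s in range(N_CELLS):
        if s == CHEESE:
            continue

        def backup(a, s=s):
            nxt, reward = step(s, a)
            return reward + GAMMA * values[nxt]

        policy[s] = max(ACTIONS, key=backup)
    return policy


if __name__ == "__main__":
    v = value_iteration()
    print(v)
    print(greedy_policy(v))
```

The result is a mapping from every cell to an action, so the mouse knows what to do wherever it finds itself. A plan produced by search would instead be one action sequence tied to one starting cell.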
Resource category: document library. Document format: PDF. Pages: 25. File size: 187.97 KB.