ment learning problems[J]. Applied in_中国高校课件下载中心

点击下载：多智能体分层强化学习综述（国防科技大学：殷昌盛、杨若鹏、朱巍、邹小飞、李峰）

正在加载图片...

·654· 智能系统学报第15卷 ment learning problems[J].Applied intelligence,2019. ligence.San Francisco,USA,2017:1726-1734 49(12):4303-4318. [40]VEZHNEVETS A S.OSINDERO S.SCHAUL T.et al. [26]ZHAO Xingyu,DING Shifei,AN Yuexuan,et al.Applic- Feudal networks for hierarchical reinforcement learning[Cl ations of asynchronous deep reinforcement learning based Proceedings of 34th International Conference on Ma- on dynamic updating weights[J].Applied intelligence, chine Learning.Sydney,Australia,2017:3540-3549. 2019,492:581-591 [41]PONSEN MJ V.SPRONCK P.AHA D W.Automatic- [27]ZHAO Xingyu,DING Shifei,AN Yuexuan,et al.Asyn- ally acquiring domain knowledge for adaptive game Al chronous reinforcement learning algorithms for solving using evolutionary learning[C]//Conference on Innovat- discrete space path planning problems[J].Applied intelli- ive Applications of Artificial Intelligence.Pittsburgh, gence,2018,48(12:4889-4904. Pennsylvania,2005:1535-1540. [28]SUTTON R S.PRECUP D.SINGH S R.Between MDPs [42]WEBER B G,ONTANON S.Using automated replay an- and Semi-MDPs:a framework for temporal abstraction in notation for case-based planning in games[C]//18th Inter- reinforcement learning[J].Artificial intelligence,1999, national Conference on Case-based Reasoning.Aless- 112(1-2):181-211 andria,Italy,2010:15-24. [29]PRECUP D.SUTTON R S.Multi-time models for tem- [43]WEBER B G,MAWHORTER P,MATEAS M,et al.Re- porally abstract planning[C]//Proceedings of the 1997 active planning idioms for multi-scale game Al[C]//Con- Conference on Advances in Neural Information Pro- ference on Computational Intelligence and Games, cessing Systems 10.Cambridge,United States,1998: Maastricht.The Netherlands.2010:115-122 1050-1056. [44]SONG Y.LI Y.LI C.Initialization in reinforcement [30]PRECUP D.Temporal abstraction in reinforcement learn- learning for mobile robots path planning[J].Control the- ing.[D].Amherst:University of Massachusetts,USA, ory&applications,2012,2912):1623-1628. 2000. [45]LIU Chunyang,TAN Yingqing,LIU Changan,MA Ying- [31]TANG Zhentao,ZHAO Dongbin,ZHU Yuanheng.Rein- wei.Application of multi-Agent reinforcement learning in forcement learning for build-order production in Star- robot soccer[J].Acta electronica sinica,2010,38(8): Craft II [C]//8th International Conference on Information 1958-1962. Science and Technology.Istanbul,Turkey.2018. [46]DUAN Yong,CUI Baoxia,XU Xinhe.Multi-agent rein- [32]PARR R.Hierarchical control and learning for markov forcement learning and its application role assignment of decision processes[D].Berkeley:University of California, robot soccer[J].Control theory applications,2009, 1998 26(4):371-376 [33]KULKARNI T D,NARASIMHAN K R,SAEEDI A,et [47]SYNNAEVE G,BESSIERE P.A bayesian model for al.Hierarchical deep reinforcement learning:integrating RTS units control applied to starcraft[J].IEEE transac- temporal abstraction and intrinsic motivation[EB/OL]. tions on computational intelligence and AI in games, [2016-4-201.htps:/∥arxiv..org/abs/1604.06057 2011.3(1):83-86. [34]DIETTERICH T G.Hierarchical reinforcement learning [48]SURDU JR,KITTKA K.Deep green:commander's tool with the MAXQ value function decomposition[J].Journ- for COA's concept[C]//Computing,Communications and al of artificial intelligence research,2000,13:227-303. Control Technologies 2008.Orlando,Florida.USA.2008 [35]MENACHE I,MARMOR S,SHIMKIN N.Q-Cut:dy- [49]ERNEST N,CARROLL D.SCHUMACHER C,et al. namic discovery of sub-goals in reinforcement learn- Genetic fuzzy based artificial intelligence for unmanned ing[J].Lecture notes in computer science 2430.2002: combat aerial vehicle control in simulated air combat mis- 295-306. sions[J].Journal of denfense management,2016,6(1): [36]DRUNNOND C.Accelerating reinforcement learning by 1-7. composing solutions of automatically identified [50]DERESZYNSKI E,HOSTETLER J,FERN A,et al. subtasks[J].Journal of artificial intelligence research, Learning probabilistic behavior models in real-time 2002,16:59-104 strategy games[C]//Proc of the 7th AAAI Conference on [37]HENGST B.Discovering hierarchy in reinforcement Artificial Intelligence and Interactive Digital Entertain- learning[D].Sydney:University of New South Wales, ment,Stanford,USA,2011:20-25. Australia,2003. [51]胡桐清，陈亮.军事智能辅助决策的理论与实践).军 [38]UTHER W T B.Tree based hierarchical reinforcement 事系统工程，1995(C1):3-10. learning[D].Pittsburgh:Carnegie Mellon University, HU Tongqing,CHEN Liang.Theory and practice of mil- USA,2002 itary intelligence assistant decision[J].Military opera- [39]PIERRE B.JEAN H.The option-critic architecture[C]// tions research and systems engineering,1995(C1):3-10. Proceedings of 31th AAAI Conference on Artifical Intel- [52]朱丰，胡晓峰.基于深度学习的战场态势评估综述与研ment learning problems[J]. Applied intelligence, 2019, 49(12): 4303–4318. ZHAO Xingyu, DING Shifei, AN Yuexuan, et al. Applications of asynchronous deep reinforcement learning based on dynamic updating weights[J]. Applied intelligence, 2019, 49(2): 581–591. [26] ZHAO Xingyu, DING Shifei, AN Yuexuan, et al. Asynchronous reinforcement learning algorithms for solving discrete space path planning problems[J]. Applied intelligence, 2018, 48(12): 4889–4904. [27] SUTTON R S, PRECUP D, SINGH S R. Between MDPs and Semi-MDPs: a framework for temporal abstraction in reinforcement learning[J]. Artificial intelligence, 1999, 112(1-2): 181–211. [28] PRECUP D, SUTTON R S. Multi-time models for temporally abstract planning[C]// Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems 10. Cambridge, United States, 1998: 1050−1056. [29] PRECUP D. Temporal abstraction in reinforcement learning. [D]. Amherst: University of Massachusetts, USA, 2000. [30] TANG Zhentao, ZHAO Dongbin, ZHU Yuanheng. Reinforcement learning for build-order production in StarCraft II [C]//8th International Conference on Information Science and Technology. Istanbul, Turkey. 2018. [31] PARR R. Hierarchical control and learning for markov decision processes[D]. Berkeley: University of California, 1998. [32] KULKARNI T D, NARASIMHAN K R, SAEEDI A, et al. Hierarchical deep reinforcement learning: integrating temporal abstraction and intrinsic motivation[EB/OL]. [2016-4-20]. https://arxiv.org/abs/1604.06057. [33] DIETTERICH T G. Hierarchical reinforcement learning with the MAXQ value function decomposition[J]. Journal of artificial intelligence research, 2000, 13: 227–303. [34] MENACHE I, MARMOR S, SHIMKIN N. Q-Cut: dynamic discovery of sub-goals in reinforcement learning[J]. Lecture notes in computer science 2430.2002: 295−306. [35] DRUNNOND C. Accelerating reinforcement learning by composing solutions of automatically identified subtasks[J]. Journal of artificial intelligence research, 2002, 16: 59–104. [36] HENGST B. Discovering hierarchy in reinforcement learning[D]. Sydney: University of New South Wales, Australia, 2003. [37] UTHER W T B. Tree based hierarchical reinforcement learning[D]. Pittsburgh: Carnegie Mellon University, USA, 2002. [38] PIERRE B, JEAN H. The option-critic architecture[C]// Proceedings of 31th AAAI Conference on Artifical Intel- [39] ligence. San Francisco, USA, 2017: 1726−1734. VEZHNEVETS A S, OSINDERO S, SCHAUL T, et al. Feudal networks for hierarchical reinforcement learning[C]// Proceedings of 34th International Conference on Machine Learning. Sydney, Australia, 2017: 3540−3549. [40] PONSEN M J V, SPRONCK P, AHA D W. Automatically acquiring domain knowledge for adaptive game AI using evolutionary learning[C]//Conference on Innovative Applications of Artificial Intelligence. Pittsburgh, Pennsylvania, 2005: 1535−1540. [41] WEBER B G, ONTANON S. Using automated replay annotation for case-based planning in games[C]//18th International Conference on Case-based Reasoning. Alessandria, Italy, 2010: 15−24. [42] WEBER B G, MAWHORTER P, MATEAS M, et al. Reactive planning idioms for multi-scale game AI[C]// Conference on Computational Intelligence and Games, Maastricht, The Netherlands, 2010: 115−122. [43] SONG Y, LI Y, LI C. Initialization in reinforcement learning for mobile robots path planning[J]. Control theory & applications, 2012, 29(12): 1623–1628. [44] LIU Chunyang, TAN Yingqing, LIU Changan, MA Yingwei. Application of multi-Agent reinforcement learning in robot soccer[J]. Acta electronica sinica, 2010, 38(8): 1958–1962. [45] DUAN Yong, CUI Baoxia, XU Xinhe. Multi-agent reinforcement learning and its application role assignment of robot soccer[J]. Control theory & app1ications, 2009, 26(4): 371–376. [46] SYNNAEVE G, BESSIERE P. A bayesian model for RTS units control applied to starcraft[J]. IEEE transactions on computational intelligence and AI in games, 2011, 3(1): 83–86. [47] SURDU J R, KITTKA K. Deep green: commander’s tool for COA’s concept[C]//Computing, Communications and Control Technologies 2008, Orlando, Florida, USA, 2008. [48] ERNEST N, CARROLL D, SCHUMACHER C, et al. Genetic fuzzy based artificial intelligence for unmanned combat aerial vehicle control in simulated air combat missions[J]. Journal of denfense management, 2016, 6(1): 1–7. [49] DERESZYNSKI E, HOSTETLER J, FERN A, et al. Learning probabilistic behavior models in real-time strategy games[C]//Proc of the 7th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Stanford, USA, 2011: 20−25. [50] 胡桐清, 陈亮. 军事智能辅助决策的理论与实践 [J]. 军事系统工程, 1995(C1): 3–10. HU Tongqing, CHEN Liang. Theory and practice of military intelligence assistant decision[J]. Military operations research and systems engineering, 1995(C1): 3–10. [51] [52] 朱丰, 胡晓峰. 基于深度学习的战场态势评估综述与研 ·654· 智能系统学报第 15 卷

<<向上翻页向下翻页>>

点击下载：多智能体分层强化学习综述（国防科技大学：殷昌盛、杨若鹏、朱巍、邹小飞、李峰）