

CAAI Transactions on Intelligent Systems, Vol. 15 No. 3, May 2020
DOI: 10.11992/tis.201803043

System architecture of Texas Hold'em based on experience

GAO Qiang 1, XU Xinhe 2, WANG Hao 3, BAI Guoli 3, CAO Ruimin 3
(1. Key Laboratory of Manufacturing Industrial Integrated Automation, Shenyang University, Shenyang 110044, China; 2. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China; 3. School of Mechanical Engineering and Automation, Northeastern University, Shenyang 110819, China)

Abstract: To improve the playing strength of a Texas Hold'em program by exploiting historical experience, this paper proposes a system architecture for heads-up no-limit Texas Hold'em. For the knowledge base module, a deep learning model based on a convolutional neural network (CNN) is trained on a large number of historical games, and an expert experience database is constructed. In the search module, a staged Texas Hold'em game tree is built, and its expansion is guided by expert knowledge and historical experience. For the core evaluation module, a hash-based hand-ranking lookup table is built so that the system can determine the winner more efficiently. The experimental results show that the proposed architecture achieves a higher playing level.

Keywords: heads-up no-limit Texas Hold'em; computer games; dynamic game with imperfect information; game tree; deep learning; expert database; hash table; game strategy

CLC number: TP301.5  Document code: A  Article ID: 1673-4785(2020)03-0468-07

Citation: GAO Qiang, XU Xinhe, WANG Hao, et al. System architecture of Texas Hold'em based on experience[J]. CAAI Transactions on Intelligent Systems, 2020, 15(3): 468-474.

Received: 2018-03-26.
Funding: Natural Science Foundation of Liaoning Province (20180550146, 20170520386).
Corresponding author: GAO Qiang. E-mail: tommy_06@163.com.

Texas Hold'em is a typical and complex dynamic game with imperfect information [1-2], and in recent years it has been a focus of research in the computer games community. In 2006, the University of Alberta, Canada, hosted the first international computer poker competition [3]; in 2007, the Texas Hold'em program Polaris defeated professional poker players for the first time [4]. In January 2015, the University of Alberta published an article in Science on its latest results for Texas Hold'em [5]: the research group developed a system for two-player limit Texas Hold'em and obtained a theoretical solution to that game. Heads-up no-limit Texas Hold'em, however, has remained unsolved because of its higher complexity (reference [6] proves that problems of this kind are NP-hard).
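The hash-based hand-ranking table mentioned in the abstract can be illustrated with a minimal sketch. This section does not specify the layout of the table used in the paper, so everything below is an assumption made for illustration: the key (a sorted tuple of rank indices plus a flush flag), the strength encoding, and the names `HAND_TABLE`, `evaluate` and `best_five` are all hypothetical. The idea shown is only the general one: score every possible five-card rank pattern once, store the results in a hash table, and reduce showdown evaluation to dictionary lookups.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement

RANKS = "23456789TJQKA"  # rank index 0 = deuce, 12 = ace


def _strength(ranks, flush):
    """Score five rank indices directly; a larger tuple is a stronger hand."""
    counts = Counter(ranks)
    # Order ranks by multiplicity, then by rank (full house -> trips, pair).
    ordered = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    distinct = sorted(counts)
    straight = len(distinct) == 5 and distinct[4] - distinct[0] == 4
    if distinct == [0, 1, 2, 3, 12]:  # wheel A-2-3-4-5: the ace plays low
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8  # straight flush
    elif shape == [4, 1]:
        category = 7  # four of a kind
    elif shape == [3, 2]:
        category = 6  # full house
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3  # three of a kind
    elif shape == [2, 2, 1]:
        category = 2  # two pair
    elif shape == [2, 1, 1, 1]:
        category = 1  # one pair
    else:
        category = 0  # high card
    return (category, *ordered)


def build_table():
    """Precompute the strength of every five-card rank pattern (assumed layout)."""
    table = {}
    for ranks in combinations_with_replacement(range(13), 5):
        if max(Counter(ranks).values()) > 4:
            continue  # five of a kind cannot occur with a single deck
        table[(ranks, False)] = _strength(ranks, False)
        if len(set(ranks)) == 5:  # a flush needs five distinct ranks
            table[(ranks, True)] = _strength(ranks, True)
    return table


HAND_TABLE = build_table()


def evaluate(cards):
    """Look up a five-card hand given as strings such as 'As' or 'Td'."""
    ranks = tuple(sorted(RANKS.index(c[0]) for c in cards))
    flush = len({c[1] for c in cards}) == 1
    return HAND_TABLE[(ranks, flush)]


def best_five(cards):
    """Best five-card strength among seven cards (hole cards plus board)."""
    return max(evaluate(hand) for hand in combinations(cards, 5))
```

At showdown, the two players' `best_five` results can be compared directly as tuples. The table is built once at start-up, so each later evaluation costs only a handful of hash lookups rather than a fresh pattern analysis, which is the kind of saving the evaluation module is aiming for.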

