Narasimhan K, Kulkarni _中国高校课件下载中心

正在加载图片...

410 工程科学学报，第42卷，第4期 [49]Narasimhan K,Kulkarni T D,Barzilay R.Language understanding Conference on Artificial Intelligence.New Orleans,2018:5602 for text-based games using deep reinforcement learning / [64]Li J W,Monroe W,Ritter A,et al.Deep reinforcement learning for Proceedings of the Conference on Empirical Methods in Natural dialogue generation[J/OL].arXiv Preprint(2016-09-29)[2019-06- Language Processing.Lisbon,2015:1001 16].https://arxiv.org/abs/1606.01541 [50]Williams R J,Zipser D.A learning algorithm for continually [65]Takanobu R,Huang M,Zhao ZZ,et al.A weakly supervised running fully recurrent neural networks.Neural Comput,1989. method for topic segmentation and labeling in goal-oriented 1(2)上：270 dialogues via reinforcement learning /Proceedings of the Twenty- [51]Hochreiter S,Schmidhuber J.Long short-term memory.Neural Seventh International Joint Conference on Artificial Intelligence. Comput,1997,9(8):1735 Stockholm,2018:4403 [52]He J,Chen J,He X,et al.Deep reinforcement learning with a [66]Bahdanau D,Brakel P,Xu K,et al.An actor-critic algorithm for natural language action space /Proceedings of 54th Anmual sequence prediction[J/OL].arXiv Preprint (2017-03-03)[2019-06- Meeting of the Association for Computational Linguistics.Berlin 16].https://arxiv.org/abs/1607.07086 2016:1621 [67]Su P H,Budzianowski P,Ultes S,et al.Sample-efficient actor- [53]Guo H.Generating text with deep reinforcement leaming[J/OL]. critic reinforcement learning with supervised data for dialogue arXiv Preprint (2015-10-30)[2019-06-16].https://arxiv.org/abs/ management[J/OL].arXiv Preprint (2017-07-05)[2019-06-16]. 1510.09202 https://arxiv.org/abs/1707.00130 [54]Papineni K,Roukos S,Ward T,et al.BLEU:a method for [68]Wang Z Y,Bapst V,Heess N,et al.Sample efficient actor-critic automatic evaluation of machine translation /Proceedings of 40th with experience replay[J/OL].arXiv Preprint (2017-07-10)[2019- Annual Meeting of Association for Computational Linguistics. 06-16].https://arxiv.org/abs/1611.01224 Philadelphia,2002:311 [69]Peters J,Schaal S.Natural actor-critic.Neurocomputing,2008, [55]Sutton R S,McAllester D A,Singh S P,et al.Policy gradient 71(7-9):1180 methods for reinforcement learning with function approximation// [70]Chen L,Su P H,Gasic M.Hyper-parameter optimisation of Advances in Neural Information Processing Systems.Denver, gaussian process reinforcement learning for statistical dialogue 2000:1057 management /Proceedings of the 16th Anmual Meeting of the [56]Ranzato M A.Chopra S,Auli M,et al.Sequence level training Special Interest Group on Discourse and Dialogue.Prague,2015: with recurrent neural networks[J/OL]ar Preprin(2016-05-06) 407 [2019-06-16.htps//arxiv.org/abs/1511.06732 [71]Goodfellow I,Pouget-Abadie J,Mirza M,et al.Generative [57]LiJW,Monroe W.ShiTL,et al.Adversarial learning for neural adversarial nets /l Advances in Neural Information Processing dialogue generation[J/OL].arXiv Preprin(2017-09-24)[2019-06- Systems.Montreal,2014:1 16].https:/arxiv.org/abs/1701.06547 [72]Yu L T,Zhang W N,Wang J,et al.SeqGAN:Sequence generative [58]LinC Y.Rouge:A package for automatic evaluation of summaries adversarial nets with policy gradient /Proceedings of Thirty-First ll Proceedings of Workshop on Text Summarization Branches Out. AAAI Conference on Artificial Intelligence.Palo Alto,2017:2852 Post Conference Workshop of ACL 2004.Barcelona,2004:8 [73]Pfau D,Vinyals O.Connecting generative adversarial networks [59]Rennie S J,Marcheret E,Mroueh Y,et al.Self-critical sequence and actor-critic methods[J/OL].arXiv Preprint (2017-01-18) training for image captioning[J/OL].arXiv Preprint (2017-11-16) [2019-06-16.https://arxiv.org/abs/1610.01945 [2019-06-16].https:/arxiv.org/abs/1612.00563 [74]Serban I V,Sankar C,Germain M,et al.A deep reinforcement [60]Vedantam R,Lawrence Z C,Parikh D.CIDEr:Consensus-based learning chatbot[J/OL].arXiv Preprint (2017-11-05)[2019-06-16]. image description evaluation /Proceedings of the IEEE https://arxiv.org/abs/1709.02349 Conference on Computer Vision and Pattern Recognition.Boston, [75]He D,Lu H Q,Xia Y C,et al.Decoding with value networks for 2015:4566 neural machine translation //Advances in Neural Information [61]Banerjee S,Lavie A.METEOR:An automatic metric for MT Processing Systems.Long Beach,2017:177 evaluation with improved correlation with human judgments / [76]Mnih V,Badia A P,Mirza M,et al.Asynchronous methods for Proceedings of the ACL Workshop on Intrinsic and Extrinsic deep reinforcement leaming Il Proceedings of 33rd Interational Evaluation Measures for Machine Translation and/or Conference on Machine Learning.New York,2016:1928 Summarization.Ann Arbor,2005:65 [77]Casanueva I,Budzianowski P,Su P H,et al.Feudal reinforcement [62]Wang L,Yao J L,Tao Y Z,et al.A reinforced topic-aware learning for dialogue management in large domains /Proceedings convolutional sequence-to-sequence model for abstractive text of the 2018 Conference of the North American Chapter of the summarization I Proceedings of the 27th International Joint Association for Computational Linguistics:Human Language Conference on Artificial Intelligence.Stockholm,2018:4453 Technologies.New Orleans,Louisiana,2018:714 [63]Wu Y X,Hu B T.Learning to extract coherent summary via deep [78]Dayan P,Hinton G E.Feudal reinforcement learning //Advances reinforcement learning /Proceedings of Thirty-Second AAAl in Neural Information Processing Systems.Denver,1993:271Narasimhan K, Kulkarni T D, Barzilay R. Language understanding for text-based games using deep reinforcement learning // Proceedings of the Conference on Empirical Methods in Natural Language Processing. Lisbon, 2015: 1001 [49] Williams R J, Zipser D. A learning algorithm for continually running fully recurrent neural networks. Neural Comput, 1989, 1（2）: 270 [50] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput, 1997, 9（8）: 1735 [51] He J, Chen J, He X, et al. Deep reinforcement learning with a natural language action space // Proceedings of 54th Annual Meeting of the Association for Computational Linguistics. Berlin, 2016: 1621 [52] Guo H. Generating text with deep reinforcement learning[J/OL]. arXiv Preprint (2015-10-30) [2019-06-16]. https://arxiv.org/abs/ 1510.09202 [53] Papineni K, Roukos S, Ward T, et al. BLEU: a method for automatic evaluation of machine translation // Proceedings of 40th Annual Meeting of Association for Computational Linguistics. Philadelphia, 2002: 311 [54] Sutton R S, McAllester D A, Singh S P, et al. Policy gradient methods for reinforcement learning with function approximation // Advances in Neural Information Processing Systems. Denver, 2000: 1057 [55] Ranzato M A, Chopra S, Auli M, et al. Sequence level training with recurrent neural networks[J/OL]. arXiv Preprint (2016-05-06) [2019-06-16]. https://arxiv.org/abs/1511.06732 [56] Li J W, Monroe W, Shi T L, et al. Adversarial learning for neural dialogue generation[J/OL]. arXiv Preprint (2017-09-24) [2019-06- 16]. https://arxiv.org/abs/1701.06547 [57] Lin C Y. Rouge: A package for automatic evaluation of summaries // Proceedings of Workshop on Text Summarization Branches Out, Post Conference Workshop of ACL 2004. Barcelona, 2004: 8 [58] Rennie S J, Marcheret E, Mroueh Y, et al. Self-critical sequence training for image captioning[J/OL]. arXiv Preprint (2017-11-16) [2019-06-16]. https://arxiv.org/abs/1612.00563 [59] Vedantam R, Lawrence Z C, Parikh D. CIDEr: Consensus-based image description evaluation // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, 2015: 4566 [60] Banerjee S, Lavie A. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments // Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. Ann Arbor, 2005: 65 [61] Wang L, Yao J L, Tao Y Z, et al. A reinforced topic-aware convolutional sequence-to-sequence model for abstractive text summarization // Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, 2018: 4453 [62] Wu Y X, Hu B T. Learning to extract coherent summary via deep reinforcement learning // Proceedings of Thirty-Second AAAI [63] Conference on Artificial Intelligence. New Orleans, 2018: 5602 Li J W, Monroe W, Ritter A, et al. Deep reinforcement learning for dialogue generation[J/OL]. arXiv Preprint (2016-09-29) [2019-06- 16]. https://arxiv.org/abs/1606.01541 [64] Takanobu R, Huang M, Zhao Z Z, et al. A weakly supervised method for topic segmentation and labeling in goal-oriented dialogues via reinforcement learning // Proceedings of the TwentySeventh International Joint Conference on Artificial Intelligence. Stockholm, 2018: 4403 [65] Bahdanau D, Brakel P, Xu K, et al. An actor-critic algorithm for sequence prediction[J/OL]. arXiv Preprint (2017-03-03) [2019-06- 16]. https://arxiv.org/abs/1607.07086 [66] Su P H, Budzianowski P, Ultes S, et al. Sample-efficient actorcritic reinforcement learning with supervised data for dialogue management[J/OL]. arXiv Preprint (2017-07-05) [2019-06-16]. https://arxiv.org/abs/1707.00130 [67] Wang Z Y, Bapst V, Heess N, et al. Sample efficient actor-critic with experience replay[J/OL]. arXiv Preprint (2017-07-10) [2019- 06-16]. https://arxiv.org/abs/1611.01224 [68] Peters J, Schaal S. Natural actor-critic. Neurocomputing, 2008, 71（7-9）: 1180 [69] Chen L, Su P H, Gasic M. Hyper-parameter optimisation of gaussian process reinforcement learning for statistical dialogue management // Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Prague, 2015: 407 [70] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets // Advances in Neural Information Processing Systems. Montréal, 2014: 1 [71] Yu L T, Zhang W N, Wang J, et al. SeqGAN: Sequence generative adversarial nets with policy gradient // Proceedings of Thirty-First AAAI Conference on Artificial Intelligence. Palo Alto, 2017: 2852 [72] Pfau D, Vinyals O. Connecting generative adversarial networks and actor-critic methods[J/OL]. arXiv Preprint (2017-01-18) [2019-06-16]. https://arxiv.org/abs/1610.01945 [73] Serban I V, Sankar C, Germain M, et al. A deep reinforcement learning chatbot[J/OL]. arXiv Preprint (2017-11-05) [2019-06-16]. https://arxiv.org/abs/1709.02349 [74] He D, Lu H Q, Xia Y C, et al. Decoding with value networks for neural machine translation //Advances in Neural Information Processing Systems. Long Beach, 2017: 177 [75] Mnih V, Badia A P, Mirza M, et al. Asynchronous methods for deep reinforcement learning // Proceedings of 33rd International Conference on Machine Learning. New York, 2016: 1928 [76] Casanueva I, Budzianowski P, Su P H, et al. Feudal reinforcement learning for dialogue management in large domains // Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. New Orleans, Louisiana, 2018: 714 [77] Dayan P, Hinton G E. Feudal reinforcement learning // Advances in Neural Information Processing Systems. Denver, 1993: 271 [78] · 410 · 工程科学学报，第 42 卷，第 4 期

<<向上翻页向下翻页>>

点击下载：《工程科学学报》：文本生成领域的深度强化学习研究进展（北京科技大学）