正在加载图片...
410 工程科学学报,第42卷,第4期 [49]Narasimhan K,Kulkarni T D,Barzilay R.Language understanding Conference on Artificial Intelligence.New Orleans,2018:5602 for text-based games using deep reinforcement learning / [64]Li J W,Monroe W,Ritter A,et al.Deep reinforcement learning for Proceedings of the Conference on Empirical Methods in Natural dialogue generation[J/OL].arXiv Preprint(2016-09-29)[2019-06- Language Processing.Lisbon,2015:1001 16].https://arxiv.org/abs/1606.01541 [50]Williams R J,Zipser D.A learning algorithm for continually [65]Takanobu R,Huang M,Zhao ZZ,et al.A weakly supervised running fully recurrent neural networks.Neural Comput,1989. method for topic segmentation and labeling in goal-oriented 1(2)上:270 dialogues via reinforcement learning /Proceedings of the Twenty- [51]Hochreiter S,Schmidhuber J.Long short-term memory.Neural Seventh International Joint Conference on Artificial Intelligence. Comput,1997,9(8):1735 Stockholm,2018:4403 [52]He J,Chen J,He X,et al.Deep reinforcement learning with a [66]Bahdanau D,Brakel P,Xu K,et al.An actor-critic algorithm for natural language action space /Proceedings of 54th Anmual sequence prediction[J/OL].arXiv Preprint (2017-03-03)[2019-06- Meeting of the Association for Computational Linguistics.Berlin 16].https://arxiv.org/abs/1607.07086 2016:1621 [67]Su P H,Budzianowski P,Ultes S,et al.Sample-efficient actor- [53]Guo H.Generating text with deep reinforcement leaming[J/OL]. critic reinforcement learning with supervised data for dialogue arXiv Preprint (2015-10-30)[2019-06-16].https://arxiv.org/abs/ management[J/OL].arXiv Preprint (2017-07-05)[2019-06-16]. 1510.09202 https://arxiv.org/abs/1707.00130 [54]Papineni K,Roukos S,Ward T,et al.BLEU:a method for [68]Wang Z Y,Bapst V,Heess N,et al.Sample efficient actor-critic automatic evaluation of machine translation /Proceedings of 40th with experience replay[J/OL].arXiv Preprint (2017-07-10)[2019- Annual Meeting of Association for Computational Linguistics. 06-16].https://arxiv.org/abs/1611.01224 Philadelphia,2002:311 [69]Peters J,Schaal S.Natural actor-critic.Neurocomputing,2008, [55]Sutton R S,McAllester D A,Singh S P,et al.Policy gradient 71(7-9):1180 methods for reinforcement learning with function approximation// [70]Chen L,Su P H,Gasic M.Hyper-parameter optimisation of Advances in Neural Information Processing Systems.Denver, gaussian process reinforcement learning for statistical dialogue 2000:1057 management /Proceedings of the 16th Anmual Meeting of the [56]Ranzato M A.Chopra S,Auli M,et al.Sequence level training Special Interest Group on Discourse and Dialogue.Prague,2015: with recurrent neural networks[J/OL]ar Preprin(2016-05-06) 407 [2019-06-16.htps//arxiv.org/abs/1511.06732 [71]Goodfellow I,Pouget-Abadie J,Mirza M,et al.Generative [57]LiJW,Monroe W.ShiTL,et al.Adversarial learning for neural adversarial nets /l Advances in Neural Information Processing dialogue generation[J/OL].arXiv Preprin(2017-09-24)[2019-06- Systems.Montreal,2014:1 16].https:/arxiv.org/abs/1701.06547 [72]Yu L T,Zhang W N,Wang J,et al.SeqGAN:Sequence generative [58]LinC Y.Rouge:A package for automatic evaluation of summaries adversarial nets with policy gradient /Proceedings of Thirty-First ll Proceedings of Workshop on Text Summarization Branches Out. AAAI Conference on Artificial Intelligence.Palo Alto,2017:2852 Post Conference Workshop of ACL 2004.Barcelona,2004:8 [73]Pfau D,Vinyals O.Connecting generative adversarial networks [59]Rennie S J,Marcheret E,Mroueh Y,et al.Self-critical sequence and actor-critic methods[J/OL].arXiv Preprint (2017-01-18) training for image captioning[J/OL].arXiv Preprint (2017-11-16) [2019-06-16.https://arxiv.org/abs/1610.01945 [2019-06-16].https:/arxiv.org/abs/1612.00563 [74]Serban I V,Sankar C,Germain M,et al.A deep reinforcement [60]Vedantam R,Lawrence Z C,Parikh D.CIDEr:Consensus-based learning chatbot[J/OL].arXiv Preprint (2017-11-05)[2019-06-16]. image description evaluation /Proceedings of the IEEE https://arxiv.org/abs/1709.02349 Conference on Computer Vision and Pattern Recognition.Boston, [75]He D,Lu H Q,Xia Y C,et al.Decoding with value networks for 2015:4566 neural machine translation //Advances in Neural Information [61]Banerjee S,Lavie A.METEOR:An automatic metric for MT Processing Systems.Long Beach,2017:177 evaluation with improved correlation with human judgments / [76]Mnih V,Badia A P,Mirza M,et al.Asynchronous methods for Proceedings of the ACL Workshop on Intrinsic and Extrinsic deep reinforcement leaming Il Proceedings of 33rd Interational Evaluation Measures for Machine Translation and/or Conference on Machine Learning.New York,2016:1928 Summarization.Ann Arbor,2005:65 [77]Casanueva I,Budzianowski P,Su P H,et al.Feudal reinforcement [62]Wang L,Yao J L,Tao Y Z,et al.A reinforced topic-aware learning for dialogue management in large domains /Proceedings convolutional sequence-to-sequence model for abstractive text of the 2018 Conference of the North American Chapter of the summarization I Proceedings of the 27th International Joint Association for Computational Linguistics:Human Language Conference on Artificial Intelligence.Stockholm,2018:4453 Technologies.New Orleans,Louisiana,2018:714 [63]Wu Y X,Hu B T.Learning to extract coherent summary via deep [78]Dayan P,Hinton G E.Feudal reinforcement learning //Advances reinforcement learning /Proceedings of Thirty-Second AAAl in Neural Information Processing Systems.Denver,1993:271Narasimhan K, Kulkarni T D, Barzilay R. Language understanding for  text-based  games  using  deep  reinforcement  learning  // Proceedings of the Conference on Empirical Methods in Natural Language Processing. Lisbon, 2015: 1001 [49] Williams  R  J,  Zipser  D.  A  learning  algorithm  for  continually running  fully  recurrent  neural  networks. Neural Comput,  1989, 1(2): 270 [50] Hochreiter  S,  Schmidhuber  J.  Long  short-term  memory. Neural Comput, 1997, 9(8): 1735 [51] He  J,  Chen  J,  He  X,  et  al.  Deep  reinforcement  learning  with  a natural  language  action  space  // Proceedings of 54th Annual Meeting of the Association for Computational Linguistics. Berlin, 2016: 1621 [52] Guo  H.  Generating  text  with  deep  reinforcement  learning[J/OL]. arXiv Preprint (2015-10-30)  [2019-06-16]. https://arxiv.org/abs/ 1510.09202 [53] Papineni  K,  Roukos  S,  Ward  T,  et  al.  BLEU:  a  method  for automatic evaluation of machine translation // Proceedings of 40th Annual Meeting of Association for Computational Linguistics. Philadelphia, 2002: 311 [54] Sutton  R  S,  McAllester  D  A,  Singh  S  P,  et  al.  Policy  gradient methods for reinforcement learning with function approximation // Advances in Neural Information Processing Systems.  Denver, 2000: 1057 [55] Ranzato  M  A,  Chopra  S,  Auli  M,  et  al.  Sequence  level  training with recurrent neural networks[J/OL]. arXiv Preprint (2016-05-06) [2019-06-16]. https://arxiv.org/abs/1511.06732 [56] Li J W, Monroe W, Shi T L, et al. Adversarial learning for neural dialogue generation[J/OL]. arXiv Preprint (2017-09-24) [2019-06- 16]. https://arxiv.org/abs/1701.06547 [57] Lin C Y. Rouge: A package for automatic evaluation of summaries // Proceedings of Workshop on Text Summarization Branches Out, Post Conference Workshop of ACL 2004. Barcelona, 2004: 8 [58] Rennie  S  J,  Marcheret  E,  Mroueh  Y,  et  al.  Self-critical  sequence training for image captioning[J/OL]. arXiv Preprint (2017-11-16) [2019-06-16]. https://arxiv.org/abs/1612.00563 [59] Vedantam  R,  Lawrence  Z  C,  Parikh  D.  CIDEr:  Consensus-based image  description  evaluation  // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, 2015: 4566 [60] Banerjee  S,  Lavie  A.  METEOR:  An  automatic  metric  for  MT evaluation  with  improved  correlation  with  human  judgments  // Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. Ann Arbor, 2005: 65 [61] Wang  L,  Yao  J  L,  Tao  Y  Z,  et  al.  A  reinforced  topic-aware convolutional  sequence-to-sequence  model  for  abstractive  text summarization  // Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, 2018: 4453 [62] Wu Y X, Hu B T. Learning to extract coherent summary via deep reinforcement  learning  // Proceedings of Thirty-Second AAAI [63] Conference on Artificial Intelligence. New Orleans, 2018: 5602 Li J W, Monroe W, Ritter A, et al. Deep reinforcement learning for dialogue generation[J/OL]. arXiv Preprint (2016-09-29) [2019-06- 16]. https://arxiv.org/abs/1606.01541 [64] Takanobu  R,  Huang  M,  Zhao  Z  Z,  et  al.  A  weakly  supervised method  for  topic  segmentation  and  labeling  in  goal-oriented dialogues via reinforcement learning // Proceedings of the Twenty￾Seventh International Joint Conference on Artificial Intelligence. Stockholm, 2018: 4403 [65] Bahdanau D, Brakel P, Xu K, et al. An actor-critic algorithm for sequence prediction[J/OL]. arXiv Preprint (2017-03-03) [2019-06- 16]. https://arxiv.org/abs/1607.07086 [66] Su  P  H,  Budzianowski  P,  Ultes  S,  et  al.  Sample-efficient  actor￾critic  reinforcement  learning  with  supervised  data  for  dialogue management[J/OL]. arXiv Preprint (2017-07-05)  [2019-06-16]. https://arxiv.org/abs/1707.00130 [67] Wang Z Y, Bapst V, Heess N, et al. Sample efficient actor-critic with experience replay[J/OL]. arXiv Preprint (2017-07-10) [2019- 06-16]. https://arxiv.org/abs/1611.01224 [68] Peters  J,  Schaal  S.  Natural  actor-critic. Neurocomputing,  2008, 71(7-9): 1180 [69] Chen  L,  Su  P  H,  Gasic  M.  Hyper-parameter  optimisation  of gaussian  process  reinforcement  learning  for  statistical  dialogue management  // Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Prague, 2015: 407 [70] Goodfellow  I,  Pouget-Abadie  J,  Mirza  M,  et  al.  Generative adversarial  nets  // Advances in Neural Information Processing Systems. Montréal, 2014: 1 [71] Yu L T, Zhang W N, Wang J, et al. SeqGAN: Sequence generative adversarial nets with policy gradient // Proceedings of Thirty-First AAAI Conference on Artificial Intelligence. Palo Alto, 2017: 2852 [72] Pfau  D,  Vinyals  O.  Connecting  generative  adversarial  networks and  actor-critic  methods[J/OL]. arXiv Preprint (2017-01-18) [2019-06-16]. https://arxiv.org/abs/1610.01945 [73] Serban  I  V,  Sankar  C,  Germain  M,  et  al.  A  deep  reinforcement learning chatbot[J/OL]. arXiv Preprint (2017-11-05) [2019-06-16]. https://arxiv.org/abs/1709.02349 [74] He D, Lu H Q, Xia Y C, et al. Decoding with value networks for neural  machine  translation  //Advances in Neural Information Processing Systems. Long Beach, 2017: 177 [75] Mnih  V,  Badia  A  P,  Mirza  M,  et  al.  Asynchronous  methods  for deep  reinforcement  learning  // Proceedings of 33rd International Conference on Machine Learning. New York, 2016: 1928 [76] Casanueva I, Budzianowski P, Su P H, et al. Feudal reinforcement learning for dialogue management in large domains // Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. New Orleans, Louisiana, 2018: 714 [77] Dayan P, Hinton G E. Feudal reinforcement learning // Advances in Neural Information Processing Systems. Denver, 1993: 271 [78] · 410 · 工程科学学报,第 42 卷,第 4 期
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有