CAAI Transactions on Intelligent Systems, Vol. 8 No. 4, Aug. 2013
DOI: 10.3969/j.issn.1673-4785.201301029
Online publication address: http://www.cnki.net/kcms/detail/23.1538.TP.20130603.1602.009.html

Research on selection of paradigm words in the judgment of emotional tendency

CHENG Chuanpeng, WANG Hailong
(School of Computer Science, Zhongyuan Institute of Technology, Zhengzhou 450007, China)

Abstract: To address the weaknesses of paradigm word (standard word) selection in previous research, this paper proposes a method for selecting paradigm words for the judgment of emotional tendency. Following the definition of paradigm words, the method considers three aspects, namely degree of emotion, tendency of emotion, and ambiguity of emotion, and selects as paradigm words the most representative sentiment words, as few in number as possible. First, initial candidate paradigm words are screened from the sentiment words released in HowNet, and the degree of emotion of each candidate is computed. Next, the emotional tendency degree is computed separately for the positive and the negative sentiment words that rank highest in degree of emotion. Finally, the words with the larger emotional tendency degree are chosen as the final paradigm words. The experimental results show that judging emotional tendency with the paradigm words selected in this paper achieves relatively high accuracy.

Keywords: standard words; degree of emotion; tendency of emotion; ambiguity of emotion

CLC number: TP391  Document code: A  Article ID: 1673-4785(2013)04-0349-07

Citation format: CHENG Chuanpeng, WANG Hailong. Research on selection of paradigm words in the judgment of emotional tendency[J]. CAAI Transactions on Intelligent Systems, 2013, 8(4): 349-355.

Received: 2013-01-16. Published online: 2013-06-03.
Funding: Yunnan Province Science and Technology Plan Project (2011FZ074).
Corresponding author: CHENG Chuanpeng. E-mail: cheng8444@sina.com.

A paradigm word set is a collection of representative words whose commendatory or derogatory orientation is very clear and strong [1]. In the judgment of emotional tendency, paradigm words are the reference against which the emotional tendency of other words is measured, so their selection is critical to judging emotional tendency correctly. In [2], Turney studied the emotional tendency of words by analyzing their context, using the PMI-IR method with 2 words as seeds to judge the semantic orientation of other phrases. The single pair of seeds was later extended to multiple pairs, with 7 positive and 7 negative words chosen as sentiment paradigm words. Reference [3] first selected 2,146 commendatory words and 3,299 derogatory words from HowNet and took the common words among them that have no commendatory/derogatory ambiguity as a test set. Each word in the test set was then searched in Google, the words were sorted in descending order of the returned hit counts (that is, their word frequency on the Web), and the 280 most frequent words were taken as paradigm words. Reference [4] replaced the semantically redundant words among the paradigm words of [3] with new commendatory or derogatory words that have high hit counts, finally obtaining 40 new pairs of commendatory and derogatory paradigm words. These 40 pairs keep a high usage frequency while excluding words with the same meaning, which improves vocabulary coverage. Reference [5], through word clustering, applies Chinese word clustering
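The two ideas surveyed above, scoring a word against positive and negative seed words with PMI as in [2], and ranking candidates by hit count as in [3], can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `so_pmi` and `top_by_hits`, the smoothing constant, and the toy hit counts are hypothetical and do not come from the cited papers or from any search engine.

```python
import math


def so_pmi(word, pos_seeds, neg_seeds, hits, cohits, smoothing=0.01):
    """Sum of PMI(word, positive seed) minus sum of PMI(word, negative seed).

    hits[w] is the hit count of w alone; cohits[(w, s)] is the count of
    w and s occurring together. The corpus-size constant inside PMI is
    dropped: it cancels when both seed lists have the same length.
    """
    score = 0.0
    for seeds, sign in ((pos_seeds, 1.0), (neg_seeds, -1.0)):
        for seed in seeds:
            # Smoothing (an assumed value) avoids log(0) for unseen pairs.
            joint = cohits.get((word, seed), 0) + smoothing
            score += sign * math.log2(joint / (hits[word] * hits[seed]))
    return score


def top_by_hits(hits, candidates, n):
    """Return the n candidates with the largest hit counts, in descending order."""
    return sorted(candidates, key=lambda w: hits[w], reverse=True)[:n]


# Toy, made-up counts used only to exercise the functions.
hits = {"good": 1000, "bad": 800, "delightful": 50, "dreadful": 40}
cohits = {
    ("delightful", "good"): 30,
    ("delightful", "bad"): 2,
    ("dreadful", "good"): 1,
    ("dreadful", "bad"): 20,
}

score_pos = so_pmi("delightful", ["good"], ["bad"], hits, cohits)
score_neg = so_pmi("dreadful", ["good"], ["bad"], hits, cohits)
ranked = top_by_hits(hits, ["delightful", "dreadful", "good"], 2)
```

With these toy counts, a word that co-occurs mostly with the positive seed gets a positive score and one that co-occurs mostly with the negative seed gets a negative score, which is the sense in which the seeds act as a reference for other words.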