·806· 智能系统学报 (CAAI Transactions on Intelligent Systems), Vol. 13

5 Conclusion

Building on the Kazakh corpus constructed in our earlier series of studies and on an optimized, improved version of our stemming program, this paper implemented the preprocessing of Kazakh text. For the classification task, three classification algorithms from pattern recognition were applied, and their classification accuracy was compared fairly comprehensively. A comparison of the objective figures from the simulation experiments demonstrates the superiority of the proposed algorithm. Its recall and discriminative power are relatively stable across the document terms of all categories. While inheriting the classification strengths of the SVM algorithm, it also avoids two drawbacks of the KNN algorithm: the trouble of setting the parameter k, and the large time complexity incurred by computing distances to every training sample. This in turn safeguards the convergence speed of the classification algorithm.

Many aspects of this study still need to be optimized and improved. In future work we will systematically investigate and resolve the stage-specific issues that affect text classification accuracy, with the aim of reaching satisfactory classification accuracy.
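The cost argument made in the conclusion (KNN needs a hand-chosen k and one distance computation per training sample for every query, whereas an SVM-style decision function does not) can be illustrated with a minimal sketch. This is not the paper's algorithm, which this section does not specify: the toy 2-D data, the function names `knn_predict` and `linear_predict`, and the hand-picked hyperplane are all assumptions made for illustration, and a linear decision function stands in for a trained SVM.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Brute-force KNN: one distance per training sample, then a majority vote."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def linear_predict(weights, bias, query):
    """Linear decision function: one dot product, independent of training-set size."""
    score = sum(w * q for w, q in zip(weights, query)) + bias
    return 1 if score >= 0 else -1


# Toy, made-up 2-D data (assumption; not taken from the paper's Kazakh corpus).
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.0, 4.0), (4.0, 3.0)]
train_y = [-1, -1, -1, 1, 1, 1]
query = (2.5, 2.5)

# KNN: the caller must choose k, and every query touches all training samples.
knn_label = knn_predict(train_x, train_y, query, k=3)

# Hypothetical hyperplane x + y - 3.5 = 0, hand-picked to separate the toy data;
# a real SVM would learn the weights and bias during training.
linear_label = linear_predict((1.0, 1.0), -3.5, query)
```

With n training samples of dimension d, the brute-force KNN query above costs on the order of n·d operations plus a sort, while the linear decision function costs on the order of d. For kernel SVMs the prediction cost grows with the number of support vectors rather than with the full training set, so the comparison here should be read as indicative only.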