·316· 智能系统学报 (CAAI Transactions on Intelligent Systems), Vol. 11

Table 4 Comparison of 8 algorithms on artificial data sets

Data set       Metric     LSSMTC  Co-Clustering  FPCM    TSC     T-GIFP-FCM  SS-FPCM  TSS-FPCM  ITSS-FPCM
sciSet1        F-measure  0.6834  0.6648         0.6331  0.7688  0.6956      0.6984   0.7187    0.7336
               RI         0.5585  0.5550         0.5241  0.6450  0.5770      0.5750   0.5958    0.6095
               AC         0.8165  0.6675         0.7500  0.7700  0.6975      0.6950   0.7200    0.7350
               NMI        0.1341  0.1021         0.1189  0.2923  0.1483      0.1098   0.1342    0.1564
rec vs talk    F-measure  0.6867  0.6394         0.6980  0.8827  0.8907      0.8311   0.8469    0.9158
               RI         0.5803  0.5395         0.5769  0.7921  0.8037      0.7204   0.7409    0.8440
               AC         0.7053  0.6425         0.6975  0.8825  0.8900      0.8325   0.8475    0.9150
               NMI        0.1769  0.0871         0.0993  0.4637  0.4873      0.3492   0.3750    0.5748
TDT2           F-measure  0.6427  0.6139         0.4787  0.8554  0.8897      0.8214   0.8253    0.8858
               RI         0.7828  0.7473         0.6825  0.9070  0.9299      0.8845   0.8884    0.9300
               AC         0.6983  0.7133         0.6083  0.8633  0.8967      0.8333   0.8350    0.8883
               NMI        0.5426  0.5750         0.3980  0.7535  0.8093      0.7199   0.7217    0.8298
Reuters-21578  F-measure  0.7101  0.6840         0.6361  0.8247  0.8533      0.8121   0.8178    0.8608
               RI         0.8125  0.7153         0.6620  0.8419  0.8658      0.8323   0.8376    0.8709
               AC         0.8200  0.7275         0.7191  0.8300  0.8550      0.8150   0.8200    0.8650
               NMI        0.5662  0.5052         0.4485  0.6590  0.6430      0.6162   0.6242    0.7076

4 Conclusion

This paper applies the idea of semi-supervised learning to the FPCM algorithm and proposes the semi-supervised SS-FPCM algorithm. On the transfer learning side, the algorithm is modified to avoid negative transfer, which yields the TSS-FPCM algorithm. Finally, "representative points" are used in place of the original data, giving the improved semi-supervised transfer clustering algorithm ITSS-FPCM. Experiments on several data sets show that ITSS-FPCM outperforms both SS-FPCM and TSS-FPCM. When the data are insufficient or contaminated, ITSS-FPCM can improve clustering performance. The algorithm performs only moderately when the source data and the target data are weakly related. Future work will extract other related information to improve clustering performance and will also consider the problem of parameter optimization.
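As a reading aid for the RI and NMI rows of Table 4, the following is a minimal sketch of how these two metrics are commonly computed from a ground-truth labeling and a clustering result. It is not taken from the paper: the function names `rand_index` and `nmi` are illustrative, and the NMI normalization by the geometric mean of the two entropies is an assumption, since this section does not state which normalization the authors used.

```python
from collections import Counter
from itertools import combinations
from math import log, sqrt


def rand_index(truth, pred):
    """Rand index: fraction of sample pairs on which the two labelings agree.

    A pair agrees if both labelings put it in the same group,
    or both put it in different groups.
    """
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum(
        (truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs
    )
    return agree / len(pairs)


def nmi(truth, pred):
    """Normalized mutual information, I(U;V) / sqrt(H(U) * H(V)).

    Assumption: geometric-mean normalization. Returns 0.0 when either
    labeling has a single group (zero entropy), which is a convention
    chosen here for simplicity.
    """
    n = len(truth)
    joint = Counter(zip(truth, pred))   # co-occurrence counts
    count_t = Counter(truth)            # class sizes
    count_p = Counter(pred)             # cluster sizes
    mi = sum(
        c / n * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    h_t = -sum(c / n * log(c / n) for c in count_t.values())
    h_p = -sum(c / n * log(c / n) for c in count_p.values())
    if h_t == 0 or h_p == 0:
        return 0.0
    return mi / sqrt(h_t * h_p)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(rand_index(truth, [1, 1, 0, 0]), nmi(truth, [1, 1, 0, 0]))
    print(rand_index(truth, [0, 1, 0, 1]), nmi(truth, [0, 1, 0, 1]))
```

Both metrics depend only on how samples are grouped, not on the cluster label values, so a clustering that matches the ground truth up to a relabeling scores the maximum on each. The pairwise loop in `rand_index` is quadratic in the number of samples and is meant for illustration only.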