· 446 · Chinese Journal of Engineering (工程科学学报), Vol. 42, No. 4
comparative analysis. In these comparisons, the hyperparameter C of L1SVM and L2SVM was set to 0.8; the RF used 50 trees, each with a maximum depth of 5; and the ELM used 30 hidden-layer nodes. Each experiment was run ten times, and the averages over the ten runs for each data set are reported in Table 3. The highest accuracy and the highest g-means value for each data set are shown in bold. The comparison shows that the proposed three-objective scheme achieves the highest accuracy on all ADHD data sets; its g-means is also the highest on NYU, Peking-1, and Peking-2, while ELM (on KKI) and B-SVM (on Peking-joint) obtain slightly higher g-means values.
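For reference, the two metrics reported in Table 3 can be computed as in the following minimal Python sketch. It assumes the usual definition of g-means as the geometric mean of the true positive rate and the true negative rate; the function name `accuracy_and_gmeans` and the label encoding (1 for positive, 0 for negative) are illustrative choices, not taken from the paper.

```python
import math


def accuracy_and_gmeans(y_true, y_pred, positive=1):
    """Return (accuracy, g-means) for a binary classification result.

    g-means is assumed here to be sqrt(TPR * TNR).
    """
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1

    total = tp + fn + tn + fp
    acc = (tp + tn) / total if total else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the positive class
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the negative class
    return acc, math.sqrt(tpr * tnr)


# Hypothetical imbalanced case: 50 positives, 950 negatives,
# and a classifier that labels every sample as negative.
y_true = [1] * 50 + [0] * 950
y_pred = [0] * 1000
acc, gmeans = accuracy_and_gmeans(y_true, y_pred)
print(acc, gmeans)
```

Because g-means multiplies the two per-class rates, a classifier that labels every sample as negative scores 0 on g-means even when its accuracy is high, which is why the measure is informative on imbalanced data.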
In addition to the ADHD data sets, the MNIST data set from the University of California Irvine (UCI) Machine Learning Repository was used to test the effectiveness of the proposed method. MNIST has 10 classes; we downsampled it and randomly selected 50 samples labeled "9" as positive samples, treating all other classes as negative samples, to construct a binary classification problem. The classification results are shown in Table 3 and indicate that the proposed method also performs well on general imbalanced data sets.

Table 3 Average accuracy/g-means value for different methods

| Data set | L1SVM | L2SVM | B-SVM | RF | ELM | T-SVM |
|---|---|---|---|---|---|---|
| KKI | 0.635/0.421 | 0.634/0.515 | 0.732/0.527 | 0.725/0.530 | 0.696/**0.622** | **0.753**/0.606 |
| NYU | 0.545/0.543 | 0.556/0.542 | 0.643/0.624 | 0.608/0.610 | 0.588/0.594 | **0.703**/**0.698** |
| Peking-1 | 0.725/0.683 | 0.714/0.664 | 0.801/0.677 | 0.770/0.688 | 0.677/0.647 | **0.813**/**0.711** |
| Peking-2 | 0.636/0.637 | 0.665/0.683 | 0.807/0.776 | 0.635/0.649 | 0.564/0.601 | **0.845**/**0.851** |
| Peking-joint | 0.630/0.615 | 0.624/0.611 | 0.742/**0.764** | 0.665/0.686 | 0.625/0.613 | **0.751**/0.743 |
| MNIST | 0.977/0.783 | 0.978/0.797 | 0.979/0.800 | 0.975/0.790 | 0.969/0.00 | **0.984**/**0.849** |

4 Conclusion

This paper proposed an ADHD data classification scheme based on a multi-objective support vector machine. The scheme uses a three-objective optimization model built on the 1-norm SVM that treats the empirical errors of the positive and negative samples separately, so that class imbalance is handled at the algorithm level. Solving the multi-objective optimization problem yields a set of representative Pareto-optimal classifiers from which the decision maker can choose. The scheme was tested on the ADHD-200 data set and compared with methods from the literature. The experimental results show that the proposed three-objective SVM scheme achieves higher average accuracy than the 1-norm SVM, 2-norm SVM, random forest, extreme learning machine, and bi-objective SVM on all tested data sets.
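Choosing among a set of Pareto-optimal classifiers, as described in the conclusion, relies on first discarding dominated candidates. The sketch below shows that filtering step in Python for objective vectors that are all minimized. The three-component tuples (a model-complexity term, the positive-class error, the negative-class error) and their numeric values are hypothetical and only illustrate the idea; this is not the paper's solver.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (complexity, positive-class error, negative-class error)
candidates = [
    (1.0, 5.0, 2.0),
    (1.5, 3.0, 2.5),
    (2.0, 5.0, 3.0),  # dominated by (1.0, 5.0, 2.0)
    (1.0, 6.0, 2.0),  # dominated by (1.0, 5.0, 2.0)
    (3.0, 1.0, 1.0),
]

front = pareto_front(candidates)
print(front)
```

Each remaining candidate trades one objective against another, so the final choice (for example, favoring a lower positive-class error) is left to the decision maker.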