No. 2    QI Xiaogang, et al.: Parallel anomaly detection algorithm based on MapReduce    ·229·

[Figure 5: running time (s) versus data set size for LOF and MR-DLOF]
Fig. 5 Efficiency comparison of algorithm

3.3 Scalability verification of the algorithm

To verify the scalability of the MR-DLOF algorithm, we enlarge the data set and compare the execution efficiency of MR-DLOF under different numbers of compute nodes. As Fig. 6 shows, for a given data set size, the execution efficiency of the algorithm improves as compute nodes are added to the cluster. MR-DLOF is therefore scalable: as the data set grows, its execution efficiency can be improved by adding compute nodes to the Hadoop cluster.

[Figure 6: running time (s) versus data set size for clusters of 2, 3, and 4 compute nodes]
Fig. 6 Execution efficiency under different calculating node numbers

4 Conclusion

By analyzing the shortcomings of the LOF algorithm, this paper designs a parallel anomaly detection algorithm based on MapReduce and LOF. The algorithm revises the definition of the k-nearest-neighbor distance (k-distance) so that no point has a reachability distance of zero and, consequently, an infinite local reachability density, which improves the effectiveness of the algorithm. To reduce the amount of computation, the data set is partitioned into blocks and the data points in each block are processed in parallel using the MapReduce framework, which greatly improves execution efficiency. Finally, experiments on real data sets verify the effectiveness, efficiency, and scalability of the algorithm.
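The degenerate case named in the conclusion (zero reachability distance, infinite local reachability density) can be reproduced with a few duplicated points. The sketch below is illustrative only: it follows the standard LOF definitions of k-distance, reachability distance, and local reachability density, and the `skip_coincident` option is an assumed stand-in for a revised k-distance, not the definition used by MR-DLOF, which this section does not state. All function and parameter names are hypothetical.

```python
import math


def k_distance(points, i, k, skip_coincident):
    """Distance from points[i] to its k-th nearest neighbor.

    With skip_coincident=True, neighbors at distance 0 are ignored when
    ranking, so the result is always positive. This is an ASSUMED variant
    for illustration, not the definition from the paper.
    """
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    if skip_coincident:
        dists = [d for d in dists if d > 0.0]
    return dists[k - 1]


def local_reachability_density(points, k, skip_coincident=False):
    """Standard LOF local reachability density for every point."""
    n = len(points)
    kd = [k_distance(points, i, k, skip_coincident) for i in range(n)]
    result = []
    for i in range(n):
        # k-distance neighborhood: every other point within kd[i] of points[i].
        neigh = [j for j in range(n)
                 if j != i and math.dist(points[i], points[j]) <= kd[i]]
        # reach-dist_k(p, o) = max(k-distance(o), d(p, o))
        total = sum(max(kd[j], math.dist(points[i], points[j])) for j in neigh)
        result.append(math.inf if total == 0.0 else len(neigh) / total)
    return result


if __name__ == "__main__":
    # Three coincident points plus three distinct ones, with k = 2.
    pts = [(0.0, 0.0)] * 3 + [(1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    print(local_reachability_density(pts, 2))                        # first value is inf
    print(local_reachability_density(pts, 2, skip_coincident=True))  # all values finite
```

With the standard definition, each duplicated point has a k-distance of 0, so every reachability distance in its neighborhood is 0 and the density diverges. Ranking only non-coincident neighbors keeps the k-distance positive, which is one simple way to obtain the finite densities the conclusion describes.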