网络新媒体技术, 2012

《电子商务 E-business》 reference material (big data): Research on Key Technologies of the Sea-Cloud Data System for Big Data (面向大数据的海云数据系统关键技术研究)


Figure 7. Visualization of the random forest model

... will have a far-reaching influence on development strategy. This paper has introduced the basic concepts and characteristics of big data and the scientific problems it raises, summarized preliminary work on the Chinese Academy of Sciences Strategic Priority Research Program project "Research on Key Technologies and System Development of the Sea-Cloud Data System" (海云数据系统关键技术研究与系统研制), and outlined directions for future research.

References

[1] Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung. The Google file system [C] // Proceedings of the 19th ACM Symposium on Operating Systems Principles, ACM, Bolton Landing, NY, 2003, 29-43
[2] Jeffrey Dean, Sanjay Ghemawat. MapReduce: simplified data processing on large clusters [C] // OSDI'04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, 2004, 137-150
[3] Clifford Lynch. Big data: How do your data grow? [J]. Nature, 2008, 455(7209): 28-29
[4] http://www.sciencemag.org/site/special/data/
[5] James Manyika, Michael Chui, Brad Brown, et al. Big data: The next frontier for innovation, competition, and productivity. 2011
[6] http://www-01.ibm.com/software/data/bigdata/
[7] Joshua Zhexue Huang, Michael K. Ng, Hongqiang Rong, et al. Automated variable weighting in k-means type clustering [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(5): 657-668
[8] Liping Jing, Michael K. Ng, Joshua Zhexue Huang. An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data [J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(8): 1026-1041
[9] http://lucene.apache.org/mahout/
[10] Wang Shan (王珊), Wang Huiju (王会举), Qin Xiongpai (覃雄派), Zhou Xuan (周烜). Architecting big data: challenges, studies and forecasts [J]. Chinese Journal of Computers (计算机学报), 2011, 34(10): 1741-1752 (in Chinese)
[11] Qin Xiongpai (覃雄派), Wang Huiju (王会举), Du Xiaoyong (杜小勇), Wang Shan (王珊). Big data analysis: competition and symbiosis of RDBMS and MapReduce [J]. Journal of Software (软件学报), 2012, 23(1): 32-45 (in Chinese)
[12] Leo Breiman. Random forests [J]. Machine Learning, 2001, 45(1): 5-32
[13] Baoxun Xu, Joshua Zhexue Huang, Graham Williams, et al. Classifying very high-dimensional data with random forests built from small subspaces [J]. International Journal of Data Warehousing and Mining, 2012, 8(2): 45-62
[14] Xiaojun Chen, Xiaofei Xu, Yunming Ye, et al. TW-k-means: automated two-level variable weighting clustering algorithm for multi-view data [J]. IEEE Transactions on Knowledge and Data Engineering, http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.262
[15] Xiaojun Chen, Yunming Ye, Xiaofei Xu, et al. A feature group weighting method for subspace clustering of high-dimensional data [J]. Pattern Recognition, 2012, 45(1): 434-446
[16] Bingguo Li, Xiaojun Chen, Mark Junjie Li, et al. Scalable random forests for massive data [C]. PAKDD, 2012

About the authors

Huang Zhexue (黄哲学), male, PhD, is a researcher at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. His research focuses on data mining and machine learning.

Cao Fuyuan (曹付元), male, PhD, is a postdoctoral researcher at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. His research focuses on data mining and machine learning.

Li Junjie (李俊杰), male, PhD, is an assistant researcher at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. His research focuses on data mining and machine learning.

Chen Xiaojun (陈小军), male, PhD, is an assistant researcher at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. His research focuses on data mining and machine learning.

