正在加载图片...
65) Briefly describe five techniques(or algorithms)that are used for classification modeling nswer Decision tree analysis. Decision tree analysis(a machine-learning technique)is arguably the most popular classification technique in the data mining arena. Statistical analysis. Statistical techniques were the primary classification algorithm for many years until the emergence of machine-learning techniques. Statistical classification techniques include logistic regression and discriminant analysis Neural networks. These are among the most popular machine-learning techniques that can be used for classification-type problems Bayesian classifiers. This approach probable category es to recognize commonalities in Case-based reasoning. This approach uses historical cas rder to assign a new case into the most ses probability theory to build classification models based on the past occurrences that are capable of placing a new instance into a most probable class(or category) Genetic algorithms. This approach uses the analogy of natural evolution to build directed search-based mechanisms to classify data samples Rough sets. This method takes into account the partial membership of class labels to predefined categories in building models(collection of rules) for classification problems Diff: 2 Page Ref: 219-220 66)Describe cluster analysis and some of its applications Answer: Cluster analysis is an exploratory data analysis tool for solving classification problems The objective is to sort cases(e.g, people, things, events)into groups, or clusters, so that the different clusters. Cluster analysis is an essential data mining method for classifying items, s of degree of association is strong among members of the same cluster and weak among members of events, or concepts into common groupings called clusters. The method is commonly used in biology, medicine, genetics, social network analysis, anthropology, archaeology, astronomy, character recognition, and even in Mis development. As data mining has increased in popularity. the underlying techniques have been applied to business, especially to marketing. Cluster analysis has been used extensively for fraud detection(both credit card and e-commerce fraud and market segmentation of customers in contemporary CRM systems Diff: 2 Page Ref: 225-226 67) In the data mining in Hollywood case study, how successful were the models in predicting the success or failure of a hollywood movie? Answer: The researchers claim that these prediction results are better than any reported in the published literature for this problem domain. Fusion classification methods attained up to 56.07%accuracy in correctly classifying movies and 90.75%accuracy in classifying movies within one category of their actual category. The SVm classification method attained up to 55. 49%accuracy in correctly classifying movies and 85.55% accuracy in classifying movies within one category of their actual category Diff:3 Page Ref: 233 Copyright C 2018 Pearson Education, Inc11 Copyright © 2018 Pearson Education, Inc. 65) Briefly describe five techniques (or algorithms) that are used for classification modeling. Answer: • Decision tree analysis. Decision tree analysis (a machine-learning technique) is arguably the most popular classification technique in the data mining arena. • Statistical analysis. Statistical techniques were the primary classification algorithm for many years until the emergence of machine-learning techniques. Statistical classification techniques include logistic regression and discriminant analysis. • Neural networks. These are among the most popular machine-learning techniques that can be used for classification-type problems. • Case-based reasoning. This approach uses historical cases to recognize commonalities in order to assign a new case into the most probable category. • Bayesian classifiers. This approach uses probability theory to build classification models based on the past occurrences that are capable of placing a new instance into a most probable class (or category). • Genetic algorithms. This approach uses the analogy of natural evolution to build directed￾search-based mechanisms to classify data samples. • Rough sets. This method takes into account the partial membership of class labels to predefined categories in building models (collection of rules) for classification problems. Diff: 2 Page Ref: 219-220 66) Describe cluster analysis and some of its applications. Answer: Cluster analysis is an exploratory data analysis tool for solving classification problems. The objective is to sort cases (e.g., people, things, events) into groups, or clusters, so that the degree of association is strong among members of the same cluster and weak among members of different clusters. Cluster analysis is an essential data mining method for classifying items, events, or concepts into common groupings called clusters. The method is commonly used in biology, medicine, genetics, social network analysis, anthropology, archaeology, astronomy, character recognition, and even in MIS development. As data mining has increased in popularity, the underlying techniques have been applied to business, especially to marketing. Cluster analysis has been used extensively for fraud detection (both credit card and e-commerce fraud) and market segmentation of customers in contemporary CRM systems. Diff: 2 Page Ref: 225-226 67) In the data mining in Hollywood case study, how successful were the models in predicting the success or failure of a Hollywood movie? Answer: The researchers claim that these prediction results are better than any reported in the published literature for this problem domain. Fusion classification methods attained up to 56.07% accuracy in correctly classifying movies and 90.75% accuracy in classifying movies within one category of their actual category. The SVM classification method attained up to 55.49% accuracy in correctly classifying movies and 85.55% accuracy in classifying movies within one category of their actual category. Diff: 3 Page Ref: 233
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有