Beihang University: Data Mining - Concepts and Techniques, course teaching resources (PPT lecture slides), Chapter 01 Introduction

Resource category: library; document format: PPT; pages: 71; file size: 616.5 KB
Data Mining: Concepts and Techniques
Chapter 1
Richong Zhang
Office: New Main Building, G521
Email: zhangrc@act.buaa.edu.cn
These slides are based on the slides provided by Jiawei Han, Micheline Kamber, and Jian Pei.
© 2012 Han, Kamber & Pei

Chapter 1. Introduction
◼ Why Data Mining?
◼ What Is Data Mining?
◼ A Multi-Dimensional View of Data Mining
◼ What Kinds of Data Can Be Mined?
◼ What Kinds of Patterns Can Be Mined?
◼ What Kinds of Technologies Are Used?
◼ What Kinds of Applications Are Targeted?
◼ Major Issues in Data Mining
◼ A Brief History of Data Mining and Data Mining Society
◼ Summary

Why Data Mining?
◼ The explosive growth of data: from terabytes to petabytes
  ◼ Data collection and data availability
    ◼ Automated data collection tools, database systems, Web, computerized society
  ◼ Major sources of abundant data
    ◼ Business: Web, e-commerce, transactions, stocks, …
    ◼ Science: remote sensing, bioinformatics, scientific simulation, …
    ◼ Society and everyone: news, digital cameras, YouTube
◼ We are drowning in data, but starving for knowledge!
◼ “Necessity is the mother of invention”: data mining is the automated analysis of massive data sets

Why Data Mining?
◼ Credit ratings/targeted marketing:
  ◼ Given a database of 100,000 names, which persons are the least likely to default on their credit cards?
  ◼ Identify likely responders to sales promotions
◼ Fraud detection:
  ◼ Which types of transactions are likely to be fraudulent, given the demographics and transactional history of a particular customer?
◼ Customer relationship management:
  ◼ Which of my customers are likely to be the most loyal, and which are most likely to leave for a competitor?
Data mining helps extract such information.

Data Mining
◼ Process of semi-automatically analyzing large databases to find patterns that are:
  ◼ valid: hold on new data with some certainty
  ◼ novel: non-obvious to the system
  ◼ useful: should be possible to act on the item
  ◼ understandable: humans should be able to interpret the pattern
◼ Also known as Knowledge Discovery in Databases (KDD)
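The "valid" criterion can be illustrated with a small sketch: a rule is measured on one part of the data and then re-checked on held-out rows. Everything here is an assumption for illustration, not taken from the slides: the synthetic records, the field names `late_payments` and `default`, the rule itself, and the helper `support_and_confidence`.

```python
import random

def support_and_confidence(rows, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    Here support is the fraction of rows matching the antecedent, and
    confidence is the fraction of those rows that also match the consequent.
    """
    matching = [r for r in rows if antecedent(r)]
    if not matching:
        return 0.0, 0.0
    hits = sum(1 for r in matching if consequent(r))
    return len(matching) / len(rows), hits / len(matching)

# Synthetic customer records (hypothetical data, generated for illustration).
rng = random.Random(0)
rows = []
for _ in range(1000):
    late_payments = rng.randint(0, 5)
    p_default = 0.1 + 0.15 * late_payments
    rows.append({"late_payments": late_payments,
                 "default": rng.random() < p_default})

# Split into data used to find the pattern and data used to validate it.
rng.shuffle(rows)
train, holdout = rows[:700], rows[700:]

# Candidate pattern: "three or more late payments -> default".
def rule_if(r):
    return r["late_payments"] >= 3

def rule_then(r):
    return r["default"]

train_sup, train_conf = support_and_confidence(train, rule_if, rule_then)
hold_sup, hold_conf = support_and_confidence(holdout, rule_if, rule_then)
print(f"train:   support={train_sup:.2f} confidence={train_conf:.2f}")
print(f"holdout: support={hold_sup:.2f} confidence={hold_conf:.2f}")
```

If the confidence on the held-out rows stays close to the confidence on the training rows, the pattern is plausibly valid in the sense above; a large drop would suggest it was an artifact of the sample.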

What Is Data Mining?
◼ Data mining (knowledge discovery from data)
  ◼ Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
◼ Alternative names
  ◼ Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
◼ Watch out: Is everything “data mining”?
  ◼ Simple search and query processing
  ◼ (Deductive) expert systems

What Is (Not) Data Mining?
What is data mining?
◼ Certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly, … in the Boston area)
◼ Group together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
What is not data mining?
◼ Look up a phone number in a phone directory
◼ Query a Web search engine for information about “Amazon”
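The contrast can be sketched in a few lines: a directory lookup retrieves a record for a known key, whereas the surname example summarizes a regularity across all records. The directory entries, phone numbers, and function names below are hypothetical and exist only to illustrate the distinction.

```python
from collections import Counter

# Hypothetical phone directory: (surname, city, phone).
directory = [
    ("O'Brien", "Boston", "555-0101"),
    ("O'Reilly", "Boston", "555-0102"),
    ("O'Rourke", "Boston", "555-0103"),
    ("Smith", "Boston", "555-0104"),
    ("Smith", "Houston", "555-0105"),
    ("Garcia", "Houston", "555-0106"),
    ("Garcia", "Houston", "555-0107"),
    ("O'Brien", "Houston", "555-0108"),
]

def lookup_phone(surname, city):
    """Not data mining: retrieve the records that match a known key."""
    return [phone for s, c, phone in directory if s == surname and c == city]

def prefix_share_by_city(prefix):
    """Closer to data mining: how prevalent is a surname prefix in each city?"""
    totals = Counter(c for _, c, _ in directory)
    matches = Counter(c for s, c, _ in directory if s.startswith(prefix))
    return {city: matches[city] / totals[city] for city in totals}

print(lookup_phone("Smith", "Boston"))
print(prefix_share_by_city("O'"))
```

The first call answers a question whose answer is stored explicitly in the data; the second surfaces a regional pattern that no single record states.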

Applications
◼ Banking: loan/credit card approval
  ◼ predict good customers based on old customers
◼ Customer relationship management:
  ◼ identify those who are likely to leave for a competitor
◼ Targeted marketing:
  ◼ identify likely responders to promotions
◼ Fraud detection: telecommunications, financial transactions
  ◼ from an online stream of events, identify fraudulent events
◼ Manufacturing and production:
  ◼ automatically adjust knobs when process parameters change
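As one possible reading of the fraud-detection bullet, the sketch below scans a stream of transaction amounts and flags values that deviate strongly from the running mean. This is a simple z-score heuristic, not a method named in the slides; the threshold, the warm-up length, the sample amounts, and all names are assumptions made for illustration.

```python
import math

class RunningStats:
    """Welford's online algorithm for the mean and sample variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

def flag_outliers(amounts, threshold=3.0, warmup=5):
    """Return indices of amounts far from the mean of the events seen so far."""
    stats = RunningStats()
    flagged = []
    for i, amount in enumerate(amounts):
        # Score each event against history only, before adding it to the stats.
        if stats.n >= warmup:
            std = stats.std()
            if std > 0 and abs(amount - stats.mean) / std > threshold:
                flagged.append(i)
        # Note: flagged events still enter the statistics, which can mask
        # later outliers; a real system would likely handle this differently.
        stats.update(amount)
    return flagged

# Hypothetical stream of transaction amounts.
amounts = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 950.0, 23.0, 20.0]
suspicious = flag_outliers(amounts)
print(suspicious)
```

Each event is processed once and only three numbers are kept in memory, which is what makes this style of check usable on an online stream.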

Applications (continued)
◼ Medicine: disease outcome, effectiveness of treatments
  ◼ analyze patient disease history: find relationships between diseases
◼ Molecular/Pharmaceutical: identify new drugs
◼ Scientific data analysis:
  ◼ identify new galaxies by searching for subclusters
◼ Web site/store design and promotion:
  ◼ find affinity of visitors to pages and modify layout
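The page-affinity idea in the last bullet can be approximated by counting how often two pages are visited in the same session. The sketch below is a minimal co-occurrence count under assumed data; the session logs, page names, and the function `page_pair_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical visitor sessions: the pages viewed in each visit.
sessions = [
    ["home", "laptops", "laptop-bags"],
    ["home", "laptops", "laptop-bags", "checkout"],
    ["home", "phones"],
    ["laptops", "laptop-bags"],
    ["home", "phones", "checkout"],
]

def page_pair_counts(sessions):
    """Count, for every unordered pair of pages, the sessions containing both."""
    counts = Counter()
    for pages in sessions:
        # set() ignores repeat views; sorted() gives each pair a canonical order.
        for pair in combinations(sorted(set(pages)), 2):
            counts[pair] += 1
    return counts

pair_counts = page_pair_counts(sessions)
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

Pairs with high counts are candidates for cross-linking or placing near each other in the layout; raw counts favor popular pages, so a real analysis would probably normalize them.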
