Tongji University: Big Data Analysis and Mining, course teaching resources (PPT lecture slides): Association Rule

Resource type: document library; format: PPTX; pages: 53; file size: 392.64 KB
Big Data Analysis and Mining: Association Rule
Qinpei Zhao (赵钦佩), qinpeizhao@tongji.edu.cn
2015 Fall
2021/1/27

Frequent Pattern Analysis
◼ Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set
◼ First proposed by Agrawal, Imielinski, and Swami in the context of frequent itemsets and association rule mining
◼ Motivation: finding inherent regularities in data
  ◆ What products were often purchased together? Beer and diapers?
  ◆ What are the subsequent purchases after buying a PC?

Association Rule Discovery
◼ Supermarket shelf management: the market-basket model
◼ Goal:
  ◆ Identify items that are bought together by sufficiently many customers
◼ Approach:
  ◆ Process the sales data collected with barcode scanners to find dependencies among items
◼ A classic rule:
  ◆ If someone buys diapers and milk, then he/she is likely to buy beer
  ◆ Don't be surprised if you find six-packs next to diapers!

Applications – (1)
◼ Items = products; Baskets = sets of products someone bought in one trip to the store
◼ Real market baskets: chain stores keep terabytes of data about what customers buy together
  ◆ Tells how typical customers navigate stores, which lets the stores position tempting items
  ◆ Suggests tie-in "tricks", e.g., run a sale on diapers and raise the price of beer
  ◆ The rule needs to occur frequently to be useful
◼ Amazon's "people who bought X also bought Y"

Applications – (2)
◼ Baskets = sentences; Items = documents containing those sentences
  ◆ Items that appear together too often could represent plagiarism
  ◆ Notice that items do not have to be "in" baskets
◼ Baskets = patients; Items = drugs and side-effects
  ◆ Has been used to detect combinations of drugs that result in particular side-effects
  ◆ But requires an extension: the absence of an item needs to be observed as well as its presence
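The sentences-as-baskets idea can be sketched in Python as follows. This is only a toy illustration: the three documents, their sentences, the `min_shared` threshold and the function name `suspicious_pairs` are all assumptions made for the example and do not come from the slides. Each sentence becomes a basket holding the documents that contain it, and pairs of documents that share many baskets are flagged.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy corpus: document name -> sentences it contains.
docs = {
    "A": ["the cat sat", "dogs bark loudly", "rain falls"],
    "B": ["the cat sat", "dogs bark loudly", "sun shines"],
    "C": ["rain falls", "wind blows"],
}

def suspicious_pairs(docs, min_shared):
    """Return document pairs that co-occur in at least min_shared baskets."""
    # Basket = sentence; items = the documents containing that sentence.
    baskets = defaultdict(set)
    for name, sentences in docs.items():
        for sentence in sentences:
            baskets[sentence].add(name)

    # Count how many baskets each pair of documents shares.
    pair_counts = Counter()
    for items in baskets.values():
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1

    return sorted(pair for pair, c in pair_counts.items() if c >= min_shared)

print(suspicious_pairs(docs, min_shared=2))
```

A real checker would also need to normalize sentences (case, punctuation, paraphrase) before comparing them; exact string matching is used here only to keep the sketch short.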

[Figure: screenshot of an online plagiarism checker comparing an original passage with an alternate passage and reporting the matches found]

Transaction data: a set of documents
◼ A text document data set. Each document is treated as a "bag" of keywords
  doc1: Student, Teach, School
  doc2: Student, School
  doc3: Teach, School, City, Game
  doc4: Baseball, Basketball
  doc5: Basketball, Player, Spectator
  doc6: Baseball, Coach, Game, Team
  doc7: Basketball, Team, City, Game
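One possible way to hold this data set in code is one Python set per document, which matches the "bag of keywords" view in which order and repetition do not matter. The variable names `transactions`, `items` and `doc_frequency` are assumptions made for this sketch.

```python
from collections import Counter

# The seven keyword "bags" from the slide, one set per document.
transactions = {
    "doc1": {"Student", "Teach", "School"},
    "doc2": {"Student", "School"},
    "doc3": {"Teach", "School", "City", "Game"},
    "doc4": {"Baseball", "Basketball"},
    "doc5": {"Basketball", "Player", "Spectator"},
    "doc6": {"Baseball", "Coach", "Game", "Team"},
    "doc7": {"Basketball", "Team", "City", "Game"},
}

# The item universe is the union of all keywords seen in any document.
items = set().union(*transactions.values())

# Number of documents in which each keyword appears.
doc_frequency = Counter(k for bag in transactions.values() for k in bag)

print(len(transactions), "documents,", len(items), "distinct keywords")
print("Game appears in", doc_frequency["Game"], "documents")
```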

The model: rules
◼ A transaction t contains X, a set of items (itemset) in I, if X ⊆ t.
◼ An association rule is an implication of the form X → Y, where X, Y ⊂ I, and X ∩ Y = ∅
◼ An itemset is a set of items.
  ◆ E.g., X = {milk, bread, cereal} is an itemset.
◼ A k-itemset is an itemset with k items.
  ◆ E.g., {milk, bread, cereal} is a 3-itemset
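These definitions map directly onto set operations. The sketch below uses Python's `frozenset`; the example transaction, the item universe `I` and the helper names `contains` and `is_valid_rule` are assumptions introduced for illustration, and the subset test is non-strict for simplicity.

```python
# Hypothetical example values; only the definitions come from the slide.
I = frozenset({"milk", "bread", "cereal", "eggs", "beer"})  # all items
t = frozenset({"milk", "bread", "cereal", "eggs"})          # a transaction
X = frozenset({"milk", "bread", "cereal"})                  # an itemset

def contains(transaction, itemset):
    """A transaction t contains X if every item of X is in t."""
    return itemset <= transaction

def is_valid_rule(X, Y, I):
    """X -> Y is well formed if X and Y draw from I and share no item."""
    return X <= I and Y <= I and X.isdisjoint(Y)

print(contains(t, X))   # t holds all three items of X
print(len(X))           # k for this k-itemset
print(is_valid_rule(frozenset({"milk", "bread"}), frozenset({"cereal"}), I))
```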

Rule strength measures
◼ Support: The rule holds with support sup in T (the transaction data set) if sup % of transactions contain X ∪ Y.
  ◆ sup = Pr(X ∪ Y), i.e., the probability that a transaction contains every item of both X and Y
◼ Confidence: The rule holds in T with confidence conf if conf % of transactions that contain X also contain Y.
  ◆ conf = Pr(Y | X)
◼ An association rule is a pattern that states that when X occurs, Y occurs with a certain probability
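A minimal Python sketch of the two measures follows, under stated assumptions: the five market baskets in `T` are made up for illustration, the function names are hypothetical, and returning 0.0 when no transaction contains X is a convention chosen here rather than something the slides specify.

```python
def support(X, Y, T):
    """Fraction of transactions in T that contain every item of X and Y."""
    both = X | Y
    return sum(1 for t in T if both <= t) / len(T)

def confidence(X, Y, T):
    """Among transactions containing X, the fraction that also contain Y."""
    both = X | Y
    n_x = sum(1 for t in T if X <= t)
    n_xy = sum(1 for t in T if both <= t)
    # Assumption: define confidence as 0.0 when X never occurs.
    return n_xy / n_x if n_x else 0.0

# Hypothetical baskets, not taken from the slides.
T = [
    {"diapers", "milk", "beer"},
    {"diapers", "milk"},
    {"diapers", "milk", "beer", "bread"},
    {"milk", "bread"},
    {"beer"},
]
X, Y = {"diapers", "milk"}, {"beer"}

print(support(X, Y, T))      # 2 of 5 baskets contain diapers, milk and beer
print(confidence(X, Y, T))   # 2 of the 3 baskets with diapers and milk have beer
```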

Support and Confidence
◼ Support count: The support count of an itemset X, denoted by X.count, in a data set T is the number of transactions in T that contain X. Assume T has n transactions.
◼ Then,
  $$\mathrm{support} = \frac{(X \cup Y).\mathrm{count}}{n}$$
  $$\mathrm{confidence} = \frac{(X \cup Y).\mathrm{count}}{X.\mathrm{count}}$$
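As a worked check of these formulas, take the seven-document keyword data set shown earlier and the rule {School} → {Student}; the rule is chosen here only for illustration. School appears in doc1, doc2 and doc3, and School and Student appear together in doc1 and doc2, so with X = {School} and Y = {Student}:

```latex
\begin{align*}
n &= 7, \qquad X.\mathrm{count} = 3, \qquad (X \cup Y).\mathrm{count} = 2 \\
\mathrm{support} &= \frac{(X \cup Y).\mathrm{count}}{n} = \frac{2}{7} \approx 0.286 \\
\mathrm{confidence} &= \frac{(X \cup Y).\mathrm{count}}{X.\mathrm{count}} = \frac{2}{3} \approx 0.667
\end{align*}
```

Reversing the rule to {Student} → {School} leaves the support at 2/7 but raises the confidence to 2/2 = 1, since every document containing Student also contains School. Support is symmetric in X and Y; confidence is not.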
