
Chongqing University: Data Warehouse and Data Mining, course lecture slides (English version), Chapter 6: Advanced Frequent Pattern Mining

Resource type: library document; format: PPT; pages: 64; file size: 2.25 MB
◼ Pattern Mining: A Road Map
◼ Pattern Mining in Multi-Level, Multi-Dimensional Space
◼ Constraint-Based Frequent Pattern Mining
◼ Mining High-Dimensional Data and Colossal Patterns
◼ Mining Compressed or Approximate Patterns
◼ Pattern Exploration and Application
◼ Summary

Research on Pattern Mining: A Road Map

◼ Kinds of Patterns and Rules
  ◼ Basic patterns: frequent pattern; association rule; closed/max pattern; generator
  ◼ Multilevel and multidimensional patterns: multilevel (uniform, varied, or itemset-based support); multidimensional pattern (incl. high-dimensional pattern); continuous data (discretization-based, or statistical)
  ◼ Extended patterns: approximate pattern; uncertain pattern; compressed pattern; rare pattern/negative pattern; high-dimensional and colossal patterns
◼ Mining Methods
  ◼ Basic mining methods: candidate generation (Apriori, partitioning, sampling, ...); pattern growth (FPgrowth, HMine, FPMax, Closet+, ...); vertical format (Eclat, CHARM, ...)
  ◼ Mining interesting patterns: interestingness (subjective vs. objective); constraint-based mining; correlation rules; exception rules
  ◼ Distributed, parallel and incremental: distributed/parallel mining; incremental mining; stream pattern
◼ Extensions and Applications
  ◼ Extended data types: sequential and time-series patterns; structural (e.g., tree, lattice, graph) patterns; spatial (e.g., co-location) pattern; temporal (evolutionary, periodic); image, video and multimedia patterns; network patterns
  ◼ Applications: pattern-based classification; pattern-based clustering; pattern-based semantic annotation; collaborative filtering; privacy-preserving

Pattern Mining in Multi-Level, Multi-Dimensional Space
◼ Mining Multi-Level Association
◼ Mining Multi-Dimensional Association
◼ Mining Quantitative Association Rules
◼ Mining Rare Patterns and Negative Patterns

Figure 7.2 Concept hierarchy for AllElectronics computer items. [Figure omitted. Node labels by level: Computer, Software, Printer and Camera, Computer Accessory; Laptop, Desktop, Office, Antivirus, Printer, Digital Camera, Wrist Pad, Mouse; brands including IBM, Dell, Microsoft, Canon, Fellowes, LogiTech.]

Figure 7.3 Multilevel mining with uniform support. [Figure omitted. Level 1 (min_sup = 5%): computer [support = 10%]. Level 2 (min_sup = 5%): laptop computer [support = 6%], desktop computer [support = 4%].]

Figure 7.4 Multilevel mining with reduced support. [Figure omitted. Level 1 (min_sup = 5%): computer [support = 10%]. Level 2 (min_sup = 3%): laptop computer [support = 6%], desktop computer [support = 4%].]

Mining Multiple-Level Association Rules
◼ Items often form hierarchies
◼ Flexible support settings
  ◼ Items at the lower level are expected to have lower support
◼ Exploration of shared multi-level mining (Agrawal & Srikant@VLDB'95, Han & Fu@VLDB'95)
◼ Example: Milk [support = 10%] at level 1; 2% Milk [support = 6%] and Skim Milk [support = 4%] at level 2
  ◼ Uniform support: Level 1 min_sup = 5%, Level 2 min_sup = 5% (Skim Milk falls below the threshold)
  ◼ Reduced support: Level 1 min_sup = 5%, Level 2 min_sup = 3% (both level-2 items are frequent)
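The difference between the two support settings can be sketched in a few lines of Python. This is a minimal illustration, not a full multilevel miner: the support values come from the milk example on the slide, while the dictionary layout and the helper name `frequent_items` are assumptions made for this sketch.

```python
# Supports from the slide's milk example; the data layout is illustrative only.
SUPPORT = {
    1: {"milk": 0.10},
    2: {"2% milk": 0.06, "skim milk": 0.04},
}


def frequent_items(support_by_level, min_sup_by_level):
    """Return, per level, the items whose support meets that level's threshold."""
    result = {}
    for level, items in support_by_level.items():
        threshold = min_sup_by_level[level]
        result[level] = sorted(
            name for name, sup in items.items() if sup >= threshold
        )
    return result


# Uniform support: the same 5% threshold at every level.
uniform = frequent_items(SUPPORT, {1: 0.05, 2: 0.05})
# Reduced support: the threshold drops to 3% at level 2.
reduced = frequent_items(SUPPORT, {1: 0.05, 2: 0.03})

print("uniform:", uniform)
print("reduced:", reduced)
```

With a uniform threshold, skim milk (4%) is pruned at level 2; lowering the level-2 threshold to 3% keeps it.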

Multi-level Association: Flexible Support and Redundancy Filtering
◼ Flexible min-support thresholds: some items are more valuable but less frequent
  ◼ Use non-uniform, group-based min-support
  ◼ E.g., {diamond, watch, camera}: 0.05%; {bread, milk}: 5%; …
◼ Redundancy filtering: some rules may be redundant due to "ancestor" relationships between items
  ◼ milk ⇒ wheat bread [support = 8%, confidence = 70%]
  ◼ 2% milk ⇒ wheat bread [support = 2%, confidence = 72%]
  ◼ The first rule is an ancestor of the second rule
◼ A rule is redundant if its support is close to the "expected" value, based on the rule's ancestor
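The redundancy test can be sketched as a comparison between a rule's actual support and the support expected from its ancestor. The slide does not say what fraction of milk sales is 2% milk, so the share of 1/4 below is an assumption chosen for illustration, as are the function names and the 10% relative tolerance.

```python
def expected_support(ancestor_support, descendant_share):
    """Support expected for the descendant rule if it merely inherits
    the ancestor's behaviour in proportion to its share of the ancestor item."""
    return ancestor_support * descendant_share


def is_redundant(actual_support, ancestor_support, descendant_share, tolerance=0.1):
    """True when the actual support is within a relative tolerance of the
    expected support (the tolerance value is an assumption of this sketch)."""
    expected = expected_support(ancestor_support, descendant_share)
    return abs(actual_support - expected) <= tolerance * expected


# Ancestor rule: milk => wheat bread, support 8%.
# Descendant rule: 2% milk => wheat bread, support 2%.
# ASSUMPTION: 2% milk accounts for one quarter of all milk sales.
print(is_redundant(0.02, ancestor_support=0.08, descendant_share=0.25))
```

Under that assumed share, the expected support is 8% × 1/4 = 2%, which matches the observed 2%, so the descendant rule adds no new information.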

Mining Multi-Dimensional Association
◼ Single-dimensional rules: buys(X, "milk") ⇒ buys(X, "bread")
◼ Multi-dimensional rules: ≥ 2 dimensions or predicates
  ◼ Inter-dimension association rules (no repeated predicates): age(X, "19-25") ∧ occupation(X, "student") ⇒ buys(X, "coke")
  ◼ Hybrid-dimension association rules (repeated predicates): age(X, "19-25") ∧ buys(X, "popcorn") ⇒ buys(X, "coke")
◼ Categorical attributes: finite number of possible values, no ordering among values; data cube approach
◼ Quantitative attributes: numeric, implicit ordering among values; discretization, clustering, and gradient approaches
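The distinction between the three rule kinds depends only on which predicates appear and whether any repeats. A small sketch, assuming a rule is represented as two lists of (predicate, value) pairs (this representation and the name `rule_kind` are hypothetical, chosen for the example):

```python
from collections import Counter


def rule_kind(antecedent, consequent):
    """Classify a rule given as lists of (predicate, value) pairs."""
    counts = Counter(name for name, _ in antecedent + consequent)
    if len(counts) == 1:
        # Only one predicate (e.g. buys) appears anywhere in the rule.
        return "single-dimensional"
    if any(n > 1 for n in counts.values()):
        # Several predicates, at least one of them repeated.
        return "hybrid-dimension"
    # Several predicates, each used exactly once.
    return "inter-dimension"


single = rule_kind([("buys", "milk")], [("buys", "bread")])
inter = rule_kind([("age", "19-25"), ("occupation", "student")], [("buys", "coke")])
hybrid = rule_kind([("age", "19-25"), ("buys", "popcorn")], [("buys", "coke")])

print(single, inter, hybrid)
```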

Mining Quantitative Associations
Techniques can be categorized by how numerical attributes, such as age or salary, are treated:
1. Static discretization based on predefined concept hierarchies (data cube methods)
2. Dynamic discretization based on data distribution (quantitative rules, e.g., Agrawal & Srikant@SIGMOD'96)
3. Clustering: distance-based association (e.g., Yang & Miller@SIGMOD'97)
   ◼ One-dimensional clustering, then association
4. Deviation (e.g., Aumann and Lindell@KDD'99): Sex = female ⇒ Wage: mean = $7/hr (overall mean = $9)
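As an illustration of the first technique, static discretization, the sketch below maps a numeric age onto predefined ranges so that it can be treated like a categorical item, then measures support and confidence for one rule. The age ranges, the toy records, and all function names are assumptions made for this example; only the "19-25" range and the predicate names echo the slides.

```python
# ASSUMPTION: predefined age ranges for integer ages; only "19-25" appears in the slides.
AGE_BINS = [(0, 18, "0-18"), (19, 25, "19-25"), (26, 40, "26-40"), (41, 200, "41+")]


def discretize_age(age):
    """Map an integer age to the label of its predefined range."""
    for low, high, label in AGE_BINS:
        if low <= age <= high:
            return label
    raise ValueError(f"age out of range: {age}")


def to_items(record):
    """Turn one record into a set of attribute=value items."""
    return {
        f"age={discretize_age(record['age'])}",
        f"occupation={record['occupation']}",
        f"buys={record['buys']}",
    }


# Toy data invented for illustration.
RECORDS = [
    {"age": 20, "occupation": "student", "buys": "coke"},
    {"age": 23, "occupation": "student", "buys": "coke"},
    {"age": 24, "occupation": "student", "buys": "popcorn"},
    {"age": 35, "occupation": "engineer", "buys": "coke"},
]


def support_and_confidence(records, antecedent, consequent):
    """Support and confidence of antecedent => consequent over the records."""
    itemsets = [to_items(r) for r in records]
    n_antecedent = sum(antecedent <= s for s in itemsets)
    n_both = sum((antecedent | consequent) <= s for s in itemsets)
    support = n_both / len(itemsets)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


support, confidence = support_and_confidence(
    RECORDS, {"age=19-25", "occupation=student"}, {"buys=coke"}
)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Because the ranges are fixed in advance, the result depends on how well they fit the data; the dynamic and clustering approaches in the list choose the intervals from the data distribution instead.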
