CAAI Transactions on Intelligent Systems, Vol. 15, No. 2, Mar. 2020
DOI: 10.11992/tis.201809030
Published online at: http://kns.cnki.net/kcms/detail/23.1538.TP.20190513.1210.002.html

Adaptive-association-rule mining algorithm based on determination coefficient

WANG Xueping1, LIN Jiaxiang1, WU Jianwei2, GAO Minjie1
(1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 2. Third Institute of Oceanography, Ministry of Natural Resources, Xiamen 361001, China)

Abstract: Two-stage association rule mining, built around frequent itemset generation followed by rule generation, requires the user to specify the minimum support and minimum confidence thresholds manually from prior knowledge. To overcome this defect, this paper proposes an algorithm called AARM_BR (Adaptation Association Rule Mining Based on Determination Coefficient R²). Taking support counts and confidence values as input, the algorithm applies curve fitting and uses the coefficient of determination to select the degree of the fitted curve and the corresponding polynomial automatically, from which the support and confidence thresholds are then determined. Because AARM_BR is driven by the data, both thresholds are obtained without manual input. Experiments on two standard datasets, Trolley and Groceries, show that, compared with a recently published method, the proposed algorithm is more data-dependent and, when the user has no prior knowledge, determines the polynomial order and the support and confidence thresholds without manual specification.

Keywords: association rule; order; adaptive; coefficient of determination; rule; support; confidence; curve fitting; polynomial; data mining

CLC number: TP391   Document code: A   Article ID: 1673-4785(2020)02-0352-08

Citation: WANG Xueping, LIN Jiaxiang, WU Jianwei, et al. Adaptive-association-rule mining algorithm based on determination coefficient[J]. CAAI transactions on intelligent systems, 2020, 15(2): 352-359.

Received: 2018-09-15. Published online: 2019-05-14.
Funding: National Natural Science Foundation of China (41401458); Natural Science Foundation of Fujian Province (2018J01644, 2018J01645, 2016J01753); China-ASEAN Maritime Cooperation Fund (2020399); Project of the Third Institute of Oceanography, State Oceanic Administration (2016020); Fujian Province Education and Research Project for Young and Middle-aged Teachers (JT180129).
Corresponding author: WANG Xueping, E-mail: gggfvgu@163.com.

Association rule mining is one of the important tasks in data mining research. Its goal is to discover hidden, meaningful associations in transactional datasets. It has been widely applied to market basket analysis, network intrusion detection, association rule classification, traffic accident pattern analysis, drug component association analysis, and patient syndrome-type determination [1-3].
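The abstract describes selecting the degree of a fitted polynomial from the coefficient of determination R². The sketch below illustrates only that general idea with NumPy, and it is not the paper's AARM_BR procedure. Several things in it are assumptions made for illustration: the function names `r_squared` and `select_degree`, the parameters `max_degree` and `r2_target`, the stopping rule (take the lowest degree whose R² reaches the target), and the sample support counts. How AARM_BR actually derives the support and confidence thresholds from the fitted polynomial is not stated in this excerpt, so the sketch does not attempt it.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot.

    Assumes y is not constant (otherwise SS_tot is zero).
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return 1.0 - ss_res / ss_tot


def select_degree(x, y, max_degree=6, r2_target=0.95):
    """Return (degree, coefficients, R^2) for the lowest degree reaching r2_target.

    The stopping rule is an illustrative assumption. If no degree up to
    max_degree reaches the target, the best-scoring degree is returned.
    """
    best = None
    for degree in range(1, max_degree + 1):
        coeffs = np.polyfit(x, y, degree)            # least-squares polynomial fit
        r2 = r_squared(y, np.polyval(coeffs, x))     # goodness of fit for this degree
        if best is None or r2 > best[2]:
            best = (degree, coeffs, r2)
        if r2 >= r2_target:
            return degree, coeffs, r2
    return best


# Hypothetical support counts of candidate itemsets, sorted in descending order.
support_counts = np.array([120, 95, 80, 61, 50, 42, 30, 26, 19, 15, 11, 8], dtype=float)
ranks = np.arange(1, len(support_counts) + 1, dtype=float)

degree, coeffs, r2 = select_degree(ranks, support_counts)
print(f"selected degree = {degree}, R^2 = {r2:.4f}")
```

Choosing the lowest degree that reaches the target keeps the fitted curve simple. Raising `r2_target` trades that simplicity for a closer fit, and the same routine could be applied to a sequence of confidence values.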