How Far We Have Progressed in the Journey? An Examination of CPDP 1:3
ACM Transactions on Software Engineering and Methodology, Vol. 27, No. 1, Article 1. Pub. date: April 2018.

size [20]. This indicates that module size may have a capability in defect prediction similar to code metrics, the most commonly used predictors in the existing CPDP models. An interesting question hence arises: “If the prediction performance is the goal, then how do the existing CPDP models perform compared with ManualDown and ManualUp?” This question is important for both practitioners and researchers. For practitioners, it will help to determine whether it is worth applying the existing CPDP models in practice. If simple module size models perform similarly or even better, then there is no practical reason to apply complex CPDP models. For researchers, if simple module size models perform similarly or even better, then there is a strong need to improve the predictive power of the existing CPDP models. Otherwise, the motivation for applying those CPDP models cannot be well justified. To the best of our knowledge, only two of the existing CPDP studies compared the proposed CPDP models against simple module size models [9, 12].
In other words, for most of the existing CPDP models, it remains unclear whether they have a superior prediction performance compared with simple module size models.

In this article, we investigate how far we have really progressed in the journey by comparing the defect prediction performance of the existing CPDP models with that of simple module size models. We want to know not only whether the difference in defect prediction performance is statistically significant but also whether it is of practical importance. In our study, we take two measures to ensure a fair comparison. On the one hand, we collect the performance values for the simple module size models using the same publicly available datasets and the same performance indicators as those used in the existing CPDP studies. On the other hand, we do not attempt to re-implement the CPDP models to collect their prediction performance values. Instead, we use the prediction performance values reported in the original CPDP studies to conduct the comparison, which avoids implementation bias. These measures ensure that we can draw a reliable conclusion on the practical benefits of the existing CPDP models relative to simple module size models in defect prediction performance.

Under the above experimental settings, we perform an extensive comparison between the existing CPDP models and simple module size models for defect prediction. Surprisingly, our experimental results show that simple module size models have a prediction performance comparable or even superior to most of the existing CPDP models in the literature, including many newly proposed models. Consequently, for practitioners, it would be better to apply simple module size models rather than the existing CPDP models to predict defects in a target project. This is especially true when taking into account the application cost (including metrics collection cost and model building cost).
The results reveal that, if the prediction performance is the goal, the current progress in CPDP studies falls short of what might have been envisaged. We hence strongly recommend that future CPDP studies use simple module size models as the baseline models to be compared against. The benefits of using a baseline model are two-fold [112]. On the one hand, this would ensure that the predictive power of a newly proposed CPDP model could be adequately compared and assessed. On the other hand, “the ongoing use of a baseline model in the literature would give a single point of comparison”. This will allow a meaningful assessment of any new CPDP model against previous CPDP models.

The rest of this article is organized as follows. Section 2 introduces the background on cross-project defect prediction, including the problem studied, the general framework, the performance evaluation indicators, and the state of progress. Section 3 describes the experimental design, including the simple module size model, the research questions, the datasets, and the data analysis method. Section 4 presents the experimental results in detail. Section 5 compares the simple module size models with related work. Section 6 discusses the results and implications. Section 7 analyzes the threats to the validity of our study. Section 8 concludes the article and outlines directions for future work.
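
To give a sense of how little machinery a simple module size model needs, the following is a minimal sketch, not the exact procedure used in the experiments. It assumes that ManualDown treats larger modules as more defect-prone and ManualUp treats smaller modules as more defect-prone, with module size measured in lines of code. The function names, the `cutoff` parameter, and its default of 0.5 are illustrative assumptions; Section 3 gives the actual definition.

```python
from typing import Dict, List


def rank_by_size(sizes: Dict[str, int], largest_first: bool) -> List[str]:
    """Order module names by size; ties are broken by name for determinism."""
    sign = -1 if largest_first else 1
    return sorted(sizes, key=lambda name: (sign * sizes[name], name))


def manual_down(sizes: Dict[str, int], cutoff: float = 0.5) -> List[str]:
    """Assumed ManualDown: flag the largest modules as defect-prone.

    `cutoff` is a hypothetical parameter: the fraction of the ranked
    list that is classified as defective.
    """
    ranked = rank_by_size(sizes, largest_first=True)
    return ranked[: int(len(ranked) * cutoff)]


def manual_up(sizes: Dict[str, int], cutoff: float = 0.5) -> List[str]:
    """Assumed ManualUp: flag the smallest modules as defect-prone."""
    ranked = rank_by_size(sizes, largest_first=False)
    return ranked[: int(len(ranked) * cutoff)]


if __name__ == "__main__":
    # Hypothetical target project: module name -> lines of code.
    target = {"a.c": 120, "b.c": 40, "c.c": 300, "d.c": 15}
    print("ManualDown:", manual_down(target))
    print("ManualUp:  ", manual_up(target))
```

Neither function uses any training data from a source project, which is why such models carry almost no metrics collection or model building cost.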