TABLE 6
Studied Projects and Version Information

                        Subject release                     Common functions           Common functions           Previous release
                                                            before preprocessing       after preprocessing
System        Version   Release date   KSLOC      #function   #faulty function      #function   #faulty function      Version   Release date
Bash          3.0       03-08-2004     55         1476        43                    1403        40                    2.05b     17-07-2002
Gcc-core      3.4.0     18-04-2004     411        6139        219                   6066        210                   3.3       14-05-2003
Gimp          2.0.0     23-03-2004     434        12110       469                   11521       447                   1.3.0     13-11-2001
Subversion    1.2.0     23-05-2005     181        2350        30                    2003        29                    1.1.0     29-09-2004
Vim           6.2       01-07-2003     123        2400        407                   2342        398                   6.1       24-03-2002

most UNIX systems. We choose these five projects as the subjects of our study for the following reasons: (1) their patch files or bug-fixing release versions are publicly available, thus allowing us to collect post-release fault data at the function level; (2) they can be successfully analyzed by the value analysis plug-in in the code analysis tool Frama-C [57], thus enabling us to collect slice-based cohesion metrics; (3) they have a moderate percentage of post-release faulty functions, which is suitable for our experiments; and (4) they are non-trivial software systems belonging to different problem domains.
In our study, we collect the data for Bash 3.0, Gcc-core 3.4.0, Gimp 2.0.0, Subversion 1.2.0, and Vim 6.2. We use these five systems to evaluate the prediction effectiveness of post-release fault-proneness prediction models under the cross-validation and across-project prediction methods. It is easy to see that, under the across-project prediction method, each prediction model will produce 5 × (5 − 1) = 20 prediction effectiveness values. Furthermore, we collect the data for Bash 3.1, 3.2, 4.0, 4.1, 4.2, and 4.3. Note that Bash 4.3, released on 26 February 2014, is the latest version of the Bash system to date. We use Bash 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, and 4.3 to evaluate the prediction effectiveness of post-release fault-proneness prediction models under the across-version prediction method. Under the first across-version prediction approach (i.e., next-version prediction), each prediction model will produce 7 − 1 = 6 prediction effectiveness values. Under the second across-version prediction approach (i.e., follow-up-version prediction), each prediction model will produce 7 × (7 − 1)/2 = 21 prediction effectiveness values.
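These counts simply enumerate the (training, testing) pairs in each evaluation setting. The following Python sketch is purely illustrative (it is not from the study's tooling; only the version lists come from the text above) and reproduces the counts 20, 6, and 21:

```python
from itertools import permutations, combinations

# Across-project prediction: every ordered pair of distinct systems
# (train on one system, test on another) is one experiment.
projects = ["Bash 3.0", "Gcc-core 3.4.0", "Gimp 2.0.0", "Subversion 1.2.0", "Vim 6.2"]
across_project = list(permutations(projects, 2))                       # 5 * (5 - 1) = 20

# Across-version prediction on the seven Bash releases, in release order.
bash = ["3.0", "3.1", "3.2", "4.0", "4.1", "4.2", "4.3"]

# Next-version prediction: train on release i, test on release i + 1.
next_version = [(bash[i], bash[i + 1]) for i in range(len(bash) - 1)]  # 7 - 1 = 6

# Follow-up-version prediction: train on an earlier release, test on
# each later release (all pairs taken in release order).
follow_up = list(combinations(bash, 2))                                # 7 * (7 - 1) / 2 = 21

print(len(across_project), len(next_version), len(follow_up))          # 20 6 21
```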
4.2 Data Collection

We collected the data from the above-mentioned five projects. Each data point of a data set corresponds to one C function and consists of: 1) 16 product metrics (1 size metric + 11 structural complexity metrics + 4 Halstead's software science metrics); 2) three process metrics (i.e., code churn metrics); 3) eight slice-based cohesion metrics; and 4) the faulty/not-faulty label of the function after release. We obtained the data by the following steps:

Step 1: Collect the baseline product metrics for each function using the tool "Understand". For each system, we first generated an Understand database using the program-understanding tool "Understand".^2 This database stored information about entities (such as functions and variables) and references (such as function calls and variable references). Then, we collected the 16 most commonly used product metrics for each function of a system.

Step 2: Collect the baseline process metrics for each function. For each project, we used the tool "Understand" to generate two Understand databases: one for the investigated version and another for the previously released version. After that, we collected the three code churn metrics by using the commonly used diff algorithm [56] to compare the functions appearing in these two databases. In this study, blank lines and comments in those functions are not counted when computing the code churn metrics. The last two columns in Table 6 show, for each project, the previously released version used for computing the code churn metrics.
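As a concrete illustration of the churn computation in Step 2, the Python sketch below is a minimal, hypothetical version of such a script rather than the authors' actual tooling. It assumes the source text of each common function has already been exported from the two Understand databases into dictionaries keyed by function name, strips blank lines and comments as described above, and counts added and deleted lines with a line-based diff; the metric names it returns are placeholders for the paper's three code churn metrics.

```python
import difflib
import re

def strip_blank_and_comments(source: str) -> list[str]:
    """Drop blank lines and C comments before diffing (Step 2 ignores both)."""
    # Naive comment removal; real C code (e.g., "/*" inside string literals)
    # would need a proper lexer.
    no_block = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    no_line = re.sub(r"//[^\n]*", "", no_block)
    return [line for line in no_line.splitlines() if line.strip()]

def churn_metrics(old_body: str, new_body: str) -> dict:
    """Count added/deleted lines for one function between two releases."""
    old_lines = strip_blank_and_comments(old_body)
    new_lines = strip_blank_and_comments(new_body)
    added = deleted = 0
    for op, i1, i2, j1, j2 in difflib.SequenceMatcher(None, old_lines, new_lines).get_opcodes():
        if op in ("replace", "insert"):
            added += j2 - j1
        if op in ("replace", "delete"):
            deleted += i2 - i1
    return {"added": added, "deleted": deleted, "total_churn": added + deleted}

def churn_for_common_functions(old_funcs: dict, new_funcs: dict) -> dict:
    """old_funcs/new_funcs map function name -> source text exported from the two databases."""
    return {name: churn_metrics(old_funcs[name], new_funcs[name])
            for name in old_funcs.keys() & new_funcs.keys()}
```

A real implementation would also have to deal with renamed or moved functions, which is why the sketch restricts itself to the functions common to both databases, mirroring the "common functions" columns of Table 6.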
Step 3: Determine the faulty or not-faulty label for each function after release. On the one hand, the project websites for Bash^3 and Vim^4 publish a number of patch files for fixing bugs reported after release. Each patch file not only describes the reported problem but also gives the patch that fixes it. By analyzing these patches, we were able to determine which functions needed to be changed to fix each problem. If a function had code changes in these patches, it is marked as a faulty function; otherwise, it is marked as not-faulty. On the other hand, Gcc-3.4.6,^5 Gimp 2.0.6,^6 and Subversion 1.2.3^7 are the latest bug-fixing releases for Gcc-core 3.4.0, Gimp 2.0.0, and Subversion 1.2.0, respectively. These bug-fixing releases did not add any new features to the corresponding systems, thus enabling us to determine which functions had code changes for fixing bugs. If a function had code changes in the latest bug-fixing release, it is marked as a faulty function; otherwise, it is a not-faulty function. This is one of the most commonly used ways to determine faulty functions [31].
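Operationally, the patch-based labelling in Step 3 amounts to intersecting the line ranges changed by a bug-fixing patch with the line range spanned by each function. The sketch below is an assumed illustration, not the authors' scripts: it takes hunks already parsed from unified diffs (file path plus the changed line range in the pre-patch file, as given by the "@@ -start,count ..." hunk headers) together with a table of function extents, and labels as faulty every function whose extent overlaps a changed range.

```python
from dataclasses import dataclass

@dataclass
class Hunk:
    path: str     # file touched by the bug-fixing patch
    start: int    # first changed line in the pre-patch file
    end: int      # last changed line in the pre-patch file

@dataclass
class FunctionExtent:
    name: str
    path: str
    start: int    # first line of the function definition
    end: int      # last line of the function definition

def label_faulty(functions: list[FunctionExtent], hunks: list[Hunk]) -> dict[str, bool]:
    """Mark a function as faulty iff some bug-fix hunk overlaps its line range."""
    labels = {f.name: False for f in functions}
    for f in functions:
        for h in hunks:
            # Two ranges overlap unless one ends before the other starts.
            if h.path == f.path and not (h.end < f.start or f.end < h.start):
                labels[f.name] = True
                break
    return labels
```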
Step 4: Compute slice-based cohesion metrics for each function using the tool "Frama-C". We use intra-

Footnotes:
2. http://www.scitools.com
3. http://ftp.gnu.org/gnu/bash
4. http://ftp.vim.org/pub/vim/patches
5. http://gcc.gnu.org/releases.html
6. http://www.gimpusers.com/forums/gimp-user/1786-announce-gimp-2-0-6
7. http://subversion.apache.org/docs/release-notes/1.2.html