4.5.1 Recall

We use TamiFlex [22] to find the targets accessed at reflective calls in our programs under the inputs described in Section 4.2. Solar is the only one to achieve total recall for all reflective targets accessed.
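The recall comparison described above can be pictured as a set comparison between the reflective targets observed at run time and those inferred statically. The following is a minimal sketch under stated assumptions, not the tooling used in the evaluation: the class name RecallSketch, the string encoding of "call site -> target" pairs and the sample data are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: compares dynamically observed reflective targets with statically inferred ones. */
public class RecallSketch {

    /** Fraction of the observed targets that the static analysis also reports. */
    public static double recall(Set<String> observed, Set<String> inferred) {
        if (observed.isEmpty()) {
            return 1.0; // nothing to recall; treated as total recall by convention in this sketch
        }
        Set<String> hit = new HashSet<>(observed);
        hit.retainAll(inferred); // keep only the observed targets that were also inferred
        return (double) hit.size() / observed.size();
    }

    /** Observed targets that the static analysis fails to report. */
    public static Set<String> missed(Set<String> observed, Set<String> inferred) {
        Set<String> miss = new HashSet<>(observed);
        miss.removeAll(inferred);
        return miss;
    }

    public static void main(String[] args) {
        // Made-up data for illustration only; these are not measurements from the paper.
        Set<String> observed = Set.of(
                "Main.load:12 -> Foo.<init>()",
                "Main.load:20 -> Bar.run()",
                "Plugin.start:7 -> Baz.<init>()",
                "Plugin.start:9 -> Qux.go()");
        Set<String> inferred = Set.of(
                "Main.load:12 -> Foo.<init>()",
                "Main.load:20 -> Bar.run()",
                "Plugin.start:7 -> Baz.<init>()",
                "Plugin.start:9 -> Other.go()"); // an inferred target that was never observed

        System.out.println(recall(observed, inferred)); // 3 of the 4 observed targets are inferred
        System.out.println(missed(observed, inferred));
    }
}
```

In this reading, total recall corresponds to recall(...) returning 1.0, i.e., missed(...) being empty for every reflective call site exercised by the inputs.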
Fig. 13. More true caller-callee relations found in recall by Solar than Elf (Solar − Elf) and by Elf than Doop (Elf − Doop). [Bar chart over chart, eclipse, fop, hsqldb, pmd, xalan, avrora, checkstyle, findbugs and freecs; horizontal axis from 0 to 20000.]

Here, we demonstrate one significant benefit of achieving higher recall in practice. Fig. 13 compares Doop, Elf and Solar in terms of the true caller-callee relations that are statically calculated and also recorded by an instrumentation tool written with Javassist [26]. Solar recalls a total of 371% (148%) more targets than Doop (Elf) at the calls to newInstance() and invoke(), translating into 49700 (40570) more true caller-callee relations found for the 10 programs. These numbers are expected to improve when more inputs are used. Note that all targets recalled by Doop are recalled by Elf and all targets recalled by Elf are recalled by Solar. These results demonstrate the effectiveness of our LHM and collective inference.

Table 1. Precision comparison. There are two clients: DevirCall denotes the percentage of the virtual calls whose targets can be disambiguated and SafeCast denotes the percentage of the casts that can be statically shown to be safe.

Client         Analysis  chart  eclipse  fop    hsqldb  pmd    xalan  avrora  checkstyle  findbugs  freecs  Average
DevirCall (%)  Doop      –      94.94    93.04  –       92.65  93.49  94.79   93.16       92.32     95.46   93.72
               Elf       93.53  88.07    92.34  94.80   92.87  92.70  94.50   93.19       92.53     94.94   92.93
               Solar     93.51  87.69    92.26  94.51   92.39  92.65  92.43   93.39       92.37     95.26   92.63
SafeCast (%)   Doop      –      59.34    53.68  –       45.40  57.97  56.12   50.19       45.78     59.71   53.24
               Elf       49.80  40.71    55.40  53.65   48.24  59.24  57.27   51.79       48.54     59.14   52.07
               Solar     49.53  38.04    54.21  53.11   44.53  59.11  52.56   49.40       43.60     57.96   49.79

4.5.2 Precision

Table 1 compares the precision of Doop, Elf and Solar with two popular clients. Note that Doop does not scale to chart and hsqldb (within 3 hours) in our setting.
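To make the two client metrics of Table 1 concrete, the sketch below computes them from already-resolved analysis results. It rests on assumptions that the text does not spell out: a virtual call is counted as disambiguated when exactly one target is resolved for it, and a cast is counted as safe when every possible run-time type of its operand is a subtype of the cast's target type. The class ClientMetricsSketch, its Cast helper and the input encoding are hypothetical and are not taken from Doop, Elf or Solar.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the DevirCall and SafeCast percentages over resolved analysis facts. */
public class ClientMetricsSketch {

    /** A cast to {@code target} together with the possible run-time types of its operand. */
    public static final class Cast {
        final Class<?> target;
        final Set<Class<?>> operandTypes;

        public Cast(Class<?> target, Set<Class<?>> operandTypes) {
            this.target = target;
            this.operandTypes = operandTypes;
        }
    }

    /** Percentage of virtual call sites with exactly one resolved target (assumed reading of "disambiguated"). */
    public static double devirCall(Map<String, Set<String>> targetsByCallSite) {
        if (targetsByCallSite.isEmpty()) {
            return 0.0;
        }
        long single = targetsByCallSite.values().stream()
                .filter(targets -> targets.size() == 1)
                .count();
        return 100.0 * single / targetsByCallSite.size();
    }

    /** Percentage of casts whose operand types are all subtypes of the target type. */
    public static double safeCast(List<Cast> casts) {
        if (casts.isEmpty()) {
            return 0.0;
        }
        long safe = casts.stream()
                // A cast with no operand types is vacuously counted as safe in this sketch.
                .filter(c -> c.operandTypes.stream().allMatch(t -> c.target.isAssignableFrom(t)))
                .count();
        return 100.0 * safe / casts.size();
    }
}
```

Under this reading, an analysis that resolves more reflective targets can only enlarge the target sets and operand types fed into these computations, which is consistent with Solar's percentages in Table 1 being slightly lower than those of Elf in most columns.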
Despite achieving better recall (Fig. 13), Solar maintains nearly the same precision as Doop and Elf, which tend to be more under-approximate than Solar. This suggests that Solar’s soundness-guided design is effective in balancing soundness, precision and scalability.

4.6 RQ4: Efficiency

Table 2 compares the analysis times of Doop, Elf and Solar. Despite producing significantly better under-approximations than Doop and Elf, Solar is only several-fold slower. When analysing hsqldb, xalan and checkstyle, Solar requires some lightweight annotations. The analysis times reported for these three programs are those consumed by Solar on the annotated programs. Note that these annotated programs are also used by Doop and Elf (as discussed earlier).

5 Related Work

In addition to the prior work already discussed in Section 2.2, we highlight below a few open-source static reflection analysis tools available. bddbddb [2] represents a partial implementation of the reflection analysis introduced in [20]