Parallel Type-Consistency Checks A synchronization- ferent pair-wise type-consistency tests are performed in par free parallelization scheme is used.This is achieved by re- allel.as discussed in Section 5.with 8 threads on 4 cores. quiring different threads to merge objects of different types Table 2 presents the main results,which will be ana- (with every thread executing lines 6-13 of Algorithm 1).To lyzed when our research questions are discussed below.For avoid synchronizations,object merging takes place only at a program,we consider the abstract objects reachable from line 13 of Algorithm 1,and in addition,all shared automata main()in both the application and library code. are constructed beforehand and concurrently read only. 6.1 RO1:MAHJONG's Effectiveness as a Pre-Analysis 6. Evaluation 6.1.1 Efficiency We show that MAHJONG is effective in significantly scal- The overall pre-analysis phase is fast,as shown in Column 2 ing context-sensitive points-to analyses for large Java pro- of Table 2.For a program,its analysis time is broken down grams while achieving nearly the same precision for type- into three components,taken by ci (the context-insensitive dependent clients.We address two major research questions: points-to analysis),FPG (a module for building its FPG), RQ1.Is MAHJONG effective as a pre-analysis? MAHJONG (for creating a new heap abstraction).For all the 12 programs,the average analysis time for ci is 62.3 seconds. (a)Is MAHJONG lightweight for large programs? The runtime overheads for the other two are negligible. (b)Can MAHJONG avoid the allocation-site abstraction's The efficiency of MAHJONG cannot be over-emphasized, heap over-partitioning for type-dependent clients? as it could not otherwise be used as an enabling technol- RO2.Is MAHJONG-based points-to analysis effective? ogy for a subsequent points-to analysis.On average,a FPG consists of 10073 objects of 1559 types with 2411 fields. (a)Can MAHJONG accelerate different types of main- MAHJONG builds an NFA for each object in the FPG,with stream context-sensitive points-to analyses? its size measured in terms of its number of states.The av- (b)Can MAHJONG achieve comparable precision as the erage sizes of NFAs range from 356 in luindex to 3789 allocation-site abstraction for type-dependent clients? in eclipse,with an average of 992.For each program,the smallest NFA always has one state only.Across all the pro- Type-Dependent Clients We consider three representative grams,the sizes of their largest NFAs range from 1935 in type-dependent clients,call graph construction,devirtualiza- luindex to 10034 in eclipse.This costs MAHJONG only tion and may-fail casting,provided by DooP [14]. an average of 3.8 seconds for each program.Such good per- Context-Sensitive Points-to Analyses We consider five formance is due to both our design (by merging objects in context-sensitive points-to analyses also from Doop as terms of merging equivalent automata)and several effective baselines.These cover the three main types of mainstream optimizations performed (see Section 5). context-sensitivity:call-site-sensitivity [15,22,36,42,51]. object-sensitivity [29,40,48]and type-sensitivity [39]. 6.1.2 Heap Partitioning We also provide experimental evidence on why context- Figure 8 shows that MAHJONG can alleviate the heap over- insensitivity is inadequate for type-dependent clients. partitioning problem suffered by the allocation-site abstrac- tion effectively for type-dependent clients.The allocation- Benchmarks We consider 12 large Java programs in- site abstraction creates an average of 10073 objects per pro- cluding 3 popular applications findbugs,checkstyle gram,ranging from 6190 in luindex to 19529 in eclipse. and JPC and all standard DaCapo benchmarks [12]except In contrast,MAHJONG creates an average of 3826 objects iython and hsgldb as they are not scalable under 3 out per program,ranging from 2108 in luindex to 9414 in of the 5 baseline analyses with and without MAHJONG. eclipse.This represents an average reduction of 62%. These programs are all analyzed with a large Java library Let us examine checkstyle in detail.As shown in Fig- JDK1.6.0.45. ure 8,a total of 10888 objects are created by the allocation- As a static reflection analysis may affect the efficiency site abstraction but only 4028 objects by MAHJONG. and precision of points-to analysis [24,25,38],we adopt the Given the heap partitioned as H/=for checkstyle, same resolution results generated by a dynamic reflection Figure 9 relates the number of equivalence classes with analysis tool,TAMIFLEX [8],in both the five baselines and a particular equivalence class size.In the left-most point their corresponding MAHJONG-based points-to analyses. marked by (1,3769),for example,there are 3769 equiva- Computing Platform We have done our experiments on a lence classes containing one object each.Thus,neither ob- Xeon E5-1620 3.7GHz machine with 128GB of RAM.The ject is merged with any other objects. analysis time of a program is the average of 3 runs. Let us examine some equivalence classes,given in Ta- ble 1,with their ranks(measured in decreasing order of their Pre-Analysis For this,we use the fast context-insensitive sizes)shown as well.For StringBuilder(Row 1),all their points-to analysis,denoted ci,provided by Doop [14].Dif- objects are type-consistent(reaching only char[]objects 286Parallel Type-Consistency Checks A synchronizationfree parallelization scheme is used. This is achieved by requiring different threads to merge objects of different types (with every thread executing lines 6 – 13 of Algorithm 1). To avoid synchronizations, object merging takes place only at line 13 of Algorithm 1, and in addition, all shared automata are constructed beforehand and concurrently read only. 6. Evaluation We show that MAHJONG is effective in significantly scaling context-sensitive points-to analyses for large Java programs while achieving nearly the same precision for typedependent clients. We address two major research questions: RQ1. Is MAHJONG effective as a pre-analysis? (a) Is MAHJONG lightweight for large programs? (b) Can MAHJONG avoid the allocation-site abstraction’s heap over-partitioning for type-dependent clients? RQ2. Is MAHJONG-based points-to analysis effective? (a) Can MAHJONG accelerate different types of mainstream context-sensitive points-to analyses? (b) Can MAHJONG achieve comparable precision as the allocation-site abstraction for type-dependent clients? Type-Dependent Clients We consider three representative type-dependent clients, call graph construction, devirtualization and may-fail casting, provided by DOOP [14]. Context-Sensitive Points-to Analyses We consider five context-sensitive points-to analyses also from DOOP as baselines. These cover the three main types of mainstream context-sensitivity: call-site-sensitivity [15, 22, 36, 42, 51], object-sensitivity [29, 40, 48] and type-sensitivity [39]. We also provide experimental evidence on why contextinsensitivity is inadequate for type-dependent clients. Benchmarks We consider 12 large Java programs including 3 popular applications findbugs, checkstyle and JPC and all standard DaCapo benchmarks [12] except jython and hsqldb as they are not scalable under 3 out of the 5 baseline analyses with and without MAHJONG. These programs are all analyzed with a large Java library JDK1.6.0 45. As a static reflection analysis may affect the efficiency and precision of points-to analysis [24, 25, 38], we adopt the same resolution results generated by a dynamic reflection analysis tool, TAMIFLEX [8], in both the five baselines and their corresponding MAHJONG-based points-to analyses. Computing Platform We have done our experiments on a Xeon E5-1620 3.7GHz machine with 128GB of RAM. The analysis time of a program is the average of 3 runs. Pre-Analysis For this, we use the fast context-insensitive points-to analysis, denoted ci, provided by DOOP [14]. Different pair-wise type-consistency tests are performed in parallel, as discussed in Section 5, with 8 threads on 4 cores. Table 2 presents the main results, which will be analyzed when our research questions are discussed below. For a program, we consider the abstract objects reachable from main() in both the application and library code. 6.1 RQ1: MAHJONG’s Effectiveness as a Pre-Analysis 6.1.1 Efficiency The overall pre-analysis phase is fast, as shown in Column 2 of Table 2. For a program, its analysis time is broken down into three components, taken by ci (the context-insensitive points-to analysis), FPG (a module for building its FPG), MAHJONG (for creating a new heap abstraction). For all the 12 programs, the average analysis time for ci is 62.3 seconds. The runtime overheads for the other two are negligible. The efficiency of MAHJONG cannot be over-emphasized, as it could not otherwise be used as an enabling technology for a subsequent points-to analysis. On average, a FPG consists of 10073 objects of 1559 types with 2411 fields. MAHJONG builds an NFA for each object in the FPG, with its size measured in terms of its number of states. The average sizes of NFAs range from 356 in luindex to 3789 in eclipse, with an average of 992. For each program, the smallest NFA always has one state only. Across all the programs, the sizes of their largest NFAs range from 1935 in luindex to 10034 in eclipse. This costs MAHJONG only an average of 3.8 seconds for each program. Such good performance is due to both our design (by merging objects in terms of merging equivalent automata) and several effective optimizations performed (see Section 5). 6.1.2 Heap Partitioning Figure 8 shows that MAHJONG can alleviate the heap overpartitioning problem suffered by the allocation-site abstraction effectively for type-dependent clients. The allocationsite abstraction creates an average of 10073 objects per program, ranging from 6190 in luindex to 19529 in eclipse. In contrast, MAHJONG creates an average of 3826 objects per program, ranging from 2108 in luindex to 9414 in eclipse. This represents an average reduction of 62%. Let us examine checkstyle in detail. As shown in Figure 8, a total of 10888 objects are created by the allocationsite abstraction but only 4028 objects by MAHJONG. Given the heap partitioned as H / ≡ for checkstyle, Figure 9 relates the number of equivalence classes with a particular equivalence class size. In the left-most point marked by (1, 3769), for example, there are 3769 equivalence classes containing one object each. Thus, neither object is merged with any other objects. Let us examine some equivalence classes, given in Table 1, with their ranks (measured in decreasing order of their sizes) shown as well. For StringBuilder (Row 1), all their objects are type-consistent (reaching only char[] objects 286