Scalability-First Pointer Analysis with Self-Tuning Context-Sensitivity Yue Li Tian Tan Anders Moller Yannis Smaragdakis Aarhus University Aarhus University Aarhus University University of Athens yueli@cs.au.dk tiantan@cs.au.dk amoeller@cs.au.dk smaragd@di.uoa.gr ABSTRACT timeout(>10800) ▣2obj▣2type▣Cl Context-sensitivity is important in pointer analysis to ensure high precision,but existing techniques suffer from unpredictable scala- bility.Many variants of context-sensitivity exist,and it is difficult to choose one that leads to reasonable analysis time and obtains 5000 high precision,without running the analysis multiple times. 4000 We present the SCALER framework that addresses this problem SCALER efficiently estimates the amount of points-to information 3000 that would be needed to analyze each method with different variants of context-sensitivity.It then selects an appropriate variant for 2000 each method so that the total amount of points-to information is bounded,while utilizing the available space to maximize precision. Our experimental results demonstrate that SCALER achieves pre- 份导时8济付 dictable scalability for all the evaluated programs(e.g.,speedups can reach 10x for 2-object-sensitivity),while providing a precision dbugs pmd soot that matches or even exceeds that of the best alternative techniques. Figure 1:Comparison of running time(seconds)of 2-object sensitivity,2-type sensitivity,and context-insensitive analy- CCS CONCEPTS ses.The y-axis is truncated to 7000 seconds for readability Theory of computation-Program analysis; and all truncated cases reach the time budget,10800 seconds. KEYWORDS challenging to produce precise analysis results without sacrificing scalability [12.30.35].Thus,for decades,researchers have contin- static analysis,points-to analysis,Java ued to develop sophisticated pointer analysis techniques [2,14- ACM Reference Format: 16,18,22,24,25,32,33,37,38]. Yue Li,Tian Tan,Anders Moller,and Yannis Smaragdakis.2018.Scalability- One of the key mechanisms for achieving high analysis precision First Pointer Analysis with Self-Tuning Context-Sensitivity.In Proceedings is context sensitivity,which allows each program method to be of the 26th ACM Joint European Software Engineering Conference and Sympo- analyzed differently according to the context it is used in [17]. sium on the Foundations of Software Engineering (ESEC/FSE'18),November Context sensitivity has different variants,depending on the kind of 4-9.2018,Lake Buena Vista,FL,USA.ACM,New York,NY,USA,12 pages. context information used.For Java programs,object-sensitivity [25] https:/doi.org/10.1145/3236024.3236041 and type-sensitivity [32]have proven to be quite effective.The 1 INTRODUCTION former is strictly more precise but less efficient than the latter [15, 37].However,with any available variant,although the analysis can Pointer analysis is a family of static analysis techniques that provide gain in precision,scalability is known to be unpredictable [33,38]. a foundation for many other analyses and software engineering in the sense that programs very similar in size and other complexity tasks,such as program slicing [36,39].reflection analysis [19,31], metrics may have completely different scalability profiles. bug detection[13,26],security analysis [1,23],program verifica- Figure 1 shows time spent analyzing 10 real-world Java pro- tion [8,27],and program debugging and comprehension [5,21]. grams'under 2-object-sensitivity (2obj)[25],which is among The goal of pointer analysis is to statically compute a set of objects the most precise variants of context sensitivity,2-type-sensitivity (abstracted as their allocation sites)that a program variable may (2type)[32],and context-insensitivity(CI).We observe that point to during run time.Although stating this goal is simple,it is 2obj is not scalable for 6 out of 10 programs within 3 hours Permission to make digital or hard copies of all or part of this work for personal or while it can finish running for 3 programs within 5 minutes; classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation program size is far from a reliable predictor-for example,jython on the first page.Copyrights for components of this work owned by others than the (12 718 methods)is smaller than briss(26 582 methods),how author(s)must be honored.Abstracting with credit is permitted To copy otherwise.or republish,to post on servers or to redistribute to lists,requires prior specific permission ever,2type is not scalable for the former but scalable for the and/or a fee.Request permissions from permissions@acm.org. latter; ESEC/FSE '18.November 4-9,2018,Lake Buena Vista,FL.USA 2018 Copyright held by the owner/author(s).Publication rights licensed to ACM ACM1SBN978-1-4503-5573-5/18/11..$15.00 IThese are all popular open-source applications,including the heaviest(jython and ttps:/doi.org/10.1145/3236024.3236041 eclipse)of the DaCapo benchmarks [3]. 129Scalability-First Pointer Analysis with Self-Tuning Context-Sensitivity Yue Li Aarhus University yueli@cs.au.dk Tian Tan Aarhus University tiantan@cs.au.dk Anders Mùller Aarhus University amoeller@cs.au.dk Yannis Smaragdakis University of Athens smaragd@di.uoa.gr ABSTRACT Context-sensitivity is important in pointer analysis to ensure high precision, but existing techniques suffer from unpredictable scalability. Many variants of context-sensitivity exist, and it is difficult to choose one that leads to reasonable analysis time and obtains high precision, without running the analysis multiple times. We present the Scaler framework that addresses this problem. Scaler efficiently estimates the amount of points-to information that would be needed to analyze each method with different variants of context-sensitivity. It then selects an appropriate variant for each method so that the total amount of points-to information is bounded, while utilizing the available space to maximize precision. Our experimental results demonstrate that Scaler achieves predictable scalability for all the evaluated programs (e.g., speedups can reach 10x for 2-object-sensitivity), while providing a precision that matches or even exceeds that of the best alternative techniques. CCS CONCEPTS · Theory of computation → Program analysis; KEYWORDS static analysis, points-to analysis, Java ACM Reference Format: Yue Li, Tian Tan, Anders Mùller, and Yannis Smaragdakis. 2018. ScalabilityFirst Pointer Analysis with Self-Tuning Context-Sensitivity. In Proceedings of the 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’18), November 4ś9, 2018, Lake Buena Vista, FL, USA. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3236024.3236041 1 INTRODUCTION Pointer analysis is a family of static analysis techniques that provide a foundation for many other analyses and software engineering tasks, such as program slicing [36, 39], reflection analysis [19, 31], bug detection [13, 26], security analysis [1, 23], program verification [8, 27], and program debugging and comprehension [5, 21]. The goal of pointer analysis is to statically compute a set of objects (abstracted as their allocation sites) that a program variable may point to during run time. Although stating this goal is simple, it is Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. ESEC/FSE ’18, November 4ś9, 2018, Lake Buena Vista, FL, USA © 2018 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-5573-5/18/11. . . $15.00 https://doi.org/10.1145/3236024.3236041 285 2458 53 93 5374 95 960 289 2950 45 54 1203 228 48 135 49 117 112 22 22 67 994 0 1000 2000 3000 4000 5000 6000 2obj 2type CI timeout (>10800) Figure 1: Comparison of running time (seconds) of 2-object sensitivity, 2-type sensitivity, and context-insensitive analyses. The y-axis is truncated to 7000 seconds for readability, and all truncated cases reach the time budget, 10800 seconds. challenging to produce precise analysis results without sacrificing scalability [12, 30, 35]. Thus, for decades, researchers have continued to develop sophisticated pointer analysis techniques [2, 14ś 16, 18, 22, 24, 25, 32, 33, 37, 38]. One of the key mechanisms for achieving high analysis precision is context sensitivity, which allows each program method to be analyzed differently according to the context it is used in [17]. Context sensitivity has different variants, depending on the kind of context information used. For Java programs, object-sensitivity [25] and type-sensitivity [32] have proven to be quite effective. The former is strictly more precise but less efficient than the latter [15, 37]. However, with any available variant, although the analysis can gain in precision, scalability is known to be unpredictable [33, 38], in the sense that programs very similar in size and other complexity metrics may have completely different scalability profiles. Figure 1 shows time spent analyzing 10 real-world Java programs1 under 2-object-sensitivity (2obj) [25], which is among the most precise variants of context sensitivity, 2-type-sensitivity (2type) [32], and context-insensitivity (CI). We observe that • 2obj is not scalable for 6 out of 10 programs within 3 hours, while it can finish running for 3 programs within 5 minutes; • program size is far from a reliable predictorÐfor example, jython (12 718 methods) is smaller than briss (26 582 methods), however, 2type is not scalable for the former but scalable for the latter; 1These are all popular open-source applications, including the heaviest (jython and eclipse) of the DaCapo benchmarks [3]. 129