141:14 Yue Li, Tian Tan, Anders Møller, and Yannis Smaragdakis

adopted in recent literature [Hassanshahi et al. 2017; Jeong et al. 2017; Kastrinis and Smaragdakis 2013; Scholz et al. 2016; Smaragdakis et al. 2013, 2014; Tan et al. 2017; Thiessen and Lhoták 2017] and analysis tools, including popular static analysis frameworks for Android [Arzt et al. 2014; Gordon et al. 2015]. Relative to other k-object-sensitive analyses, 2obj is significantly more precise than 1obj [Kastrinis and Smaragdakis 2013; Smaragdakis et al. 2011], and 3obj does not scale for most DaCapo benchmarks [Tan et al. 2017].

In RQ2, we compare Zipper with the introspective analysis of Smaragdakis et al. [2014], which is the most closely related state-of-the-art analysis that employs context sensitivity only for a subset of the methods. These methods are selected by a pre-analysis according to two heuristics (the pre-analysis is also based on a fast context-insensitive pointer analysis, like Zipper), resulting in two variants of introspective analyses, IntroA and IntroB. (The naming and heuristics are from Smaragdakis et al. [2014]. The Doop integration of Zipper uses the version published for the artifact evaluation process of PLDI'14, which contains the exact setup for these algorithms, for direct comparison.) Generally, IntroA is faster but less precise than IntroB.

In the DaCapo benchmarks, 2obj fails to scale for jython and hsqldb within 1.5 hours. Zipper also cannot help scale for these two known problematic benchmarks [Kastrinis and Smaragdakis 2013; Smaragdakis et al. 2011; Tan et al. 2016, 2017], as, unlike the introspective analysis of Smaragdakis et al. [2014], Zipper is designed to keep most of the analysis precision: its precision-guided principle prevents it from removing further contexts, since that could degrade precision. Regarding introspective analysis, IntroB also fails to scale for jython but scales for hsqldb; IntroA scales for both but only achieves precision slightly better than a context-insensitive analysis. Consequently, to provide an observable precision baseline (i.e., the most precise results achieved by 2obj), we consider the remaining five large DaCapo benchmarks for which 2obj is scalable. We will examine how Zipper performs on the smaller, trivially scalable benchmarks in Section 4.4.

4.1 RQ1: Precision and Efficiency of Zipper-Guided Pointer Analysis

In this section, we first examine the precision and efficiency of Zipper-guided pointer analysis by comparing it with 2obj as explained above, and then show the overhead of running Zipper itself. As a conventional context-sensitive pointer analysis, to produce high precision, 2obj applies context sensitivity to each method of the program indiscriminately. This is still the mainstream context-sensitivity scheme deployed in most pointer analysis frameworks for Java [Bravenboer and Smaragdakis 2009; Naik et al. 2006; WALA 2018] and Android [Arzt et al. 2014; Gordon et al. 2015].

Table 1 shows the results of all analyses. Each program has five rows of data, respectively representing context-insensitive pointer analysis (ci), conventional object-sensitive pointer analysis (2obj), Zipper (zipper-2obj), and two introspective pointer analyses (introA-2obj and introB-2obj). The last two analyses will be discussed in Section 4.2.
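To illustrate why applying context sensitivity to a method can matter for precision, consider the following small example (our own sketch, not taken from the paper; the `Box` class and call sites are hypothetical). Under a context-insensitive analysis, the two `Box` objects share a single abstract points-to set for the `item` field, so both `get()` results appear to point to both stored objects. An object-sensitive analysis, which analyzes `set` and `get` separately per receiver allocation site, keeps the two boxes apart.

```java
// Illustrative sketch: a container whose methods benefit from
// object sensitivity. Context-insensitively, the single abstract
// `item` field conflates the String and the Integer, so v1 and v2
// each appear to point to both; 1-object-sensitivity (contexts =
// receiver allocation site) recovers the precise result.
class Box {
    Object item;
    void set(Object o) { this.item = o; }
    Object get() { return this.item; }
}

public class Demo {
    public static void main(String[] args) {
        Box b1 = new Box();   // allocation site A1
        Box b2 = new Box();   // allocation site A2
        b1.set("a");
        b2.set(42);
        Object v1 = b1.get(); // precise analysis: v1 points only to "a"
        Object v2 = b2.get(); // precise analysis: v2 points only to 42
        System.out.println(v1 + " " + v2);
    }
}
```

Zipper's premise, as evaluated in this section, is that only some methods (like `Box.set`/`Box.get` here) actually need such contexts, while analyzing every method context-sensitively, as 2obj does, wastes effort on methods where precision does not improve.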
4.1.1 How Much Precision of a Conventional Analysis Is Preserved by Zipper. To measure precision, we consider four independently useful client analyses, (subsets of which) also used as the precision metrics in past literature [Jeong et al. 2017; Kastrinis and Smaragdakis 2013; Lhoták and Hendren 2006; Smaragdakis et al. 2014; Sridharan and Bodík 2006; Tan et al. 2017]: a cast-resolution analysis (metric: the number of cast operations that may fail, denoted #fail-cast), a devirtualization analysis (metric: the number of virtual call sites that cannot be disambiguated into monomorphic calls, denoted #poly-call), a method reachability analysis (metric: the number of reachable methods, denoted #reach-mtd), and a call-graph construction analysis (metric: the number of call-graph edges, denoted #call-edge). These metrics should give a thorough idea of analysis precision for useful clients. The results are shown in the last four columns of Table 1. In all cases, lower is better.

Proc. ACM Program. Lang., Vol. 2, No. OOPSLA, Article 141. Publication date: November 2018.
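The first two metrics can be made concrete with a small example (our own illustration, not from the paper; the class names are hypothetical). A precise pointer analysis sees that only `Dog` objects flow into the list below, so the cast cannot fail and the virtual call is monomorphic; an imprecise analysis that conflates list contents must conservatively count the cast in #fail-cast and, if other `Animal` subtypes could reach `s`, the call in #poly-call.

```java
// Illustrative sketch of the #fail-cast and #poly-call metrics.
import java.util.ArrayList;
import java.util.List;

abstract class Animal { abstract String speak(); }
class Dog extends Animal { String speak() { return "woof"; } }
class Cat extends Animal { String speak() { return "meow"; } }  // never stored below

public class Metrics {
    public static void main(String[] args) {
        List<Object> animals = new ArrayList<>();
        animals.add(new Dog());
        // #fail-cast: a precise analysis proves only Dog flows here,
        // so this cast is not reported as potentially failing.
        Animal s = (Animal) animals.get(0);
        // #poly-call: since `s` can only point to Dog objects, this
        // call site is monomorphic for a precise analysis, even though
        // Animal has multiple subclasses.
        System.out.println(s.speak());
    }
}
```

The remaining two metrics, #reach-mtd and #call-edge, simply count how much of the program a precise analysis can prove unreachable or unconnected: here, a precise call graph contains an edge to `Dog.speak` but none to `Cat.speak`.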