selected methods (with the others being analyzed context-insensitively), compared to Zipper, Zipper𝑒 substantially improves efficiency with similar precision.

Scaler [Li et al. 2018b] leverages the object allocation graph in [Tan et al. 2016] to estimate the context-sensitive points-to sizes that would be needed for each method. Then it selects an appropriate context-sensitivity variant (2obj, 2type, 1type or CI) for each method, while keeping the overall points-to size bounded (to a quantity that represents the memory capacity available for running the analysis), resulting in good scalability.

For a given program P, Baton applies the above three approaches to analyze P to obtain their context-sensitivity variants selected for each method of P. Based on these results, Baton produces new context-sensitivity configurations according to the Unity and Relay principles. In particular, in Relay, we run Collection, Zipper𝑒 and Scaler in this order. We switch between Relay-o1 and Relay-o2, choosing the more precise option (Relay-o1) if it scales, per the discussion of Section 3. As all context-sensitivity variants selected by Collection, Zipper𝑒 and Scaler are comparable in precision, by Theorems 4.3 and 4.6, we can expect Baton to yield more precise results than all of them in practice. This will be further validated in Section 6.
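To make the per-method combination concrete, the following is a minimal Java sketch of one way the Unity and Relay principles could be expressed; it is not Baton's actual implementation. The Variant enum (ordered CI < 1type < 2type < 2obj, the variants Scaler chooses among), the per-method configuration maps, and the scales(...) oracle are simplifying assumptions introduced only for illustration; in particular, Relay is reduced here to "try the more precise candidate first and fall back if it does not scale", a simplified reading of the Relay-o1/Relay-o2 switching described above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; NOT Baton's implementation.
public class UnityRelaySketch {

    // Context-sensitivity variants in increasing precision order
    // (CI < 1type < 2type < 2obj), i.e., the variants Scaler chooses among.
    enum Variant { CI, ONE_TYPE, TWO_TYPE, TWO_OBJ }

    // Unity (as we read it): for each method, keep the most precise variant
    // selected by any input approach (Collection, Zipper-e, Scaler).
    static Map<String, Variant> unity(List<Map<String, Variant>> inputConfigs) {
        Map<String, Variant> unified = new HashMap<>();
        for (Map<String, Variant> config : inputConfigs) {
            config.forEach((method, variant) ->
                    unified.merge(method, variant,
                            (oldV, newV) -> oldV.ordinal() >= newV.ordinal() ? oldV : newV));
        }
        return unified;
    }

    // Relay, heavily simplified: try candidate configurations from the most to
    // the least precise and keep the first one that scales.
    static Map<String, Variant> relay(List<Map<String, Variant>> configsByPrecision) {
        for (Map<String, Variant> config : configsByPrecision) {
            if (scales(config)) {
                return config;
            }
        }
        // Fall back to the last (most scalable) candidate.
        return configsByPrecision.get(configsByPrecision.size() - 1);
    }

    // Placeholder oracle: in the paper, "scales" means the analysis finishes
    // within the time budget; here it merely stands in for running Doop.
    static boolean scales(Map<String, Variant> config) {
        return true; // assumption for illustration only
    }
}
```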
6 EVALUATION

This section examines how Baton performs when addressing the challenging research problem raised in Section 1: "Given reasonably long time, can we achieve precise pointer analysis results for hard-to-analyze programs, for which traditional context-sensitive analyses fail to scale, and selective context-sensitive approaches scale but with limited precision?". We investigate the following research questions:

RQ1. Given that Baton (Unity) picks the "most precise configuration" based on the input selective approaches, how does this design scale for hard-to-analyze programs in practice? How does it fare against state-of-the-art analyses in terms of the precision gain that we aim for?

RQ2. In the cases for which Baton (Unity) fails to scale, how does Baton (Relay), as the second punch of Unity-Relay, perform in terms of scalability and precision?

Experimental Settings. We conduct all experiments on a machine with an Intel Xeon 2.2GHz CPU and 128GB of memory. All pointer analyses are performed on Doop [Bravenboer and Smaragdakis 2009], the state-of-the-art pointer analysis framework for Java (with the version published as the artifact of [Smaragdakis et al. 2014]). All pointer analyses adopt the same reflection handling configuration for the same program. Specifically, for each program, we first run the dynamic reflection analysis tool TamiFlex [Bodden et al. 2011] and its results are used if TamiFlex analyzes the program successfully; otherwise (if TamiFlex throws exceptions), we use Doop's default reflection analysis setting. The time budget is set to 2 hours for each analysis. In the evaluation, all benchmarks are analyzed with a large Java library: OpenJDK 1.6.0_24, which is widely used in recent work [Jeon et al. 2019, 2018; Li et al. 2018a, 2020; Minseok Jeon and Oh 2020].
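The reflection-handling policy above can be summarized by the short sketch below; it is not the actual harness used in the paper. The runTamiFlex and runDoop helpers are hypothetical placeholders for invoking the real tools, and only the fallback logic and the 2-hour budget come from the text.

```java
import java.nio.file.Path;

// Illustrative sketch of the reflection-handling policy; helper methods are hypothetical.
public class ReflectionSettingSketch {

    // For each program: prefer TamiFlex's dynamic reflection results; if TamiFlex
    // fails (throws an exception), fall back to Doop's default reflection analysis.
    // Every pointer analysis of the same program reuses the same choice.
    static void analyzeWithUniformReflectionHandling(Path program) {
        Path tamiflexLog;
        try {
            tamiflexLog = runTamiFlex(program);   // dynamic reflection analysis [Bodden et al. 2011]
        } catch (Exception e) {
            tamiflexLog = null;                   // signal: use Doop's default reflection setting
        }
        runDoop(program, tamiflexLog, /* timeBudgetHours = */ 2);
    }

    // Hypothetical wrapper: would run TamiFlex on the program and return its reflection log.
    static Path runTamiFlex(Path program) {
        throw new UnsupportedOperationException("placeholder for invoking TamiFlex");
    }

    // Hypothetical wrapper: would run a Doop pointer analysis with the given reflection
    // configuration (TamiFlex log if non-null, Doop's default otherwise) and time budget.
    static void runDoop(Path program, Path tamiflexLog, int timeBudgetHours) {
        // placeholder for invoking Doop
    }
}
```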
Hard-to-analyze Programs. We consider 13 large and complex Java programs as our benchmarks, including all the hard-to-analyze programs in the standard DaCapo benchmarks [Blackburn et al. 2006] and recent literature for Java pointer analysis [Jeon et al. 2019, 2018; Jeong et al. 2017; Kastrinis and Smaragdakis 2013; Li et al. 2018b, 2020; Minseok Jeon and Oh 2020; Smaragdakis et al. 2014], for which traditional 2obj fails to scale within the time budget (2 hours). To our knowledge, this is the largest set of hard-to-analyze programs evaluated in the related literature.

Precision Metrics. To thoroughly measure precision, we consider the most complete set of precision metrics that were used in recent literature [Jeon et al. 2019, 2018; Jeong et al. 2017; Kastrinis and