field points-to set over all fields) to define two heuristics, resulting in two introspective analyses, IntroA and IntroB, which are compared with Scaler in Section 6.2. Benefiting from the new insight of Scaler (TST-based self-tuning context sensitivity), Scaler outperforms IntroB in both precision and scalability, and achieves the same level of scalability as IntroA while being significantly more precise. Moreover, the six different metrics in introspective analysis need appropriate values to be set in advance to produce effective analysis results; Scaler's insights keep its methodology simple: users need to set only one value (the TST) to obtain better precision or better efficiency as desired, resulting in better usability in practice. Finally, introspective analysis is not one-shot: using it always incurs a cost in precision, even if the program could be analyzed more precisely. Therefore, its user will deploy it only after first attempting a precise analysis and failing.

Hassanshahi et al. [11] leverage metrics similar to those of introspective analysis to guide selective object-sensitive pointer analysis for large codebases. However, their pre-analysis involves several phases (that need different metrics and heuristics): a context-insensitive analysis is first performed to extract the program kernel for which a context-insensitive or fixed object-sensitive analysis is still not sufficiently precise; then a fixed (heavy) object-sensitive pointer analysis is applied to the extracted (smaller) kernel to determine an appropriate context depth for each selected object. After these pre-analyses, the selected object-sensitivity information is used to guide the main pointer analysis, which is demonstrated to work well for the OpenJDK library. However, unlike introspective analysis [33] and Scaler, the overhead of the pre-analysis is uncertain, as it depends heavily on the complexity of the extracted kernel, which in turn depends on various metric values selected by users. Thus, it is unclear whether the technique is generally effective for arbitrary Java programs in practice.

Both of the above approaches [11, 33] involve manually defined metrics and heuristics. An alternative is to use machine learning techniques, as in the two approaches we describe next.

Wei and Ryder [41] present an adaptive context-sensitive analysis for JavaScript. They first use a machine learning algorithm to learn the relationship between user-defined method characteristics (extracted from a pre-analysis) and the choice of context sensitivity (1-call-site, 1-object, or 1-parameter sensitivity), and express the results as a decision tree. Based on domain knowledge, the decision tree is then manually adjusted to produce heuristics that are easy to interpret while the classifications still maintain good accuracy. Finally, based on the heuristics, methods are analyzed with different context sensitivities, yielding better precision than any single context-sensitive analysis.
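As a rough illustration of what such a decision-tree heuristic might look like, the following Java sketch maps pre-analysis features of a method to one of the three context-sensitivity choices. The features, thresholds, and type names are hypothetical placeholders, not the ones actually learned by Wei and Ryder.

    // Illustrative only: a decision-tree-style heuristic mapping pre-analysis
    // features of a method to a context-sensitivity choice. All features and
    // thresholds are invented placeholders, not those learned in [41].
    enum ContextChoice { CALL_SITE_1, OBJECT_1, PARAMETER_1 }

    final class MethodFeatures {
        final int receiverObjectCount;   // abstract receiver objects seen in the pre-analysis (assumed feature)
        final boolean flowDependsOnArgs; // whether data flow depends strongly on parameters (assumed feature)

        MethodFeatures(int receiverObjectCount, boolean flowDependsOnArgs) {
            this.receiverObjectCount = receiverObjectCount;
            this.flowDependsOnArgs = flowDependsOnArgs;
        }
    }

    final class ContextHeuristic {
        // Each branch corresponds to one path in the (manually adjusted) decision tree.
        static ContextChoice choose(MethodFeatures f) {
            if (f.receiverObjectCount > 1) return ContextChoice.OBJECT_1;    // many receivers: object sensitivity pays off
            if (f.flowDependsOnArgs)       return ContextChoice.PARAMETER_1; // behavior driven by arguments
            return ContextChoice.CALL_SITE_1;                                // default: distinguish callers
        }
    }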
Jeong et al. [14] propose a data-driven approach to guiding context sensitivity for Java. Unlike Scaler, where various kinds of context sensitivity with different lengths are applied to different methods, a single kind of context sensitivity is applied to the whole program, and each method is assigned an appropriate context length, including zero (i.e., context insensitivity). As deep contexts end up being applied to only a subset of the methods, a more efficient context-sensitive analysis is achieved while retaining good precision. To select a context length for each method, 25 metrics (atomic features) are chosen, and a machine learning approach is used to learn heuristics based on these metrics. However, unlike Scaler's lightweight pre-analysis, the learning phase is heavy: it costs 54 hours in the experimental setting of Jeong et al.

Generally, a machine learning approach is sensitive to the training input programs, and its learned results are usually difficult to explain, for example, to discern why the learning algorithm selects a given context for a method. In contrast, Scaler is a principled, rigorously modeled approach derived from simple insights; thus its guided results are tractable and interpretable, leading to more stable and uniform effectiveness.

Unlike conventional context-sensitive pointer analysis, which uses consecutive context elements for each context, the Bean approach by Tan et al. [37] identifies and skips the redundant context elements that are useless for improving the precision of context-sensitive analysis. As a result, the saved space allows more precision-useful context elements to be used to distinguish more contexts, making the analysis more precise with a small efficiency overhead. Precision is the focus of Bean, while Scaler's focus is scalability. In addition, as explained in Section 4, rather than identifying redundant, precision-useless context elements, Scaler leverages the OAG from Tan et al. [37] to compute the context numbers in advance.

Mahjong [38], a recent heap abstraction for pointer analysis of Java, is also based on a cheap pre-analysis, like Scaler. It enables an allocation-site-based pointer analysis to run significantly faster while achieving nearly the same precision for type-dependent clients, such as call graph construction. In contrast, Scaler works for general pointer analysis, including alias analysis (i.e., not just type-dependent clients), which Mahjong cannot handle effectively. In addition, Scaler is able to scale for troublesome programs such as jython, where even the very fast Mahjong analysis fails.

8 CONCLUSIONS

Good scalability is hard to achieve for precise context-sensitive pointer analysis. To tackle this problem, we have introduced the Scaler framework, which automatically chooses a suitable context-sensitivity variant for each method in the given program, based on a fast, context-insensitive pre-analysis. The key insight is that it is possible to efficiently identify scalability-critical methods, and that scalability can be predicted using the ideas of scalability thresholds and total scalability thresholds. The focus of Scaler is scalability, but at the same time it aims to maximize precision relative to a given total scalability threshold, which can be selected based on the available memory.
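To make the threshold idea concrete, the following is a minimal Java sketch of how a single total scalability threshold (TST) could drive the per-method choice. The set of variants, the cost estimates, and the halving search over a per-method threshold are simplifying assumptions for illustration, not Scaler's exact selection algorithm.

    import java.util.*;

    // A minimal sketch of self-tuning under a total scalability threshold (TST).
    final class SelfTuningSketch {
        // Context-sensitivity variants, ordered from most to least precise.
        enum Variant { OBJ_2, TYPE_2, TYPE_1, INSENSITIVE }

        // costs.get(m)[v.ordinal()]: estimated size of the points-to information
        // for method m under variant v, assumed to be derived from the
        // context-insensitive pre-analysis.
        static Map<String, Variant> select(Map<String, long[]> costs, long tst) {
            for (long st = tst; st >= 1; st /= 2) { // coarse search for a per-method threshold st
                Map<String, Variant> choice = new HashMap<>();
                long total = 0;
                for (Map.Entry<String, long[]> e : costs.entrySet()) {
                    Variant picked = Variant.INSENSITIVE;
                    for (Variant v : Variant.values()) { // try the most precise variant first
                        if (e.getValue()[v.ordinal()] <= st) { picked = v; break; }
                    }
                    choice.put(e.getKey(), picked);
                    total += e.getValue()[picked.ordinal()];
                }
                if (total <= tst) return choice; // the whole program fits under the TST: done
            }
            // If nothing fits, fall back to context insensitivity everywhere.
            Map<String, Variant> fallback = new HashMap<>();
            costs.keySet().forEach(m -> fallback.put(m, Variant.INSENSITIVE));
            return fallback;
        }
    }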
The experimental evaluation of Scaler demonstrates that it is able to achieve extremely good scalability while producing highly precise points-to results, in one shot, regardless of the programs being analyzed. This may directly benefit many other program analyses and software engineering tools that require scalable and precise pointer analysis. Moreover, we expect that the ideas behind Scaler may help other kinds of static analyses become more scalable with good precision for real-world programs.

ACKNOWLEDGMENTS

This work was supported by the European Research Council (ERC) under the FP7 and Horizon 2020 research and innovation programs (grant agreements 307334 and 647544).