正在加载图片...
Y.Li,T.Tan,Y.Zhang,and J.Xue XX:7 TAILOR Program rogram Analysis Tools SCEXT) SC SC-Based Data-Flow Analysis (SCDFA) Figure 4 Overview of TAILOR(with all the components implemented in this paper in blue). is unsound for GICFG (as it is designed for program debugging and understanding only). TAILOR computes T(SC)in two stages,SC Extension and SC-based Data-Flow Analysis. In the first stage(Section 3.2),we exploit"branch correlations"in object-oriented programs to lengthen a given SC in order to avoid some infeasible paths that would otherwise be introduced in the second stage.In the second stage(Section 3.1),we compute T(SC)by solving a data-flow problem in order to avoid unrealizable paths efficiently.This design allows TAILOR to achieve good precision and scale well to large object-oriented programs. 3.1 SC-Based Data-Flow Analysis We compute T(SC)by solving flow-and context-sensitively an IFDS(Interprocedural Finite Distributive Subset)data-flow problem [38],efficiently on GIcrG,via graph reachability. This formulation of our SC-based data-flow analysis,denoted SCDFA,is significant for three reasons.(1)With flow-sensitivity,SCDFA can filter out imprecisely ordered statement sequences in a SC,as many static analyses from which SCs are deduced are flow-insensitive in order to be scalable.Conversely,precisely ordered statement sequences in SC also enable SCDFA to filter out irrelevant control flows in tailoring a program.This mutually beneficial process improves the precision of both parts,therefore avoiding the unnecessary complexity faced for solving both as one problem.(2)With context-sensitivity,we avoid introducing unrealizable paths with mismatched calls and returns,which is critical for achieving precision in computing T(C).(3)With an IFDS formulation,SCDFA can scale well to reasonably large object-oriented programs.In particular,(full)context-sensitivity can be realized more efficiently by solving a CFL-reachability problem over a simplified balanced-parentheses language.As an IFDS problem,SCDFA has a time complexity of O(ED3),where E is the number of edges in GIcFG and D is the size of the set of SCrelated data-flow facts used(i.e., the set of suffixes of sequences in SC,as will be clear below and defined in Section 4). SCDFA starts with a bottom-up pass(BU)and finishes with a top-down pass(TD),with both performed(fully)context-sensitively.By using the example program from Figure 2, we first explain the functionalities of two passes and then examine briefy how context- sensitivity is realized efficiently in the IFDS framework.Suppose we are given a SC as line 2:fw new FileWriter()-line 27:out.close()-line 12:fw.write().For convenience,we write it as SC:n-cw.In its GIcFG,the allocation site for Filewriter at line 2 is replicated in the two constructors of Driver.We identify the one in the single-arg constructor as line 2(denoted by n)and the one in the non-arg constructor as line 2'(denoted byn.As a result,we finally have a two-sequence SCo={n→c→w,n'→c→w}. BU and TD Passes Both passes operate on GICFG,as shown in Figure 5,for our example. As in any data-flow analysis,all control flow edges in GICFG are treated non-deterministically executable.Let ENTRY be the entry node of GICFG and EXIT the node marking the point of interest w at line 12 in SC.Conceptually,if a sequence SSC,which is ncw or n'cw, lies on a path,then BU must see a suffix of S at every node n on the path backwards from EXIT and TD must see the corresponding prefix of S at n forwards from ENTRY. EC00P2016Y. Li, T. Tan, Y. Zhang, and J. Xue XX:7 Extended SC Bottom-Up Pass Top-Down Pass TAILOR Tailored Program E.g., API Protocol Analysis Analysis Tools SC Extension Program ICFG Construction GICFG SC-Based Data-Flow Analysis (SCDFA) SC (SCEXT) Figure 4 Overview of Tailor (with all the components implemented in this paper in blue). is unsound for GICFG (as it is designed for program debugging and understanding only). Tailor computes T (SC) in two stages, SC Extension and SC-based Data-Flow Analysis. In the first stage (Section 3.2), we exploit “branch correlations” in object-oriented programs to lengthen a given SC in order to avoid some infeasible paths that would otherwise be introduced in the second stage. In the second stage (Section 3.1), we compute T (SC) by solving a data-flow problem in order to avoid unrealizable paths efficiently. This design allows Tailor to achieve good precision and scale well to large object-oriented programs. 3.1 SC-Based Data-Flow Analysis We compute T (SC) by solving flow- and context-sensitively an IFDS (Interprocedural Finite Distributive Subset) data-flow problem [38], efficiently on GICFG, via graph reachability. This formulation of our SC-based data-flow analysis, denoted SCDFA, is significant for three reasons. (1) With flow-sensitivity, SCDFA can filter out imprecisely ordered statement sequences in a SC, as many static analyses from which SCs are deduced are flow-insensitive in order to be scalable. Conversely, precisely ordered statement sequences in SC also enable SCDFA to filter out irrelevant control flows in tailoring a program. This mutually beneficial process improves the precision of both parts, therefore avoiding the unnecessary complexity faced for solving both as one problem. (2) With context-sensitivity, we avoid introducing unrealizable paths with mismatched calls and returns, which is critical for achieving precision in computing T (SC). (3) With an IFDS formulation, SCDFA can scale well to reasonably large object-oriented programs. In particular, (full) context-sensitivity can be realized more efficiently by solving a CFL-reachability problem over a simplified balanced-parentheses language. As an IFDS problem, SCDFA has a time complexity of O(ED3 ), where E is the number of edges in GICFG and D is the size of the set of SC-related data-flow facts used (i.e., the set of suffixes of sequences in SC, as will be clear below and defined in Section 4). SCDFA starts with a bottom-up pass (BU) and finishes with a top-down pass (TD), with both performed (fully) context-sensitively. By using the example program from Figure 2, we first explain the functionalities of two passes and then examine briefly how context￾sensitivity is realized efficiently in the IFDS framework. Suppose we are given a SC as line 2 : fw = new FileWriter() → line 27 : out.close() → line 12 : fw.write(). For convenience, we write it as SC : n→c→w. In its GICFG, the allocation site for FileWriter at line 2 is replicated in the two constructors of Driver. We identify the one in the single-arg constructor as line 2 (denoted by n) and the one in the non-arg constructor as line 2 0 (denoted by n 0 ). As a result, we finally have a two-sequence SC w = {n→c→w, n0→c→w}. BU and TD Passes Both passes operate on GICFG, as shown in Figure 5, for our example. As in any data-flow analysis, all control flow edges in GICFG are treated non-deterministically executable. Let ENTRY be the entry node of GICFG and EXIT the node marking the point of interest w at line 12 in SC w. Conceptually, if a sequence S ∈ SC w, which is ncw or n 0 cw, lies on a path, then BU must see a suffix of S at every node n on the path backwards from EXIT and TD must see the corresponding prefix of S at n forwards from ENTRY. E COO P 2 0 1 6
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有