正在加载图片...
Equivalent AutomataType-Consistent Objects Sequential Automata Ao=(Q,∑,8,qo,「,Y)←→Co=(gHF,a,o,T,t)o-Rooted Field Points-to Graph A set of states-.- 0 →H… ......Aset of heap objects A set of input symbols… …∑ F..................Aset of field identifiers The next-state map:Q×∑→P(Q) The field points-to map:Hx F-P(H) The initial state............ 0 ...The object to be checked A set of output symbols ...…Aset of types The output map:Q→T The object-to-type map:H-T Figure 4.The mapping of a field points-to graph rooted at an object to a sequential automaton. Definition 2.2(MAHJONG's Heap Abstraction).Given the Therefore,we have reduced the problem of checking the quotient set,H/=,MAHJONG will merge all the objects in type-consistency of of and of to one of testing the equiva- the same equivalence class into one object. lence of their corresponding automata Aor and Aor,which is solvable by the Hopcroft-Karp algorithm [18]with minor Therefore,the key insight behind our new heap abstrac- tion is not to distinguish two (container)objects of the same modifications.The worst-case time complexity is O(x type if both containers store the objects of the same type at QargerD),which is almost linear in terms of |Qargerl,where all their corresponding nested sub-containers. Qarger is the set of states of the larger automaton [18] How do we check the type-consistency of two objects Example 2.6.Continuing from Example 2.5,we see easily efficiently,especially for large programs with a large number that of and oz are type-consistent (Figure 2)since their of heap objects,field names and class types?Enumerating all corresponding automata Ar and A are equivalent. the possible field access paths f as required in Definition 2.1, especially in the presence of cycles,may be exponential in 3.MAHJONG terms of the number of edges traversed [28,34].causing We first give an overview of MAHJONG that consists of four the pre-analysis to be too inefficient or even unscalable.We components(Section 3.1).We then describe each component describe a fast and elegant solution below. in detail (Sections 3.2-3.5).Finally,we discuss MAHJONG- 2.2.2 Merging Equivalent Automata based points-to analysis (Section 3.6). We transform the problem of checking the type-consistency 3.1 Overview of two objects into one of testing the equivalence of two au- As shown in Figure 5,MAHJONG takes the field points-to tomata.Figure 4 relates the field points-to graph rooted at graph(FPG)computed by a pre-analysis(Section 2.2.1)as an object o.Go=(H,F,a,o,T,T),to a 6-tuple sequential input and builds a heap abstraction(Definition 2.2)to be automatonA。=(Q,∑,d,go,T,y)[1],which is more gen- used by a subsequent points-to analysis.The pre-analysis is eral than a traditional(5-tuple)automaton.In fact,a 5-tuple fast but imprecise.by using Andersen's algorithm [4]with automaton can be turned into a 6-tuple automaton,if its ac- the allocation-site abstraction,context-insensitively.The cepting (acc)and non-accepting (non-acc)states are distin- subsequent points-to analysis will be more precise,usually guished by y:QI,where I=facc,non-acc}. performed context-sensitively,especially for object-oriented Example 2.5.Continuing from Example 2.2(Figure 2),the programs,based on the MAHJONG heap abstraction. automaton Aor for Gor =(H,F,a,o,T,t)is obtained MAHJONG iteratively picks a pair of objects of and of according to Figure 4.Similarly.Aor is constructed. □ with the same type T and merges them if they are type- consistent,until no such pair can be found.Given o and o The behavior ofAo,which can be an NFA (consisting of their corresponding NFAs,NFAr and NFAT,are first built multiple edges with the same label leaving a state),is: by using the NFA Builder.Then the two NFAs are converted BA。:*→P(T) into their equivalent DFAs,DFAT and DFAT,by using the DFA Converter.Next,the Automata Equivalence Checker If Ao finally reaches the states,s1,s2,...,sn,after having determines whether DFA?and DFA?are equivalent or not. read an input w in >"then BA (w)=UP1y[si]. Finally,the Heap Modeler outputs a new heap abstraction. Let of and of be two objects with the same type T.Let The detailed algorithms are given in Section 4. their automata Ao?and Ao be built as shown in Figure 4. of and of are type-consistent if,for every input w in *(1) 3.2 The NFA Builder BA(w)=BA(w)(Condition 1 of Definition 2.1)and (2(w)=1(Condition 2 of Definition 2.1). The NFA builder takes an object o,with the field points-to graph go rooted at o,and constructs a 6-tuple NFA A 281A A set of states A set of input symbols The next-state map: A set of output symbols The initial state The output map: Q ∑ q0 H F o T A set of heap objects A set of field identifiers The object to be checked A set of types The object-to-type map: H⟶ Q × ∑⟶ (Q) The field points-to map: × ⟶ ( ) Equivalent Automata Type-Consistent Objects δ Γ γ α Q ⟶Γ τ Sequential Automata Q ∑ q Γ = ( H, , , , F α o T, ) -Rooted Field Points-to Graph 0 o = ( , , , , δ , γ ) o τ o T P H F P H G Figure 4. The mapping of a field points-to graph rooted at an object to a sequential automaton. Definition 2.2 (MAHJONG’s Heap Abstraction). Given the quotient set, H / ≡, MAHJONG will merge all the objects in the same equivalence class into one object. Therefore, the key insight behind our new heap abstrac￾tion is not to distinguish two (container) objects of the same type if both containers store the objects of the same type at all their corresponding nested sub-containers. How do we check the type-consistency of two objects efficiently, especially for large programs with a large number of heap objects, field names and class types? Enumerating all the possible field access paths ¯f as required in Definition 2.1, especially in the presence of cycles, may be exponential in terms of the number of edges traversed [28, 34], causing the pre-analysis to be too inefficient or even unscalable. We describe a fast and elegant solution below. 2.2.2 Merging Equivalent Automata We transform the problem of checking the type-consistency of two objects into one of testing the equivalence of two au￾tomata. Figure 4 relates the field points-to graph rooted at an object o, Go = (H, F, α, o, T , τ), to a 6-tuple sequential automaton Ao = (Q, Σ, δ, qo, Γ, γ) [1], which is more gen￾eral than a traditional (5-tuple) automaton. In fact, a 5-tuple automaton can be turned into a 6-tuple automaton, if its ac￾cepting (acc) and non-accepting (non-acc) states are distin￾guished by γ : Q 7→ Γ, where Γ = {acc, non-acc}. Example 2.5. Continuing from Example 2.2 (Figure 2), the automaton Ao T 2 for Go T 2 = (H, F, α, oT 2 , T , τ) is obtained according to Figure 4. Similarly, Ao T 1 is constructed. The behavior of Ao, which can be an NFA (consisting of multiple edges with the same label leaving a state), is: βAo : Σ∗ → P(Γ) If Ao finally reaches the states, s1, s2, · · · , sn, after having read an input w in Σ ∗ , then βAo (w) = ∪ n i=1γ[si ]. Let o T 1 and o T 2 be two objects with the same type T. Let their automata Ao T 1 and Ao T 2 be built as shown in Figure 4. o T 1 and o T 2 are type-consistent if, for every input w in Σ ∗ , (1) βAoT 1 (w) = βAoT 2 (w) (Condition 1 of Definition 2.1) and (2) |βAoT 1 (w)| = 1 (Condition 2 of Definition 2.1). Therefore, we have reduced the problem of checking the type-consistency of o T 1 and o T 2 to one of testing the equiva￾lence of their corresponding automata Ao T 1 and Ao T 2 , which is solvable by the Hopcroft-Karp algorithm [18] with minor modifications. The worst-case time complexity is O(|Σ| × |Qlarger|), which is almost linear in terms of |Qlarger|, where Qlarger is the set of states of the larger automaton [18]. Example 2.6. Continuing from Example 2.5, we see easily that o T 1 and o T 2 are type-consistent (Figure 2) since their corresponding automata Ao T 1 and Ao T 2 are equivalent. 3. MAHJONG We first give an overview of MAHJONG that consists of four components (Section 3.1). We then describe each component in detail (Sections 3.2 – 3.5). Finally, we discuss MAHJONG￾based points-to analysis (Section 3.6). 3.1 Overview As shown in Figure 5, MAHJONG takes the field points-to graph (FPG) computed by a pre-analysis (Section 2.2.1) as input and builds a heap abstraction (Definition 2.2) to be used by a subsequent points-to analysis. The pre-analysis is fast but imprecise, by using Andersen’s algorithm [4] with the allocation-site abstraction, context-insensitively. The subsequent points-to analysis will be more precise, usually performed context-sensitively, especially for object-oriented programs, based on the MAHJONG heap abstraction. MAHJONG iteratively picks a pair of objects o T i and o T j with the same type T and merges them if they are type￾consistent, until no such pair can be found. Given o T i and o T j , their corresponding NFAs, NFAo T i and NFAo T j , are first built by using the NFA Builder. Then the two NFAs are converted into their equivalent DFAs, DFAo T i and DFAo T j , by using the DFA Converter. Next, the Automata Equivalence Checker determines whether DFAo T i and DFAo T j are equivalent or not. Finally, the Heap Modeler outputs a new heap abstraction. The detailed algorithms are given in Section 4. 3.2 The NFA Builder The NFA builder takes an object o, with the field points-to graph Go rooted at o, and constructs a 6-tuple NFA Ao = 281
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有