Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity 147:13 THEOREM 4.1(SOUNDNESS OF Unity).PA-Unitys is sound. PRoor.By rules in Figure 6,PA-Unitys is a standard Andersen-style pointer analysis,and the only difference from other selective context-sensitive pointer analyses is selectCS.Selecting different contexts for methods affects precision but not soundness [Jeon et al.2018;Jeong et al.2017;Li et al. 2018b,2020;Smaragdakis et al.2014].Hence PA-Unity is sound. 口 Following existing literature [Kastrinis and Smaragdakis 2013;Li et al.2018b;Lu and Xue 2019;Minseok Jeon and Oh 2020],contexts in points-to results are ignored when measuring the precision of context-sensitive pointer analyses.To this end,we use pipA(x)to represent the points-to set of variable x in pointer analysis PA,without contexts,e.g.,if pt(c,x)={(c,oi)}and pt(c'.x)={(c,oj)}in PA,then ptpa(x)=(oi.oj). Definition 4.2(Precision of Pointer Analysis).Given two sound pointer analyses PA and PA',PA is at most as precise than PA',denoted PA PA',if for each variable x:ptpA(x)2 ptpA(x). (We informally also use the convenient forms"less precise",as a synonym of"at most as precise", and"more precise"as a synonym of the converse."Less than or equally"and "more than or equally" would have been more accurate,but inconvenient.) We define the precision of a pointer analysis based on the points-to sets of each variable,as they are the most important results of pointer analysis and used by virtually all its clients. THEOREM 4.3 (PRECISION OF Unity).Given a set of selective approaches S,VSi E S:PA-Si 3 PA-UnityS. PRooF.By Equation(2).the cs selected by selectCS-unity for each method is always more precise than the one selected by any SiE S.This ensures that for each variable x in any PA-Si, ptpA-s,(x)2 ptpA-Unitys(x).Hence,VSi ES:PA-Si 3 PA-Unitys. 4.4 Relay Given a set of selective approaches S,in Relay,we run IS]pointer analysis passes to divide the scalability burden into different passes(Section 3.3).Now we formalize the C.S.selector(Figure 1) of the two options of Relay(Relay-o1 and Relay-02)for any passi. We define the C.S.selector of Relay-o1 for passi(denoted by selectCS-relay),as follows: selectCS-unity(m)if selectCSi(m)+CI selectCS-relay (m)= (3) CI otherwise For a method m,if selectCS(result of the i-th selective approach)selects a non-CI variant for m, then selectCS-relay leverages selectCS-unity to select the most precise cs,otherwise,CI is selected, as illustrated in Figure 3. In Relay-02,to ensure scalability in passi,as explained in Section 3,we use the same context selector function as S;: selectCS-relay (m)=selectCSi(m) (4) In each pass of Relay(except pass),we apply filtering on the points-to sets of variables based on the points-to results from the previous pass,to improve precision by reducing spurious data flows.Such filtering is formalized as the premises in boxes of [AssIGN],[LoAD]and [CALL](Figure 6), where ppt(x)denotes the points-to set of variable x(without contexts)in the previous analysis pass (ppt reads as "previous points-to set").We ignore contexts in ppt as the contexts of different passes may not be possible to match.These boxed premises ensure that if the previous analysis concludes that variable x does not point to object oi,then all (o:)will be filtered out from pt(x)in the Proc.ACM Program.Lang.,Vol.5,No.OOPSLA,Article 147.Publication date:October 2021.Making Pointer Analysis More Precise by Unleashing the Power of Selective Context Sensitivity 147:13 Theorem 4.1 (Soundness of Unity). PA-Unity𝑆 is sound. Proof. By rules in Figure 6, PA-Unity𝑆 is a standard Andersen-style pointer analysis, and the only difference from other selective context-sensitive pointer analyses is selectCS. Selecting different contexts for methods affects precision but not soundness [Jeon et al. 2018; Jeong et al. 2017; Li et al. 2018b, 2020; Smaragdakis et al. 2014]. Hence PA-Unity𝑆 is sound. □ Following existing literature [Kastrinis and Smaragdakis 2013; Li et al. 2018b; Lu and Xue 2019; Minseok Jeon and Oh 2020], contexts in points-to results are ignored when measuring the precision of context-sensitive pointer analyses. To this end, we use 𝑝𝑡PA(𝑥) to represent the points-to set of variable 𝑥 in pointer analysis PA, without contexts, e.g., if pt(𝑐, 𝑥) = {⟨𝑐 𝑖 , 𝑜𝑖⟩} and pt(𝑐 ′ , 𝑥) = {⟨𝑐 𝑗 , 𝑜 𝑗⟩} in PA, then 𝑝𝑡PA(𝑥) = {𝑜𝑖 , 𝑜 𝑗 }. Definition 4.2 (Precision of Pointer Analysis). Given two sound pointer analyses PA and PA′ , PA is at most as precise than PA′ , denoted PA ⪯ PA′ , if for each variable 𝑥: 𝑝𝑡PA(𝑥) ⊇ 𝑝𝑡PA′ (𝑥). (We informally also use the convenient forms łless precisež, as a synonym of łat most as precisež, and łmore precisež as a synonym of the converse. łLess than or equallyž and łmore than or equallyž would have been more accurate, but inconvenient.) We define the precision of a pointer analysis based on the points-to sets of each variable, as they are the most important results of pointer analysis and used by virtually all its clients. Theorem 4.3 (Precision of Unity). Given a set of selective approaches S, ∀𝑆𝑖 ∈ 𝑆 : PA-𝑆𝑖 ⪯ PA-Unity𝑆 . Proof. By Equation (2), the 𝑐𝑠 selected by selectCS-unity𝑆 for each method is always more precise than the one selected by any 𝑆𝑖 ∈ 𝑆. This ensures that for each variable 𝑥 in any PA-𝑆𝑖 , 𝑝𝑡PA-𝑆𝑖 (𝑥) ⊇ 𝑝𝑡PA-Unity𝑆 (𝑥). Hence, ∀𝑆𝑖 ∈ 𝑆 : PA-𝑆𝑖 ⪯ PA-Unity𝑆 . □ 4.4 Relay Given a set of selective approaches 𝑆, in Relay, we run |𝑆 | pointer analysis passes to divide the scalability burden into different passes (Section 3.3). Now we formalize the C.S. selector (Figure 1) of the two options of Relay (Relay-o1 and Relay-o2) for any pass𝑖 . We define the C.S. selector of Relay-o1 for pass𝑖 (denoted by selectCS-relay𝑆 𝑖 ), as follows: selectCS-relay𝑆 𝑖 (𝑚) = ( selectCS-unity𝑆 (𝑚) if selectCS𝑖(𝑚) ≠ CI CI otherwise (3) For a method 𝑚, if selectCS𝑖 (result of the 𝑖-th selective approach) selects a non-CI variant for 𝑚, then selectCS-relay𝑆 𝑖 leverages selectCS-unity𝑆 to select the most precise 𝑐𝑠, otherwise, CI is selected, as illustrated in Figure 3. In Relay-o2, to ensure scalability in pass𝑖 , as explained in Section 3, we use the same context selector function as 𝑆𝑖 : selectCS-relay𝑆 𝑖 (𝑚) = selectCS𝑖(𝑚) (4) In each pass of Relay (except pass1), we apply filtering on the points-to sets of variables based on the points-to results from the previous pass, to improve precision by reducing spurious data flows. Such filtering is formalized as the premises in boxes of [Assign], [Load] and [Call] (Figure 6), where ppt(𝑥) denotes the points-to set of variable 𝑥 (without contexts) in the previous analysis pass (ppt reads as łprevious points-to setž). We ignore contexts in ppt as the contexts of different passes may not be possible to match. These boxed premises ensure that if the previous analysis concludes that variable 𝑥 does not point to object 𝑜𝑖 , then all ⟨_, 𝑜𝑖⟩ will be filtered out from pt(_, 𝑥) in the Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 147. Publication date: October 2021