Precision-Guided Context Sensitivity for Pointer Analysis 141:3 2 CAUSES OF IMPRECISION IN CONTEXT-INSENSITIVE POINTER ANALYSIS Our approach is based on the key insight that most of the precision loss in context-insensitive pointer analysis for Java can be expressed in terms of three basic patterns of value flows,or as combinations of these.We assume the reader is familiar with state-of-the-art context-sensitive pointer analysis techniques,e.g.,as covered in several surveys [Ryder 2003;Smaragdakis and Balatsouras 2015;Sridharan et al.2013],however,the precision loss patterns are independent of the chosen variant of context sensitivity,such as call-site sensitivity [Sharir and Pnueli 1981;Shivers 1991],object sensitivity [Milanova et al.2005],and type sensitivity [Smaragdakis et al.2011].In this section,we introduce the three precision loss patterns and then describe three corresponding concrete examples(Sections 2.1-2.3).This characterization of precision loss provides the conceptual foundation for ZIPPER to identify precision-critical methods as explained in Section 3. A context-insensitive analysis does not distinguish between different calls to a method but merges the incoming abstract values(or points-to sets,in the case of pointer object m(object o){ analysis)[Sharir and Pnueli 1981].Figure 1 shows a simple example. 2 return o; If method m is analyzed context-insensitively,then the two objects 3} are mixed together,so the analysis conservatively concludes that 4 x1 new A(): both x2 and y2 may point to both the A object and the B object. 5x2=m(x1): In contrast,a context-sensitive analysis would analyze m twice, 6 y1 new B(); 7y2=m(y1): corresponding to the two different call sites,and thereby conclude Fig.1.Example of precision loss that x2 can only point to an A object and y2 can only point to a B in context-insensitive analysis. object.The price of that extra precision is that the method needs to be analyzed multiple times,so context sensitivity should ideally only be applied when the precision gain outweighs the extra analysis time. To characterize the relevant value flows,we first introduce some terminology. Definition 2.1(IN and Our methods).Given a class C and a method M that is declared in C or inherited from C's super-classes,if M contains one or more parameters then M is an IN method of C,and if M's return type is non-void then M is an Our method of C.(In the example in Figure 1,m is both an IN and an Our method of the surrounding class.) Definition 2.2(Object wrapping and unwrapping).If an object O is stored in a field of an object W (or in an array entry of W,in case W is an array),then O is wrapped into W.Conversely,if an object O is loaded from a field of an object W(or from an array entry of W in case W is an array),then O is unwrapped from W.(The simple example in Figure 1 contains no wrapping or unwrapping.) With these definitions in place,we can describe the three precision-loss patterns as different kinds of value flows,depicted in Figure 2. Definition 2.3(Direct flow).If,in some execution of the program,an object O is passed as a parameter to an IN method M of class C,and then flows(via a series of assignments,field Direct Wrapped Unwrapped Flow Flow Flow IN Method Objects Value flow via assignments, field load/store operations, OUT Method or method calls/returns Fig.2.Three basic patterns of value flow that cause precision loss in context-insensitive analysis. Proc.ACM Program.Lang.,Vol.2,No.OOPSLA,Article 141.Publication date:November 2018.Precision-Guided Context Sensitivity for Pointer Analysis 141:3 2 CAUSES OF IMPRECISION IN CONTEXT-INSENSITIVE POINTER ANALYSIS Our approach is based on the key insight that most of the precision loss in context-insensitive pointer analysis for Java can be expressed in terms of three basic patterns of value flows, or as combinations of these. We assume the reader is familiar with state-of-the-art context-sensitive pointer analysis techniques, e.g., as covered in several surveys [Ryder 2003; Smaragdakis and Balatsouras 2015; Sridharan et al. 2013], however, the precision loss patterns are independent of the chosen variant of context sensitivity, such as call-site sensitivity [Sharir and Pnueli 1981; Shivers 1991], object sensitivity [Milanova et al. 2005], and type sensitivity [Smaragdakis et al. 2011]. In this section, we introduce the three precision loss patterns and then describe three corresponding concrete examples (Sections 2.1ś2.3). This characterization of precision loss provides the conceptual foundation for Zipper to identify precision-critical methods as explained in Section 3. A context-insensitive analysis does not distinguish between different calls to a method but merges 1 2 3 4 5 6 7 Object m(Object o){ return o; } x1 = new A(); x2 = m(x1); y1 = new B(); y2 = m(y1); Fig. 1. Example of precision loss in context-insensitive analysis. the incoming abstract values (or points-to sets, in the case of pointer analysis) [Sharir and Pnueli 1981]. Figure 1 shows a simple example. If method m is analyzed context-insensitively, then the two objects are mixed together, so the analysis conservatively concludes that both x2 and y2 may point to both the A object and the B object. In contrast, a context-sensitive analysis would analyze m twice, corresponding to the two different call sites, and thereby conclude that x2 can only point to an A object and y2 can only point to a B object. The price of that extra precision is that the method needs to be analyzed multiple times, so context sensitivity should ideally only be applied when the precision gain outweighs the extra analysis time. To characterize the relevant value flows, we first introduce some terminology. Definition 2.1 (In and Out methods). Given a class C and a method M that is declared in C or inherited from C’s super-classes, if M contains one or more parameters then M is an In method of C, and if M’s return type is non-void then M is an Out method of C. (In the example in Figure 1, m is both an In and an Out method of the surrounding class.) Definition 2.2 (Object wrapping and unwrapping). If an object O is stored in a field of an object W (or in an array entry ofW , in caseW is an array), then O is wrapped into W . Conversely, if an object O is loaded from a field of an object W (or from an array entry of W in case W is an array), then O is unwrapped from W . (The simple example in Figure 1 contains no wrapping or unwrapping.) With these definitions in place, we can describe the three precision-loss patterns as different kinds of value flows, depicted in Figure 2. Definition 2.3 (Direct flow). If, in some execution of the program, an object O is passed as a parameter to an In method M1 of class C, and then flows (via a series of assignments, field Direct Flow Wrapped Flow Unwrapped Flow IN Method OUT Method Value flow via assignments, field load/store operations, or method calls/returns Objects Fig. 2. Three basic patterns of value flow that cause precision loss in context-insensitive analysis. Proc. ACM Program. Lang., Vol. 2, No. OOPSLA, Article 141. Publication date: November 2018