A Principled Approach to Selective Context Sensitivity for Pointer Analysis 1:15 Algorithm 2:PCMCOLLECTOR Input (Input class) PFG(Precision Flow Graph for class c) Output:PCMc (Precision-Critical Methods for class c) 1 FlowNodes←{h,PCMe←-{} 2 foreach m∈OUTe do foreach return variable r of m do 4 LFlowNodes U=NoDESCANREACH(N.PFG) /Backward graph reachability s foreach NE FlowNodes do 6 if N is a variable node Na and a is declared in m then add m to PCMc if N is an object field node No.f and o is allocated in m then add m to PCM 10 return PCMc Algorithm 3:ZIPPER Input P (Input program) FG (Object Flow Graph for program p) Output:PCM (Precision-Critical Methods for program p) iPCM←-{,S←-set of all statements in p 2 foreach class c in p do PFGePFGBUILDER(OFG,c,S) PCMc+PCMCOLLECTOR(c,PFGe) 5 PCMU=PCMc 6 return PCM method in PFGe.Since ZIPPER builds PFGe starting from the parameters of the IN methods(lines 2-3 in Algorithm 1),all nodes in PFGe are reachable from the IN methods.Therefore,we only need to find out which nodes in PFGc can reach the return variables of Our methods of class c. Algorithm 2(PCMCOLLECTOR)defines the collection of precision-critical methods for an input class c based on PFGc.In line 1,two sets are initialized to empty:FlowNodes denotes the set of nodes that are involved in the flows from IN methods to Our methods of class c,and PCMe denotes the set of precision-critical methods for class c,i.e.,the methods that contain the nodes in FlowNodes. In lines 2-4,PCMCoLLECTOR fills FlowNodes by iterating through the return variables of all OuT methods of c(denoted by OuTe)and collecting all nodes that can reach the return variables in PFGe. The function NoDEsCANREACH used in line 4 is a standard backward graph reachability algorithm which traverses the PFGe starting from N,and returns all nodes that can reach N,on PFGc. In lines 5-9,PCMCoLLECTOR fills PCMe.There are two kinds of nodes in PFGe that are handled differently.For a variable node Na,PCMCoLLECTOR adds the method where the variable a is declared to PCMe(lines 6-7).For an object field node No.f,PCMCoLLECToR adds the method where the abstract object o is allocated to PCMe(lines 8-9).As a result,the algorithm collects the precision- critical methods for each class in a given program. Algorithm 3 shows how ZIppER uses Algorithms 1 and 2 to identify the precision-critical methods PCM for a given program p.With this information,ZIPPER can guide context-sensitive pointer analyses to apply context sensitivity only for the precision-critical methods. ACM Trans.Program.Lang.Syst.,Vol.1,No.1,Article 1.Publication date:January 2020.A Principled Approach to Selective Context Sensitivity for Pointer Analysis 1:15 Algorithm 2: PcmCollector Input : 𝑐 (Input class) PFG𝑐 (Precision Flow Graph for class 𝑐) Output : PCM𝑐 (Precision-Critical Methods for class 𝑐) 1 FlowNodes ← { }, PCM𝑐 ← { } 2 foreach m ∈ Out𝑐 do 3 foreach return variable 𝑟 of 𝑚 do 4 FlowNodes Ð = NodesCanReach(N𝑟, PFG𝑐 ) // Backward graph reachability 5 foreach N ∈ FlowNodes do 6 if 𝑁 is a variable node 𝑁𝑎 and 𝑎 is declared in 𝑚 then 7 add m to PCM𝑐 8 if 𝑁 is an object field node 𝑁𝑜.𝑓 and 𝑜 is allocated in 𝑚 then 9 add m to PCM𝑐 10 return PCM𝑐 Algorithm 3: Zipper Input : 𝑝 (Input program) OFG (Object Flow Graph for program 𝑝) Output : PCM (Precision-Critical Methods for program 𝑝) 1 PCM ← { }, S ← set of all statements in p 2 foreach class c in p do 3 PFG𝑐 ← PfgBuilder(OFG, c, S) 4 PCM𝑐 ← PcmCollector(c, PFG𝑐 ) 5 PCM Ð = PCM𝑐 6 return PCM method in PFG𝑐 . Since Zipper builds PFG𝑐 starting from the parameters of the In methods (lines 2–3 in Algorithm 1), all nodes in PFG𝑐 are reachable from the In methods. Therefore, we only need to find out which nodes in PFG𝑐 can reach the return variables of Out methods of class 𝑐. Algorithm 2 (PcmCollector) defines the collection of precision-critical methods for an input class𝑐 based on PFG𝑐 . In line 1, two sets are initialized to empty: FlowNodes denotes the set of nodes that are involved in the flows from In methods to Out methods of class 𝑐, and PCM𝑐 denotes the set of precision-critical methods for class 𝑐, i.e., the methods that contain the nodes in FlowNodes. In lines 2–4, PcmCollector fills FlowNodes by iterating through the return variables of all Out methods of𝑐 (denoted by Out𝑐 ) and collecting all nodes that can reach the return variables in PFG𝑐 . The function NodesCanReach used in line 4 is a standard backward graph reachability algorithm which traverses the PFG𝑐 starting from 𝑁𝑟 and returns all nodes that can reach 𝑁𝑟 on PFG𝑐 . In lines 5–9, PcmCollector fills PCM𝑐 . There are two kinds of nodes in PFG𝑐 that are handled differently. For a variable node 𝑁𝑎, PcmCollector adds the method where the variable 𝑎 is declared to PCM𝑐 (lines 6–7). For an object field node 𝑁𝑜.𝑓 , PcmCollector adds the method where the abstract object 𝑜 is allocated to PCM𝑐 (lines 8–9). As a result, the algorithm collects the precisioncritical methods for each class in a given program. Algorithm 3 shows how Zipper uses Algorithms 1 and 2 to identify the precision-critical methods PCM for a given program p. With this information, Zipper can guide context-sensitive pointer analyses to apply context sensitivity only for the precision-critical methods. ACM Trans. Program. Lang. Syst., Vol. 1, No. 1, Article 1. Publication date: January 2020