Program Tailoring:Slicing by Sequential Criteria rtifac Yue Li*,Tian Tan*,Yifei Zhang,and Jingling Xue School of Computer Science and Engineering,UNSW Australia -Abstract Protocol and typestate analyses often report some sequences of statements ending at a program point P that needs to be scrutinized,since P may be erroneous or imprecisely analyzed.Program slicing focuses only on the behavior at P by computing a slice of the program affecting the values at P.In this paper,we propose to restrict our attention to the subset of that behavior at P affected by one or several statement sequences,called a sequential criterion (SC).By leveraging the ordering information in a SC,e.g.,the temporal order in a few valid/invalid API method invocation sequences,we introduce a new technique,program tailoring,to compute a tailored program that comprises the statements in all possible execution paths passing through at least one sequence in SC in the given order.With a prototyping implementation,TAILOR,we show why tailoring is practically useful by conducting two case studies on seven large real-world Java applications.For program debugging and understanding,TAILOR can complement program slicing by removing SCirrelevant statements.For program analysis,TAILOR can enable a pointer analysis,which is unscalable to a program,to perform a more focused and therefore potentially scalable analysis to its SC-relevant parts containing hard language features such as reflection. 1998 ACM Subject Classification F.3.2 Semantics of Programming Languages-Program Ana- lysis,D.2.5 Testing and Debugging-Code inspections and Debugging aids Keywords and phrases Program Slicing,Program Analysis,API Protocol Analysis Digital Object Identifier XXX/LIPIcs.ECOOP.2016. 1 Introduction Program slicing,supported by industry-strength tools,such as WALA [52]and CodeSurfer [18], has found many diverse applications,such as program debugging,comprehension,analysis testing,verification and optimization 8,20,43,49.Given a slicing criterion consisting of a program point P and several variables used at P [53],program slicing computes a slice of the program that may affect their values at P in terms of data and control dependences.In the past three decades,several variations on this theme of program slicing have been proposed, including static vs.dynamic,backward vs.forward,and closure vs.executable 43,49. In practice,API protocol analysis [7,37]and typestate analysis [9,16,34]often report some statement sequences ending at a program point P that needs to be scrutinized,since P may be erroneous or imprecisely analyzed.Each sequence can represent a valid or invalid API usage call sequence.Such protocol specifications or violations can also be provided manually or mined automatically [1,3,17,36,42,60.As such analyses are either conservative or unsound,the temporal order specified in a sequence may or may not be feasible.However, program slicing focuses on P by ignoring this order,which is often essential for analyzing P. As illustrated in Figure 1,we introduce a new technique,program tailoring to reap additional benefits missed by program slicing at a point P(e.g.,file.write())by leveraging the temporal order specified in a new criterion,called a sequential criterion (SC)for P.A These authors contributed equally to this work Yue Li,Tian Tan,Vifei Zhang.and Jingling Xue licensed under Creative Commons License CC-BY 30th European Conference on Object Oriented Programming (ECOOP 2016). Leibniz International Proceedings in Informatics LIPICS Schloss Dagstuhl-Leibniz-Zentrum fuir Informatik,Dagstuhl Publishing,GermanyC onsistent * Complete * Well D ocumen et d t ysaE * o Reuse * * Eva ul det a * EC O O P * Artifact * AE C Program Tailoring: Slicing by Sequential Criteria Yue Li∗ , Tian Tan∗ , Yifei Zhang, and Jingling Xue School of Computer Science and Engineering, UNSW Australia Abstract Protocol and typestate analyses often report some sequences of statements ending at a program point P that needs to be scrutinized, since P may be erroneous or imprecisely analyzed. Program slicing focuses only on the behavior at P by computing a slice of the program affecting the values at P. In this paper, we propose to restrict our attention to the subset of that behavior at P affected by one or several statement sequences, called a sequential criterion (SC). By leveraging the ordering information in a SC, e.g., the temporal order in a few valid/invalid API method invocation sequences, we introduce a new technique, program tailoring, to compute a tailored program that comprises the statements in all possible execution paths passing through at least one sequence in SC in the given order. With a prototyping implementation, Tailor, we show why tailoring is practically useful by conducting two case studies on seven large real-world Java applications. For program debugging and understanding, Tailor can complement program slicing by removing SC-irrelevant statements. For program analysis, Tailor can enable a pointer analysis, which is unscalable to a program, to perform a more focused and therefore potentially scalable analysis to its SC-relevant parts containing hard language features such as reflection. 1998 ACM Subject Classification F.3.2 Semantics of Programming Languages - Program Analysis, D.2.5 Testing and Debugging - Code inspections and Debugging aids Keywords and phrases Program Slicing, Program Analysis, API Protocol Analysis Digital Object Identifier XXX/LIPIcs.ECOOP.2016. 1 Introduction Program slicing, supported by industry-strength tools, such as WALA [52] and CodeSurfer [18], has found many diverse applications, such as program debugging, comprehension, analysis, testing, verification and optimization [8, 20, 43, 49]. Given a slicing criterion consisting of a program point P and several variables used at P [53], program slicing computes a slice of the program that may affect their values at P in terms of data and control dependences. In the past three decades, several variations on this theme of program slicing have been proposed, including static vs. dynamic, backward vs. forward, and closure vs. executable [43, 49]. In practice, API protocol analysis [7, 37] and typestate analysis [9, 16, 34] often report some statement sequences ending at a program point P that needs to be scrutinized, since P may be erroneous or imprecisely analyzed. Each sequence can represent a valid or invalid API usage call sequence. Such protocol specifications or violations can also be provided manually or mined automatically [1, 3, 17, 36, 42, 60]. As such analyses are either conservative or unsound, the temporal order specified in a sequence may or may not be feasible. However, program slicing focuses on P by ignoring this order, which is often essential for analyzing P. As illustrated in Figure 1, we introduce a new technique, program tailoring to reap additional benefits missed by program slicing at a point P (e.g., file.write()) by leveraging the temporal order specified in a new criterion, called a sequential criterion (SC) for P. A ∗ These authors contributed equally to this work © Yue Li, Tian Tan, Yifei Zhang, and Jingling Xue; licensed under Creative Commons License CC-BY 30th European Conference on Object Oriented Programming (ECOOP 2016). Leibniz International Proceedings in Informatics Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany