Synthesizing Object State Transformers for Dynamic Software Updates Zelin Zhao*,Yanyan Jiang*,Chang Xu*,Tianxiao Gut,Xiaoxing Ma* *State Key Laboratory for Novel Software Technology and Department of Computer Science and Technology,Nanjing University,China TAlibaba Group,Sunnyvale,CA,USA Abstract-There is an increasing demand for evolving software 1)Almost all (187,or 98.4%)class changes can be updated systems to deliver continuous services of no restart.Dynamic dynamically,indicating that DSU is broadly applicable. software update (DSU)aims to achieve this goal by patching the Even for a few cases (3,or 1.6%)that DSU is impossible system state on the fly but is currently hindered from practice due to non-trivial cross-version object state transformations.This over existing programs,proper refactoring could still make paper revisits this problem through an in-depth empirical study them updatable [4,29.30]. of over 190 class changes from Tomcat 8.The study produced an 2)Most(166,or 87.4%)updates involve trivial object trans- important finding that most non-trivial object state transformers formations over simple predefined rules.Existing DSU can be constructed by reassembling existing old/new version code snippets.This paper presents a domain-specific language systems [15,17,18]are already capable of automatically and an efficient algorithm for synthesizing non-trivial object updating these changes without developers'intervention. transformers over code reuse.We experimentally evaluated our 3)The rest,not many but a non-negligible portion of class tool implementation PASTA with real-world software systems, changes (21,or 11.1%)require non-trivial object transfor- reporting PASTA's effectiveness in succeeding in 7.5x non-trivial mations.Software developers without a DSU background object transformation tasks compared with the best existing DSU would have substantial difficulties in specifying them,as techniques. Index Terms-Software maintenance and evolution,dynamic the required transformers have to carefully manipulate software update,object transformation,program synthesis. two versions of program states simultaneously. The empirical study results suggest that the key obstacle I.INTRODUCTION that hinders the continuous and automatic deployment of DSU in practice is probably how to obtain non-trivial object Dynamic software update (DSU,updating software at transformers.Unfortunately,this circumstance has not been runtime without restarting)[1]is a trending feature in modern seriously recognized by existing research.In fact,our later software systems.DSU keeps systems up-to-date with security experiments show that state-of-the-art techniques,like TOS [27] patches,bug fixes,and feature upgrades without hurting the and AOTES [28],could only succeed in 0 and 2 out of 25 non- systems'availability.DSU has become increasingly practical and compelling [1-8]:Linux Kernel [9-12]and Microsoft trivial object transformation tasks.Their apparent high success rates in past experiments might be due to mixing non-trivial Windows [13]are already dynamically updatable to some extent;Java Virtual Machine (JVM)has been modified to transformers with many trivial ones. In this paper,we leverage another key empirical finding that partially support application updates [14-18];live-upgradable object transformers can be constructed by reassembling existing components are also emerging in databases [19-21],servers [22- old/new version code to establish an algorithm for synthesizing 24],and even mission-critical systems [25]. object transformers in the DSU of Java applications.The Despite that code can be hot upgraded in emerging algorithm exhaustively and heuristically enumerates all possible systems [24,26],automatically updating runtime states for combinations of extracted code snippets,producing both test- seamless system evolution remains a major research chal- passing and developer-readable object transformers.A key lenge [15,27,28].Software updates may include a field of a advantage over existing techniques [27,28]is that an appli- class being added,removed,or semantically changed in a new cation developer can easily verify synthesized transformers' version.In DSU,such field values (if not removed)should be correctness because application code is their major constructs. (re)computed to match the new version code's semantics.This We implemented our algorithm as the PASTA (PATCH is known as the object transformation problem,whose solution STATES)tool for DSU of Java programs.The evaluation typically relies on a key mini-program (a.k.a.a transformer) results over a set of non-trivial class changes (including for computing these field values at update time. those in the empirical study and more)were encouraging: To understand the challenges in object transformation,this PASTA synthesized 7.5x correct non-trivial object transformers paper empirically studied 100 uniform-randomly sampled (60.0%)compared to the best existing techniques TOS [27] commits (consisting of 190 class changes)from Apache and AOTES [28](0.0%and 8.0%,respectively). Tomcat 8 [23],one of the most popular Web backend systems. In summary,this paper's major contributions are recognizing Our major findings include: the non-trivial object transformer synthesis as a critical problem
Synthesizing Object State Transformers for Dynamic Software Updates Zelin Zhao∗ , Yanyan Jiang∗ , Chang Xu∗ , Tianxiao Gu† , Xiaoxing Ma∗ ∗State Key Laboratory for Novel Software Technology and Department of Computer Science and Technology, Nanjing University, China †Alibaba Group, Sunnyvale, CA, USA Abstract—There is an increasing demand for evolving software systems to deliver continuous services of no restart. Dynamic software update (DSU) aims to achieve this goal by patching the system state on the fly but is currently hindered from practice due to non-trivial cross-version object state transformations. This paper revisits this problem through an in-depth empirical study of over 190 class changes from Tomcat 8. The study produced an important finding that most non-trivial object state transformers can be constructed by reassembling existing old/new version code snippets. This paper presents a domain-specific language and an efficient algorithm for synthesizing non-trivial object transformers over code reuse. We experimentally evaluated our tool implementation PASTA with real-world software systems, reporting PASTA’s effectiveness in succeeding in 7.5× non-trivial object transformation tasks compared with the best existing DSU techniques. Index Terms—Software maintenance and evolution, dynamic software update, object transformation, program synthesis. I. INTRODUCTION Dynamic software update (DSU, updating software at runtime without restarting) [1] is a trending feature in modern software systems. DSU keeps systems up-to-date with security patches, bug fixes, and feature upgrades without hurting the systems’ availability. DSU has become increasingly practical and compelling [1–8]: Linux Kernel [9–12] and Microsoft Windows [13] are already dynamically updatable to some extent; Java Virtual Machine (JVM) has been modified to partially support application updates [14–18]; live-upgradable components are also emerging in databases [19–21], servers [22– 24], and even mission-critical systems [25]. Despite that code can be hot upgraded in emerging systems [24, 26], automatically updating runtime states for seamless system evolution remains a major research challenge [15, 27, 28]. Software updates may include a field of a class being added, removed, or semantically changed in a new version. In DSU, such field values (if not removed) should be (re)computed to match the new version code’s semantics. This is known as the object transformation problem, whose solution typically relies on a key mini-program (a.k.a. a transformer) for computing these field values at update time. To understand the challenges in object transformation, this paper empirically studied 100 uniform-randomly sampled commits (consisting of 190 class changes) from Apache Tomcat 8 [23], one of the most popular Web backend systems. Our major findings include: 1) Almost all (187, or 98.4%) class changes can be updated dynamically, indicating that DSU is broadly applicable. Even for a few cases (3, or 1.6%) that DSU is impossible over existing programs, proper refactoring could still make them updatable [4, 29, 30]. 2) Most (166, or 87.4%) updates involve trivial object transformations over simple predefined rules. Existing DSU systems [15, 17, 18] are already capable of automatically updating these changes without developers’ intervention. 3) The rest, not many but a non-negligible portion of class changes (21, or 11.1%) require non-trivial object transformations. Software developers without a DSU background would have substantial difficulties in specifying them, as the required transformers have to carefully manipulate two versions of program states simultaneously. The empirical study results suggest that the key obstacle that hinders the continuous and automatic deployment of DSU in practice is probably how to obtain non-trivial object transformers. Unfortunately, this circumstance has not been seriously recognized by existing research. In fact, our later experiments show that state-of-the-art techniques, like TOS [27] and AOTES [28], could only succeed in 0 and 2 out of 25 nontrivial object transformation tasks. Their apparent high success rates in past experiments might be due to mixing non-trivial transformers with many trivial ones. In this paper, we leverage another key empirical finding that object transformers can be constructed by reassembling existing old/new version code to establish an algorithm for synthesizing object transformers in the DSU of Java applications. The algorithm exhaustively and heuristically enumerates all possible combinations of extracted code snippets, producing both testpassing and developer-readable object transformers. A key advantage over existing techniques [27, 28] is that an application developer can easily verify synthesized transformers’ correctness because application code is their major constructs. We implemented our algorithm as the PASTA (PATCH STATES) tool for DSU of Java programs. The evaluation results over a set of non-trivial class changes (including those in the empirical study and more) were encouraging: PASTA synthesized 7.5× correct non-trivial object transformers (60.0%) compared to the best existing techniques TOS [27] and AOTES [28] (0.0% and 8.0%, respectively). In summary, this paper’s major contributions are recognizing the non-trivial object transformer synthesis as a critical problem
in DSU and providing it with an effective approach.The rest of 1class SocketProcessor 2-private NioChannel socket null; the paper is organized as follows.Section II gives the necessary 3+private KeyAttachment ka null; background knowledge of DSU with a motivating example. 4 private SocketStatus status null; Section IlI presents a comprehensive study on DSU of 190 s public void run(){ 6+ NioChannel socket ka.getSocket(); class changes in Tomcat 8.Our DSL and synthesis algorithm 7 SelectionKey key socket.getIOChannel().keyFor( are elaborated on in Sections IV and V,respectively.The socket.getPoller().getSelector()); evaluation of PASTA against real-world updates is described 9- KeyAttachment ka null; 18- in Section VI,followed by threats to validity discussions in if (key !null) 11 ka =(KeyAttachment)key.attachment() Section VⅡ,related work in Section VⅢ,and conclusion in 12 Section IX. 13] 14 II.BACKGROUND AND MOTIVATION 1sclass DSUHelper 16 static void transform(SocketProcessor*obj,SocketProcessor stale){ A.DSU Systems and Object Transformation 17 /trivial default transformation for status obj.status stale.status; This paper focuses on the DSU of Java programs',which 18 19 /non-trivial transformation for ka consists of the following four steps: 28 obj.ka null; 1)Pause the program under update at a safe point [10,31], NioChannel socket stale.socket; if (socket !null){ e.g.,when all updated code is popped off the stack [15,17]. 23 SelectionKey key socket.getIoChannel().keyFor( 2)Upgrade the changed code [32,33]via dynamic link- 24 socket.getPoller().getSelector()); ing [34],live patching [10],or hotswap [26]. 25 if (key !null) 6 obj.ka (KeyAttachment)key.attachment(); 3)Transform stale (old-version)objects in the heap to their 27 new state [27,281. 28 4)Resume the updated program's execution.The new version 29] is now ready to serve. Fig.1:A class change in Tomcat-8(commit #f4451c)whose Object transformation(the third step)is this paper's primary object transformation is non-trivial.DSUHelper is our manually focus.When a program is paused at an update-safe point provided object transformer.At update time,the DSU system with code being upgraded,the heap may contain stale objects for each(stale)object sp (of type SocketProcessor)creates whose values are inconsistent with the new-version code.A its corresponding uninitialized new-version object sp*(of type DSU system must for each such object invoke its transformer SocketProcessor",the same class after update)and invokes to migrate to its corresponding new-version. the object transformer DSUHelper.transform(sp*,sp). B.Motivating Example Figure I lists a class change to SocketProcessor,which the latter is typically lacking for most application developers. requires a non-trivial transformation.This class change replaces Sometimes,a DSU system may automatically synthesize a the socket field by ka with a type change from NioChannel non-trivial object transformer,however,our empirical study to KeyAttachment (Lines 2-3).We correspondingly provide results in Section III show that existing techniques fall short on an object transformer DSUHelper.transform(Lines 16-28). most real-world cases.For this motivating example,TOS [27] The status field undergoes a default (or trivial)transfor- incorrectly falls back to the default null assignment because mation:it inherits its value from the old-version (Line 18).the non-trivial transformer is beyond TOS's search capability A default transformation copies the old-version value for a AOTES [28]also fails in synthesizing a method history for type-unchanged field or assigns a default value (e.g.,0 for int such complex objects. and null for references)to a newly-added field [15,17]. C.Discussions However,the ka field requires a non-trivial transformation If we leave ka with a default null reference,the program will Interestingly,the key non-trivial step in our manually quickly crash after DSU.Our transformer in Figure 1 leverages provided transformer,which retrieves the SelectionKey from the program's implicit invariant that there is a 1-to-1 mapping an NioChannel object in Lines 23-24,is identical to the code between NioChannel objects and KeyAttachment objects in in Lines 7-8.The null-check in the transformer (Lines 25-26) the heap.Lines 21-26 invoke a chain of I/O channel APIs to can also be found in the old-version code (Lines 10-11),which find stale.socket's corresponding KeyAttachment object. is removed in the new version because the local variable ka is Providing non-trivial object transformers is considerably available through the newly added field (Line 3). challenging even for experienced developers:it requires ex- This should not be considered completely incidental.If pertise in both the application logic and DSU system,where there is a code snippet for computing an object's property that reflects an internal invariant(potentially useful for object IDSU and object transformation for unmanaged heaps (e.g.,C/C++)are transformation like the code that finds the SelectionKey for considerably different and are out of this paper's scope.However,arguments a given NioChannel object),the code snippet might also be in this paper can also be applied to other managed runtime systems. 2An object transformer is considered trivial if it contains only default useful to other parts of the program and is likely to exist in transformations,otherwise is non-trivial. the source code
in DSU and providing it with an effective approach. The rest of the paper is organized as follows. Section II gives the necessary background knowledge of DSU with a motivating example. Section III presents a comprehensive study on DSU of 190 class changes in Tomcat 8. Our DSL and synthesis algorithm are elaborated on in Sections IV and V, respectively. The evaluation of PASTA against real-world updates is described in Section VI, followed by threats to validity discussions in Section VII, related work in Section VIII, and conclusion in Section IX. II. BACKGROUND AND MOTIVATION A. DSU Systems and Object Transformation This paper focuses on the DSU of Java programs1 , which consists of the following four steps: 1) Pause the program under update at a safe point [10, 31], e.g., when all updated code is popped off the stack [15, 17]. 2) Upgrade the changed code [32, 33] via dynamic linking [34], live patching [10], or hotswap [26]. 3) Transform stale (old-version) objects in the heap to their new state [27, 28]. 4) Resume the updated program’s execution. The new version is now ready to serve. Object transformation (the third step) is this paper’s primary focus. When a program is paused at an update-safe point with code being upgraded, the heap may contain stale objects whose values are inconsistent with the new-version code. A DSU system must for each such object invoke its transformer to migrate to its corresponding new-version. B. Motivating Example Figure 1 lists a class change to SocketProcessor, which requires a non-trivial transformation. This class change replaces the socket field by ka with a type change from NioChannel to KeyAttachment (Lines 2–3). We correspondingly provide an object transformer DSUHelper.transform (Lines 16–28). The status field undergoes a default (or trivial) transformation: it inherits its value from the old-version (Line 18). A default transformation copies the old-version value for a type-unchanged field or assigns a default value (e.g., 0 for int and null for references) to a newly-added field [15, 17]. However, the ka field requires a non-trivial transformation2 . If we leave ka with a default null reference, the program will quickly crash after DSU. Our transformer in Figure 1 leverages the program’s implicit invariant that there is a 1-to-1 mapping between NioChannel objects and KeyAttachment objects in the heap. Lines 21–26 invoke a chain of I/O channel APIs to find stale.socket’s corresponding KeyAttachment object. Providing non-trivial object transformers is considerably challenging even for experienced developers: it requires expertise in both the application logic and DSU system, where 1DSU and object transformation for unmanaged heaps (e.g., C/C++) are considerably different and are out of this paper’s scope. However, arguments in this paper can also be applied to other managed runtime systems. 2An object transformer is considered trivial if it contains only default transformations, otherwise is non-trivial. 1 class SocketProcessor { 2 - private NioChannel socket = null; 3 + private KeyAttachment ka = null; 4 private SocketStatus status = null; 5 public void run() { 6 + NioChannel socket = ka.getSocket(); 7 SelectionKey key = socket.getIOChannel().keyFor( 8 socket.getPoller().getSelector()); 9 - KeyAttachment ka = null; 10 - if (key != null) 11 - ka = (KeyAttachment)key.attachment(); 12 ... } ... 13 } 14 15 class DSUHelper { 16 static void transform(SocketProcessor? obj, SocketProcessor stale) { 17 // trivial default transformation for status 18 obj.status = stale.status; 19 // non-trivial transformation for ka 20 obj.ka = null; 21 NioChannel socket = stale.socket; 22 if (socket != null) { 23 SelectionKey key = socket.getIOChannel().keyFor( 24 socket.getPoller().getSelector()); 25 if (key != null) 26 obj.ka = (KeyAttachment)key.attachment(); 27 } 28 } 29 } Fig. 1: A class change in Tomcat-8 (commit #f4451c) whose object transformation is non-trivial. DSUHelper is our manually provided object transformer. At update time, the DSU system for each (stale) object sp (of type SocketProcessor) creates its corresponding uninitialized new-version object sp? (of type SocketProcessor? , the same class after update) and invokes the object transformer DSUHelper.transform(sp?, sp). the latter is typically lacking for most application developers. Sometimes, a DSU system may automatically synthesize a non-trivial object transformer, however, our empirical study results in Section III show that existing techniques fall short on most real-world cases. For this motivating example, TOS [27] incorrectly falls back to the default null assignment because the non-trivial transformer is beyond TOS’s search capability. AOTES [28] also fails in synthesizing a method history for such complex objects. C. Discussions Interestingly, the key non-trivial step in our manually provided transformer, which retrieves the SelectionKey from an NioChannel object in Lines 23–24, is identical to the code in Lines 7–8. The null-check in the transformer (Lines 25–26) can also be found in the old-version code (Lines 10–11), which is removed in the new version because the local variable ka is available through the newly added field (Line 3). This should not be considered completely incidental. If there is a code snippet for computing an object’s property that reflects an internal invariant (potentially useful for object transformation like the code that finds the SelectionKey for a given NioChannel object), the code snippet might also be useful to other parts of the program and is likely to exist in the source code
not dynamically updatable FINDING 1.Almost all changed classes (187/190,or 98.4%) 1.6% ▣non-trivial(need a transformer) are dynamically updatable using either trivial default or non- trivial provided object transformers. non-trivial (need update-time config) 700 69 trivial (object unchanged) In the three failing cases,two of them added new fields trivial (object changed) whose values are only available in an already popped stack frame.Another one is a fix for a resource leak in which whether an object is leaked cannot be effectively determined. Fig.2:Taxonomy of the 190 changed classes in Tomcat 8. Fortunately,refactoring the program to discard partial states at a component level [4,29,30]can make them updatable. This observation motivates us to explore the possibility of Furthermore,we found that simple default object transfor- mation suffices in most cases: automatically synthesizing object transformers by reassembling existing code snippets,including both old and new versions of a FINDING 2.Most class changes (166/190,or 87.4%)can be program.This observation is further validated in our empirical dynamically updated via trivial object transformers. study in Section III,and then implemented as a heuristic search 133 out of the 166 class changes (80.1%)involve only algorithm in Section V. code logic upgrades that do not affect the concerned objects' III.EMPIRICAL STUDY data representations,i.e.,field values are unchanged.A typical example is a security patch.The rest 33(19.9%)class changes In this short empirical study,we seek insights for under- can be automatically handled by a DSU system's default standing the challenges of object transformation in DSU over policy [15,17,18],e.g.,assigning a newly created field with a a set of randomly sampled real-world class changes. default value or garbage collecting a removed field's referred objects. A.Methodology Finally,class changes that require non-trivial object trans- We empirically studied the applicability of DSU in the formers are of particular research interest: evolution of Apache Tomcat 8 [23],one of the most popular Java Web servers.Tomcat 8 is still under active maintenance FINDING 3.Not many but non-negligible class changes(21/190, upgrades since its first release in 2013,making it a suitable or 11.1%)require non-trivial object transformers.These subject for studying DSU.We uniform-randomly sampled changes substantially hinder the application of DSU in practice. 100 commits from all 2,114 Tomcat 8 commits in its entire For these updates,the upgrade maintainer can manually maintenance history by the paper was written (from 8.0.0 to provide an object transformer to enable DSU over such non- the latest release 8.0.53).The sampled commits consist of in trivial class changes.However,this is not an easy task because total 190 class changes3. non-trivial object transformers usually exploit a program's For each changed class,we manually inspected the program implicit invariants or object state constraints (like the example state at a hypothetical update-safe point in which all changed in Figure 1).Automatic transformer synthesis [27,28]can methods of the class are popped off the stack.We determine be a promising and highly-preferred solution.Unfortunately. whether object transformation is possible(i.e.,whether DSU is our later experiments show that even the best state-of-the-art applicable)at that point and try to provide each of 2,957 fields technique produces correct transformations in <10%of these in the 190 changed classes a transformer.Given a class change non-trivial class changes. that can be dynamically updated,its object transformer is Therefore,the general unavailability of non-trivial object considered trivial if all of its field transformations are default transformers should be recognized as a key obstacle in making (explained in Section ID).Otherwise,the non-trivial object DSU practical.To address this challenge,we examined the transformer has at least one field that requires non-trivial characteristics of our manually crafted object transformers to transformation (like ka in Figure 1). find potentially useful guidance for automatic object trans- To validate our observation that non-trivial transformers can former synthesis.Figure 3 summarizes the basic constructs be constructed by reassembling existing code snippets,we in our manual transformers,which can be concluded by the preferred reusing old/new version code statements with minor following finding: revisions.We collect and analyze the statistics of those reused FINDING 4.Default transformations and existing code snippets statements in constructing transformers. are the major constructs of a non-trivial object transformer. B.Results and Findings The basic constructs of the 21 non-trivial object transformers The statistics in Figure 2 first indicate that DSU can be are:42 right-hand side expressions of assignments,15 if- broadly applicable in a program's maintenance lifetime: then-else branch conditions,and 2 for-each loop conditions. Understanding the characteristics of these basic constructs 3Commits that do not change the Tomcat-core source code (e.g..documen- is critical to the development of an automatic transformer tation or test case updates)are excluded from the study because they are irrelevant to DSU.190 are all class changes because changes to Tomcat 8 are 4Dynamically updating such a class with a default transformer will result mainly maintenance upgrades. in a crash or broken application logic
1.6% 9.5% 1.6% 70.0% 17.4% not dynamically updatable non-trivial (need a transformer) non-trivial (need update-time config) trivial (object unchanged) trivial (object changed) Fig. 2: Taxonomy of the 190 changed classes in Tomcat 8. This observation motivates us to explore the possibility of automatically synthesizing object transformers by reassembling existing code snippets, including both old and new versions of a program. This observation is further validated in our empirical study in Section III, and then implemented as a heuristic search algorithm in Section V. III. EMPIRICAL STUDY In this short empirical study, we seek insights for understanding the challenges of object transformation in DSU over a set of randomly sampled real-world class changes. A. Methodology We empirically studied the applicability of DSU in the evolution of Apache Tomcat 8 [23], one of the most popular Java Web servers. Tomcat 8 is still under active maintenance upgrades since its first release in 2013, making it a suitable subject for studying DSU. We uniform-randomly sampled 100 commits from all 2,114 Tomcat 8 commits in its entire maintenance history by the paper was written (from 8.0.0 to the latest release 8.0.53). The sampled commits consist of in total 190 class changes3 . For each changed class, we manually inspected the program state at a hypothetical update-safe point in which all changed methods of the class are popped off the stack. We determine whether object transformation is possible (i.e., whether DSU is applicable) at that point and try to provide each of 2,957 fields in the 190 changed classes a transformer. Given a class change that can be dynamically updated, its object transformer is considered trivial if all of its field transformations are default (explained in Section II). Otherwise, the non-trivial object transformer has at least one field that requires non-trivial transformation (like ka in Figure 1). To validate our observation that non-trivial transformers can be constructed by reassembling existing code snippets, we preferred reusing old/new version code statements with minor revisions. We collect and analyze the statistics of those reused statements in constructing transformers. B. Results and Findings The statistics in Figure 2 first indicate that DSU can be broadly applicable in a program’s maintenance lifetime: 3Commits that do not change the Tomcat-core source code (e.g., documentation or test case updates) are excluded from the study because they are irrelevant to DSU. 190 are all class changes because changes to Tomcat 8 are mainly maintenance upgrades. FINDING 1. Almost all changed classes (187/190, or 98.4%) are dynamically updatable using either trivial default or nontrivial provided object transformers. In the three failing cases, two of them added new fields whose values are only available in an already popped stack frame. Another one is a fix for a resource leak in which whether an object is leaked cannot be effectively determined. Fortunately, refactoring the program to discard partial states at a component level [4, 29, 30] can make them updatable. Furthermore, we found that simple default object transformation suffices in most cases: FINDING 2. Most class changes (166/190, or 87.4%) can be dynamically updated via trivial object transformers. 133 out of the 166 class changes (80.1%) involve only code logic upgrades that do not affect the concerned objects’ data representations, i.e., field values are unchanged. A typical example is a security patch. The rest 33 (19.9%) class changes can be automatically handled by a DSU system’s default policy [15, 17, 18], e.g., assigning a newly created field with a default value or garbage collecting a removed field’s referred objects. Finally, class changes that require non-trivial object transformers are of particular research interest: FINDING 3. Not many but non-negligible class changes (21/190, or 11.1%) require non-trivial object transformers4 . These changes substantially hinder the application of DSU in practice. For these updates, the upgrade maintainer can manually provide an object transformer to enable DSU over such nontrivial class changes. However, this is not an easy task because non-trivial object transformers usually exploit a program’s implicit invariants or object state constraints (like the example in Figure 1). Automatic transformer synthesis [27, 28] can be a promising and highly-preferred solution. Unfortunately, our later experiments show that even the best state-of-the-art technique produces correct transformations in <10% of these non-trivial class changes. Therefore, the general unavailability of non-trivial object transformers should be recognized as a key obstacle in making DSU practical. To address this challenge, we examined the characteristics of our manually crafted object transformers to find potentially useful guidance for automatic object transformer synthesis. Figure 3 summarizes the basic constructs in our manual transformers, which can be concluded by the following finding: FINDING 4. Default transformations and existing code snippets are the major constructs of a non-trivial object transformer. The basic constructs of the 21 non-trivial object transformers are: 42 right-hand side expressions of assignments, 15 ifthen-else branch conditions, and 2 for-each loop conditions. Understanding the characteristics of these basic constructs is critical to the development of an automatic transformer 4Dynamically updating such a class with a default transformer will result in a crash or broken application logic
Statements in a transformer SelectionKey key socket.getIOChannel().keyFor( socket.getPoller().getSelector()); Assignment(42) Conditidnal (15) Loop (2) extraction V=E if(E){…3 for (x:E) g1=(1.getIOChannel().keyFor(2.getPoller().getSelector())) else {... {..3 E(42) E(15) E(2) if (name.startswith("selectorPool.")) extraction Source 8 Source g2 =(1.startswith(2)) Null-check 93 =("selectorPool.") 7 94 =(01.startswith("selectorPool.")) 12 Fig.3:Statistics of the basic constructs in the studied non-trivial while (paused &!running) object transformers. extraction 95=(o1&D2) Transformer Fig.5:Examples of extracted gadgets. Statement v e; obj.f=e; if (c){s*else s*) while (c){s*} A.The Language Design Condition = ee =null!c Following the empirical findings that default transformations Expression vIg(v) and existing code snippets are the major constructs of a non- Gadget 9= extracted gadgets trivial object transformer,we made the following choices in Variable 0 = stale v v2... the DSL design: 1)Providing a mechanism for code reuse.Particularly,we Fig.4:Syntax of basic constructs in object transformers.f is provide a DSL construct named gadget,as denoted by a field subject to transformation;stale and obj are the stale g(),a textural expression extracted from source code object and its corresponding new-version object;a*denotes with all variable references being replaced by a place- zero or more repeats of a. holder.We use angled brackets to enclose a gadget,e.g., g(o1,☐2,☐3)=(O1.foo(o2,☐3)).Applying a gadget to an object transformer would reuse the entire expression synthesis mechanism.As shown in Figure 3,the vast majority with the flexibility for placeholders to be filled with (54/59,or 91.5%)of the basic constructs are either: transformer-specific values. 1)a trivial default behavior (13/59,or 22.0%), 2)Providing no arithmetic,logical,or bitwise operator.We 2)a single member method call (12/59,or 20.3%), argue that when such operators (+,&&1,...)should 3)a simple null-check (7/59,or 11.9%),or appear in an object transformer,they will also likely to 4)a minor revision of an existing source code snippet (22/59, exist in old/new version source code and can be extracted or 37.3%)like Lines 23-24 in Figure 1. as gadgets.Therefore,we do not have to include them in Such a result motivated us to synthesize object transformers the DSL.yielding a minimal,concise DSL by assembling source code gadgets (extracted expressions from 3)Providing limited expressiveness for branch/loop condi- old/version source code)upon a domain-specific language tions.Branch/loop conditions in object transformers are designated for the object transformation in DSU. also likely in the existing source code.Therefore,nega- For the remaining a few (5/59,or 8.5%)basic constructs, tions,nested branches,and while-loops provide sufficient three of them are boolean configuration-related constants expressiveness for constructing object transformers. whose values are determined by an update-time configuration. Figure 4 lists the syntax of our DSL.An object transformer The other two expressions need a reference that is not reachable t is a sequence of statements s*,in which each field of the from stale objects.Since this paper focuses on the automatic new-version object obj is assigned with a value.For each synthesis of object transformers,we leave these relatively rare statement s,it can define a new variable v:by applying a cases to future work. gadget g and filling its placeholders with existing variables(a previously defined vi or stale),assign a field to be transformed IV.DOMAIN-SPECIFIC LANGUAGE FOR OBJECT (obj.f)with a value,or use if branches or while loops with TRANSFORMATION a condition c. This section explains our design goals and choices in Readers may notice that the statements in a transformer t our domain-specific language (DSL)for describing object describe a skeleton,which specifies the targeted transformer's transformers.The DSL design and gadget extraction are 5We allow a void-typed expression to be assigned to a variable,i.e..g can described in Sections IV-A and IV-B,respectively. be a void method invocation
Default Member 13 12 Source 12 Others 5 Source 8 Null-check 7 Source 2 v = E if (E) {...} else {...} for (x: E) {...} Statements in a transformer E (42) E (15) E (2) Assignment (42) Conditional (15) Loop (2) Fig. 3: Statistics of the basic constructs in the studied non-trivial object transformers. Transformer t ::= s ∗ Statement s ::= v = e; | obj.f = e; | if (c) { s ∗ } else { s ∗ } | while (c) { s ∗ } Condition c ::= e | e == null | !c Expression e ::= v | g(v ∗ ) Gadget g ::= extracted gadgets Variable v ::= stale | v1 | v2 | . . . Fig. 4: Syntax of basic constructs in object transformers. f is a field subject to transformation; stale and obj are the stale object and its corresponding new-version object; a ∗ denotes zero or more repeats of a. synthesis mechanism. As shown in Figure 3, the vast majority (54/59, or 91.5%) of the basic constructs are either: 1) a trivial default behavior (13/59, or 22.0%), 2) a single member method call (12/59, or 20.3%), 3) a simple null-check (7/59, or 11.9%), or 4) a minor revision of an existing source code snippet (22/59, or 37.3%) like Lines 23–24 in Figure 1. Such a result motivated us to synthesize object transformers by assembling source code gadgets (extracted expressions from old/version source code) upon a domain-specific language designated for the object transformation in DSU. For the remaining a few (5/59, or 8.5%) basic constructs, three of them are boolean configuration-related constants whose values are determined by an update-time configuration. The other two expressions need a reference that is not reachable from stale objects. Since this paper focuses on the automatic synthesis of object transformers, we leave these relatively rare cases to future work. IV. DOMAIN-SPECIFIC LANGUAGE FOR OBJECT TRANSFORMATION This section explains our design goals and choices in our domain-specific language (DSL) for describing object transformers. The DSL design and gadget extraction are described in Sections IV-A and IV-B, respectively. SelectionKey key = socket.getIOChannel().keyFor( socket.getPoller().getSelector()); g1 = h1.getIOChannel().keyFor(2.getPoller().getSelector())i extraction if (name.startsWith("selectorPool.")) g2 = h1.startsWith(2)i g3 = h"selectorPool."i g4 = h1.startsWith("selectorPool.")i extraction while (paused && !running) g5 = h1 && !2i extraction Fig. 5: Examples of extracted gadgets. A. The Language Design Following the empirical findings that default transformations and existing code snippets are the major constructs of a nontrivial object transformer, we made the following choices in the DSL design: 1) Providing a mechanism for code reuse. Particularly, we provide a DSL construct named gadget, as denoted by g(~), a textural expression extracted from source code with all variable references being replaced by a placeholder. We use angled brackets to enclose a gadget, e.g., g(1, 2, 3) = h1.foo(2,3)i. Applying a gadget to an object transformer would reuse the entire expression with the flexibility for placeholders to be filled with transformer-specific values. 2) Providing no arithmetic, logical, or bitwise operator. We argue that when such operators (+, &&, |, . . .) should appear in an object transformer, they will also likely to exist in old/new version source code and can be extracted as gadgets. Therefore, we do not have to include them in the DSL, yielding a minimal, concise DSL. 3) Providing limited expressiveness for branch/loop conditions. Branch/loop conditions in object transformers are also likely in the existing source code. Therefore, negations, nested branches, and while-loops provide sufficient expressiveness for constructing object transformers. Figure 4 lists the syntax of our DSL. An object transformer t is a sequence of statements s ∗ , in which each field of the new-version object obj is assigned with a value. For each statement s, it can define a new variable vi by applying a gadget g 5 and filling its placeholders with existing variables (a previously defined vi or stale), assign a field to be transformed (obj.f) with a value, or use if branches or while loops with a condition c. Readers may notice that the statements in a transformer t describe a skeleton, which specifies the targeted transformer’s 5We allow a void-typed expression to be assigned to a variable, i.e., g can be a void method invocation.
control flow (branches and loops)and data flow (variables and their dependencies).All concrete transformation operations g1(1)=(01.socket) are performed by gadgets extracted from the source code 92(1,2)=(1.getIoChannel().keyFor( Such a separation of concerns not only maximally reuses 02.getPoller().getSelector())) existing source code,but also gives considerable flexibility to 93(1)=((KeyAttachment)01.attachment()) implement diverse transformers.This design also facilitates our 1class DSUHelper 2 static void transform(SocketProcessor*obj,SocketProcessor stale){ later heuristic synthesis algorithm to prioritize likely relevant v1 =(NioChannel)g1(stale); object transformers by measuring both the structural complexity 4 if (v1 !null){ of a synthesized transformer and its "naturalness"in gadget v2 =(SelectionKey)g2(v1,v1); 6 if (v2 !null){ use. v3 =(KeyAttachment)g3(v2); One final note is that we restrict branch/loop conditions to be either of e,!e,e ==null,or e !null,where expression 9 e is either a variable or a gadget application.Theoretically. 1 obj.ka va; 11] any condition can be expressed by negation()and nested 12] branch (A),but both our DSL and synthesis algorithm favor simple branch/loop conditions that reuse existing source code Fig.6:A field transformer for field ka in the motivating snippets. example written in our DSL,and extracted gadgets.All used B.Gadget Extraction variables were initialized with default values,e.g.,v3=null. In gadget extraction,trade-offs must be made to balance the DSL's expressiveness and its synthesis difficulty.In an imprac- Algorithm 1:The transformer synthesis framework tical extreme,one can include all Java language constructs as 1 Function SYNTHESIS(G) gadgets.This allows our DSL to be essentially equivalent to the 2 Q←-{>}: while Q≠ado vanilla Java.However,synthesizing Java programs directly for p← arg min cost(p'); object transformation is considerably difficult and not practical. P'EQ The key trade-off we made is to only extract complete source- iflb走p then code statements as gadgets.We argue that no matter how many yield p: times method invocations,arithmetic/logical operations,etc. Q←QUDc(p)\{p} are performed in a statement,they should either be all used or entirely not used in an object transformer.The intuition behind this treatment is simple:each statement should contain a logically inseparable action in a well-maintained project for class method gadgets (methodName()),static field best readability and maintainability. gadgets (ClassName.fieldName),static method gadgets Gadgets are extracted by iterating over all statements in all (ClassName.methodName()),and object creation gad- application classes in both the old and new version source code gets (new className()).These rules are also addition- using the following rules: ally applied for the Java Standard Library for extracting 1)For a statement's associated expression (i.e..the right- potentially useful API calls,e.g.,container operations. hand side of an assignment or an if/while condition), Figure 6 gives a transformer example for field ka in our we parse it into an abstract syntax tree (AST)and replace motivating example (Figure 1).Three used gadgets g,92. each variable or constant node with a placeholder to and g3 are extracted using rules #4,#1,and #1,respectively. be a gadget.This rule yields g,g2.and g5 in Figure 5. This transformer is equivalent to the manually provided one in 2)For each statement,we consider its contained constants Figure 1. potentially useful in synthesizing object transformers. One could expect that our DSL and gadget extraction rules Therefore,each constant literal in the statement is also suffice for object transformer synthesis.Unfortunately,there extracted as a gadget.This rule yields g3 in Figure 5. can be millions of gadgets extracted from a large codebase. 3)For each statement,we also extract it into a gadget where Certainly,not all gadget combinations are equally relevant to a only variable nodes are replaced by placeholders,i.e.,given upgrade.The relevance of a gadget to the class change keeping all constants as-is in the gadget compared with and the similarity between a gadget combination and existing the first rule.This is because constants may be inseparable source code would serve as the guidance for efficient object from the statement's computational logic.This rule yields transformer synthesis,which is described as follows. 94 in Figure 5. 4)For each class in both old and new versions of the V.AUTOMATIC SYNTHESIS OF OBJECT TRANSFORMERS source code,we extract class field gadgets (fieldName), Conceptually,object transformer synthesis is simple:a 6There can be occasions that a statement consists of multiple actions.e.ga systematic enumeration of all syntactically correct programs chain of method invocations.We optimistically believe that the desired action will eventually find a correct transformer(Section V-A).This will independently appear elsewhere in the codebase. section presents our heuristic search algorithm for efficiently pri-
control flow (branches and loops) and data flow (variables and their dependencies). All concrete transformation operations are performed by gadgets extracted from the source code. Such a separation of concerns not only maximally reuses existing source code, but also gives considerable flexibility to implement diverse transformers. This design also facilitates our later heuristic synthesis algorithm to prioritize likely relevant object transformers by measuring both the structural complexity of a synthesized transformer and its “naturalness” in gadget use. One final note is that we restrict branch/loop conditions to be either of e, !e, e == null, or e != null, where expression e is either a variable or a gadget application. Theoretically, any condition can be expressed by negation (¬) and nested branch (∧), but both our DSL and synthesis algorithm favor simple branch/loop conditions that reuse existing source code snippets. B. Gadget Extraction In gadget extraction, trade-offs must be made to balance the DSL’s expressiveness and its synthesis difficulty. In an impractical extreme, one can include all Java language constructs as gadgets. This allows our DSL to be essentially equivalent to the vanilla Java. However, synthesizing Java programs directly for object transformation is considerably difficult and not practical. The key trade-off we made is to only extract complete sourcecode statements as gadgets. We argue that no matter how many times method invocations, arithmetic/logical operations, etc. are performed in a statement, they should either be all used or entirely not used in an object transformer. The intuition behind this treatment is simple: each statement should contain a logically inseparable action in a well-maintained project for best readability and maintainability6 . Gadgets are extracted by iterating over all statements in all application classes in both the old and new version source code using the following rules: 1) For a statement’s associated expression (i.e., the righthand side of an assignment or an if/while condition), we parse it into an abstract syntax tree (AST) and replace each variable or constant node with a placeholder i to be a gadget. This rule yields g1, g2, and g5 in Figure 5. 2) For each statement, we consider its contained constants potentially useful in synthesizing object transformers. Therefore, each constant literal in the statement is also extracted as a gadget. This rule yields g3 in Figure 5. 3) For each statement, we also extract it into a gadget where only variable nodes are replaced by placeholders, i.e., keeping all constants as-is in the gadget compared with the first rule. This is because constants may be inseparable from the statement’s computational logic. This rule yields g4 in Figure 5. 4) For each class in both old and new versions of the source code, we extract class field gadgets h.fieldNamei, 6There can be occasions that a statement consists of multiple actions, e.g., a chain of method invocations. We optimistically believe that the desired action will independently appear elsewhere in the codebase. g1(1) = h1.socketi g2(1, 2) = h1.getIOChannel().keyFor( 2.getPoller().getSelector())i g3(1) = h(KeyAttachment) 1.attachment()i 1 class DSUHelper { 2 static void transform(SocketProcessor? obj, SocketProcessor stale) { 3 v1 = (NioChannel) g1(stale); 4 if (v1 != null) { 5 v2 = (SelectionKey) g2(v1, v1); 6 if (v2 != null) { 7 v3 = (KeyAttachment) g3(v2); 8 } 9 } 10 obj.ka = v3; 11 } 12 } Fig. 6: A field transformer for field ka in the motivating example written in our DSL, and extracted gadgets. All used variables were initialized with default values, e.g., v3 = null. Algorithm 1: The transformer synthesis framework 1 Function SYNTHESIS(G) 2 Q ← {|.}; 3 while Q 6= ∅ do 4 p ← arg min p0∈Q cost(p 0 ); 5 if |. /∈ p then 6 yield p; 7 Q ← Q ∪ DG(p) \ {p} class method gadgets h.methodName(~)i, static field gadgets hClassName.fieldNamei, static method gadgets hClassName.methodName(~)i, and object creation gadgets hnew ClassName(~)i. These rules are also additionally applied for the Java Standard Library for extracting potentially useful API calls, e.g., container operations. Figure 6 gives a transformer example for field ka in our motivating example (Figure 1). Three used gadgets g1, g2, and g3 are extracted using rules #4, #1, and #1, respectively. This transformer is equivalent to the manually provided one in Figure 1. One could expect that our DSL and gadget extraction rules suffice for object transformer synthesis. Unfortunately, there can be millions of gadgets extracted from a large codebase. Certainly, not all gadget combinations are equally relevant to a given upgrade. The relevance of a gadget to the class change and the similarity between a gadget combination and existing source code would serve as the guidance for efficient object transformer synthesis, which is described as follows. V. AUTOMATIC SYNTHESIS OF OBJECT TRANSFORMERS Conceptually, object transformer synthesis is simple: a systematic enumeration of all syntactically correct programs will eventually find a correct transformer (Section V-A). This section presents our heuristic search algorithm for efficiently pri-
oritizing correct and developer-readable (simple)transformers enumerated (over vEV(op)\{stale}).Then,the descendants to make the search procedure practical(Sections V-B to V-D).of p can be defined as A.Synthesis Framework DG(p)={pe}U {pstmt stmt ES). Given a set of gadgets G,our object transformer synthesis Though conceptually simple,it is a challenge to scale the algorithm listed in Algorithm 1 is essentially a straightforward search.The key treatments we made to boost the search include syntax-directed search.It maintains a work list O consisting of decomposing an object transformer into independent field candidate synthesized programs.A program p EQ is an object transformers(Section V-B),pruning likely irrelevant gadgets transformation DSL program(syntax defined in Figure 4)with (Section V-C),and boosting the search by measuring the zero or more insertion marksD in which more statements can “naturalness'”ofp(Section V-D). be filled7.Starting from the initial program that consists of a single insertion mark in O (Line 2),the algorithm iteratively B.Object Transformer as Independent Field Transformers pops the program p of the minimum cost in Q for a step of An object transformer is responsible for transforming all expansion(Lines 3-4).If p contains no insertion mark,we find fields in an object,and an object transformer for large classes a potentially useful transformer for further validation (Lines may be lengthy and difficult to synthesize.Fortunately,the 5-6).Otherwise,there must be an insertion mark in p and program is frozen at update-time and each field's value should the algorithm expands the first insertion mark to obtain more be computed by a pure function over the update-time heap candidate programs (Line 7). snapshot. The expansion step defines DG(p),the descendant programs Therefore,the transformation for fields in an object can be of p over a set of gadgets G.Let op be the first occurrence independently conducted and the object transformation problem of insertion mark in p.A step of expansion either closes (i.e., is essentially equivalent to the field transformation ones.In removes)the insertion mark op to obtain practice,a developer or upgrade maintainer simply specifies which fields may require a non-default transformer,and the Pe=pop→, synthesis algorithm will produce a series of candidate field or prepends it with a statement stmt to obtain transformers for further validation (e.g.,independently tested) This is a standard treatment of existing techniques [15,27,28]. Pstmt =plop stmt ] C.Pruning Irrelevant Gadgets In the latter case,let V(op)={stale,v1,v2,...}be all There can be millions of gadgets for a large program variables within the lexical scope of the insertion mark op resulting in a huge Ec(op).Fortunately,we observed that in p.For a gadget gG of n placeholders,the set of its all most of the gadgets are irrelevant to the field to be transformed possible applications at the insertion mark is defined by in terms of the upgrade.Particularly,class B is relevant to A Eg(op)={g(u,2,,vn)|∈V(on)for1≤i≤n}. if either:B is a (sub)class of A,B contains a field of type A,or a method in B refers to A (e.g.in the parameter list All syntactically valid expressions at o are thus or a local variable,etc.).To synthesize a field transformer for field f in class A (e.g.,A=SocketProcessor and f ka in Figure 1).we restrict the concerned gadget set G to be: EG(Op)=V(op 1)all gadgets in A, 2)all gadgets in any class relevant to A,and To prepend a statement stmt at o,it must be either of the 3)all gadgets in any class relevant to f.class. following four patterns according to the syntax in Figure 4: Finally,almost all classes are relevant to primitive types (e.g., 1)(variable assignment)v=e; int,bool,etc.)and java.lang.String.We do not apply the 2)(field transformation)obj.f =e; third rule in synthesizing field transformer for these types,as 3)(if-branch)if (c){s*else s* otherwise,the search would be intractable. 4)(while-loop)while (c){s* D.Boosting the Search Also recall that we limit the form of a condition c to be either A naive implementation of Algorithm 1 will be overwhelmed of fe,!e,e ==null,e !null).Therefore,stmt must by the huge search space.For example,Eg(op)grows consist of exactly one expression.Enumerating the expressions exponentially over increased V and In our search eE Ec(op)and filling e into the above code patterns yields implementation,we only fill a placeholder i with a type- S,the set of all possible statements to prepend to op.For compatible variable in the scope that will not trivially throw an assignment v e.the left-hand side variable v is also an exception (e.g.,invoking a null reference's method).This 7Given a program and an insertion mark in it,one can prepend a statement treatment yields a manageable E(p)in practice. before the insertion mark to obtain a new program.is equivalent to s*in The key to the success of our search algorithm is a cost the syntax. function cost to prioritize:(1)simple programs with few basic
oritizing correct and developer-readable (simple) transformers to make the search procedure practical (Sections V-B to V-D). A. Synthesis Framework Given a set of gadgets G, our object transformer synthesis algorithm listed in Algorithm 1 is essentially a straightforward syntax-directed search. It maintains a work list Q consisting of candidate synthesized programs. A program p ∈ Q is an object transformation DSL program (syntax defined in Figure 4) with zero or more insertion marks |. in which more statements can be filled7 . Starting from the initial program that consists of a single insertion mark |. in Q (Line 2), the algorithm iteratively pops the program p of the minimum cost in Q for a step of expansion (Lines 3–4). If p contains no insertion mark, we find a potentially useful transformer for further validation (Lines 5–6). Otherwise, there must be an insertion mark |. in p and the algorithm expands the first insertion mark to obtain more candidate programs (Line 7). The expansion step defines DG(p), the descendant programs of p over a set of gadgets G. Let σp be the first occurrence of insertion mark in p. A step of expansion either closes (i.e., removes) the insertion mark σp to obtain p = p[σp 7→ ], or prepends it with a statement stmt to obtain pstmt = p[σp 7→ stmt |.]. In the latter case, let V (σp) = {stale, v1, v2, . . .} be all variables within the lexical scope of the insertion mark σp in p. For a gadget g ∈ G of n placeholders, the set of its all possible applications at the insertion mark is defined by Eg(σp) = {g(v1, v2, . . . , vn) | vi ∈ V (σp) for 1 ≤ i ≤ n}. All syntactically valid expressions at σp are thus EG(σp) = V (σp) ∪ [ g∈G Eg(σp) . To prepend a statement stmt at σp, it must be either of the following four patterns according to the syntax in Figure 4: 1) (variable assignment) v = e; 2) (field transformation) obj.f = e; 3) (if-branch) if (c) { s ∗ } else { s ∗ } 4) (while-loop) while (c) { s ∗ } Also recall that we limit the form of a condition c to be either of {e, !e, e == null, e != null}. Therefore, stmt must consist of exactly one expression. Enumerating the expressions e ∈ EG(σp) and filling e into the above code patterns yields S, the set of all possible statements to prepend to σp. For an assignment v = e, the left-hand side variable v is also 7Given a program and an insertion mark in it, one can prepend a statement before the insertion mark to obtain a new program. |. is equivalent to s ∗ in the syntax. enumerated (over v ∈ V (σp)\{stale}). Then, the descendants of p can be defined as DG(p) = {p} ∪ {pstmt | stmt ∈ S}. Though conceptually simple, it is a challenge to scale the search. The key treatments we made to boost the search include decomposing an object transformer into independent field transformers (Section V-B), pruning likely irrelevant gadgets (Section V-C), and boosting the search by measuring the “naturalness” of p (Section V-D). B. Object Transformer as Independent Field Transformers An object transformer is responsible for transforming all fields in an object, and an object transformer for large classes may be lengthy and difficult to synthesize. Fortunately, the program is frozen at update-time and each field’s value should be computed by a pure function over the update-time heap snapshot. Therefore, the transformation for fields in an object can be independently conducted and the object transformation problem is essentially equivalent to the field transformation ones. In practice, a developer or upgrade maintainer simply specifies which fields may require a non-default transformer, and the synthesis algorithm will produce a series of candidate field transformers for further validation (e.g., independently tested). This is a standard treatment of existing techniques [15, 27, 28]. C. Pruning Irrelevant Gadgets There can be millions of gadgets for a large program, resulting in a huge |EG(σp)|. Fortunately, we observed that most of the gadgets are irrelevant to the field to be transformed in terms of the upgrade. Particularly, class B is relevant to A if either: B is a (sub)class of A, B contains a field of type A, or a method in B refers to A (e.g. in the parameter list or a local variable, etc.). To synthesize a field transformer for field f in class A (e.g., A = SocketProcessor and f = ka in Figure 1), we restrict the concerned gadget set G to be: 1) all gadgets in A, 2) all gadgets in any class relevant to A, and 3) all gadgets in any class relevant to f.class. Finally, almost all classes are relevant to primitive types (e.g., int, bool, etc.) and java.lang.String. We do not apply the third rule in synthesizing field transformer for these types, as otherwise, the search would be intractable. D. Boosting the Search A naive implementation of Algorithm 1 will be overwhelmed by the huge search space. For example, |Eg(σp)| grows exponentially over increased |V | and |~|. In our search implementation, we only fill a placeholder i with a typecompatible variable in the scope that will not trivially throw an exception (e.g., invoking a null reference’s method). This treatment yields a manageable |Eg(σp)| in practice. The key to the success of our search algorithm is a cost function cost to prioritize: (1) simple programs with few basic
constructs,and (2)natural programs that maximally reuse VI.EVALUATION existing method-local data flows in the source code. We implemented the synthesis algorithm as the PASTA tool, To measure the naturalness,we perform an intra-procedural which consists of ~15,000 lines of Java codes.The source code forward slicing [35]for each extracted gadget g to obtain parsing and extracting gadgets was implemented over Java- Parser [36]and Javassist [37].In this section,we experimentally slice(g)=[g1,92,·.J, compare PASTA with two state-of-the-art techniques using real- the sequence of gadgets(appearing in their statement order) world updates from Apache Tomcat 8 and Apache FtpServer. in which each gadget g'e slice(g)data-depends on g. We elaborate on the experimental setup in Section VI-A and Given a program p and its used gadgets Gp=[g1,92,..., present the evaluation results in Section VI-B. gn],we argue that the programs that better reuse consecutive A.Experimental Setup gadgets in a slice (i.e.,existing data flow)are more likely to be natural.Formally,for each gadget gi Gp,we find its maximal The three transformer generation techniques under evaluation containing slice Sp(gi).the slice(g;)of maximum length are:our PASTA,program-synthesis-based TOS [27],and trace- satisfying that for giGp and slice(gi)slice(gi).The set synthesis-based AOTES [28].We evaluate these techniques of all maximal containing slices is thus 5p). against a set of class changes that require non-trivial object We measure the naturalness of p by calculating the average transformation from widely-used server applications undergoing data-flow similarity for all maximal containing slices: long-term development and maintenance.As discussed in Section V-B,each field in an object can be independently 5(p.G)=Spl ses, 1 LCS(Gp S,Si:) transformed.Therefore,the evaluation subjects (first two max s1sisjsislmax [Gp S],Sijl}' columns in Table I)consist of: 1)The 22 Apache Tomcat 8 field updates studied in the in which LCS is the longest common subsequence,and Gp [S empirical study (Section II); is the subsequence of Gp obtained by removing any element 2)Another 4 Apache FtpServer field updates selected fol- not in S.The intuition behind this formula is that a program p lowing the same collection methodology of our empirical of a high 6 should look like a "blending"of gadgets in short study. slice sequences. For Apache FtpServer,we uniform-randomly sampled 30 Finally,we add the measurement of simplicity to the cost commits (consisting of 75 class changes)from all 244 commits function cost.Recall that our synthesis algorithm(Algorithm 1) with class changes in the entire maintenance history of Apache either closes an insertion mark,or expands it by prepending a FtpServer (from 1.0.0 to the latest 1.0.6).In the 75 class statement in an iteration: changes,71 can be dynamically updated via a trivial object transformer.The rest four non-trivial cases(4 field updates) pe=plop→(or Pstmt=plop→stmt]: were all used as our experimental subjects. Thus we define the cost function to be cost(pe)=cost(p)and To validate a transformer,we for each changed class provide sufficiently strong DSU test cases that can cover all locations writing to the changed field(by revising existing test cases or cost(Pstmt ·mnk(p' manually providing one whenever necessary).For each test 6(Pstmt)+入stmt case,we also specify an active-safe [10,15,17,31]dynamic update-safe point,at which all changed methods in the changed where H is all partial programs in the process of generating p class are popped off stack.Each test case also checks the object (by applying the two rewriting rules of the insertion mark),p' state consistency after the update point.These checkers take the denotes the number of statements (assignment,branch,or loop) heap snapshots before and after object transformation as inputs in p',and rank(p')is the rank of p'in terms of naturalness (6) and determine whether the object transformation is successful. within all generated siblings in the expansion step in obtaining We use the same test cases for all the evaluated techniques. p'.Using rank in the cost function reflects the intuition that The evaluation settings are: one should not only favor short (and thus simple)synthesized 1)For PASTA,the search depth is set to 10(sufficiently programs,but also favor those programs whose generation large to handle all studied transformers).We automatically processes are mostly natural. validated each synthesized transformer by running the The cost function also contains a mechanism in the DSU test cases and manually checked the test-passing ones formula)for giving likely more relevant statements with extra until a semantically correct field transformer is produced. credits.Our implementation adopts a simple rule that lets 2)For TOS,we followed the evaluation steps in its paper [27]. Astmt =0.2 if stmt involves any changed field (e.g.,Lines TOS requires heap snapshots for field transformer synthe- 6-8 in Figure 1)or the corresponding gadget is from the sis.Thus we used our test cases for field transformation changed code (e.g.,Line 11 in Figure 1).Future use of this correctness validation to produce these snapshots.The mechanism can be assigning human-provided gadgets with a higher priority. 8Available at https://zelinzhao.github.io/pasta
constructs, and (2) natural programs that maximally reuse existing method-local data flows in the source code. To measure the naturalness, we perform an intra-procedural forward slicing [35] for each extracted gadget g to obtain slice(g) = [g1, g2, . . .], the sequence of gadgets (appearing in their statement order) in which each gadget g 0 ∈ slice(g) data-depends on g. Given a program p and its used gadgets Gp = [g1, g2, . . . , gn], we argue that the programs that better reuse consecutive gadgets in a slice (i.e., existing data flow) are more likely to be natural. Formally, for each gadget gi ∈ Gp, we find its maximal containing slice Sp(gi), the slice(gj ) of maximum length satisfying that for gj ∈ Gp and slice(gi) ⊆ slice(gj ). The set of all maximal containing slices is thus Sp = S 1≤i≤n Sp(gi). We measure the naturalness of p by calculating the average data-flow similarity for all maximal containing slices: δ(p, G) = 1 |Sp| X S∈Sp max 1≤i≤j≤|S| LCS(Gp S, Si:j ) max {|Gp S|, |Si:j |}, in which LCS is the longest common subsequence, and Gp S is the subsequence of Gp obtained by removing any element not in S. The intuition behind this formula is that a program p of a high δ should look like a “blending” of gadgets in short slice sequences. Finally, we add the measurement of simplicity to the cost function cost. Recall that our synthesis algorithm (Algorithm 1) either closes an insertion mark, or expands it by prepending a statement in an iteration: p = p[σp 7→ ] or pstmt = p[σp 7→ stmt |.]. Thus we define the cost function to be cost(p) = cost(p) and cost(pstmt) = X p0∈H |p 0 | · rank(p 0 ) · 1 δ(pstmt) + λstmt , where H is all partial programs in the process of generating p (by applying the two rewriting rules of the insertion mark), |p 0 | denotes the number of statements (assignment, branch, or loop) in p 0 , and rank(p 0 ) is the rank of p 0 in terms of naturalness (δ) within all generated siblings in the expansion step in obtaining p 0 . Using rank in the cost function reflects the intuition that one should not only favor short (and thus simple) synthesized programs, but also favor those programs whose generation processes are mostly natural. The cost function also contains a mechanism (λ in the formula) for giving likely more relevant statements with extra credits. Our implementation adopts a simple rule that lets λstmt = 0.2 if stmt involves any changed field (e.g., Lines 6–8 in Figure 1) or the corresponding gadget is from the changed code (e.g., Line 11 in Figure 1). Future use of this mechanism can be assigning human-provided gadgets with a higher priority. VI. EVALUATION We implemented the synthesis algorithm as the PASTA tool, which consists of ∼15,000 lines of Java code8 . The source code parsing and extracting gadgets was implemented over JavaParser [36] and Javassist [37]. In this section, we experimentally compare PASTA with two state-of-the-art techniques using realworld updates from Apache Tomcat 8 and Apache FtpServer. We elaborate on the experimental setup in Section VI-A and present the evaluation results in Section VI-B. A. Experimental Setup The three transformer generation techniques under evaluation are: our PASTA, program-synthesis-based TOS [27], and tracesynthesis-based AOTES [28]. We evaluate these techniques against a set of class changes that require non-trivial object transformation from widely-used server applications undergoing long-term development and maintenance. As discussed in Section V-B, each field in an object can be independently transformed. Therefore, the evaluation subjects (first two columns in Table I) consist of: 1) The 22 Apache Tomcat 8 field updates studied in the empirical study (Section III); 2) Another 4 Apache FtpServer field updates selected following the same collection methodology of our empirical study. For Apache FtpServer, we uniform-randomly sampled 30 commits (consisting of 75 class changes) from all 244 commits with class changes in the entire maintenance history of Apache FtpServer (from 1.0.0 to the latest 1.0.6). In the 75 class changes, 71 can be dynamically updated via a trivial object transformer. The rest four non-trivial cases (4 field updates) were all used as our experimental subjects. To validate a transformer, we for each changed class provide sufficiently strong DSU test cases that can cover all locations writing to the changed field (by revising existing test cases or manually providing one whenever necessary). For each test case, we also specify an active-safe [10, 15, 17, 31] dynamic update-safe point, at which all changed methods in the changed class are popped off stack. Each test case also checks the object state consistency after the update point. These checkers take the heap snapshots before and after object transformation as inputs and determine whether the object transformation is successful. We use the same test cases for all the evaluated techniques. The evaluation settings are: 1) For PASTA, the search depth is set to 10 (sufficiently large to handle all studied transformers). We automatically validated each synthesized transformer by running the DSU test cases and manually checked the test-passing ones until a semantically correct field transformer is produced. 2) For TOS, we followed the evaluation steps in its paper [27]. TOS requires heap snapshots for field transformer synthesis. Thus we used our test cases for field transformation correctness validation to produce these snapshots. The 8Available at https://zelinzhao.github.io/pasta
TABLE I:Evaluation results on the real-world updates in Apache Tomcat 8 and Apache FtpServer. ID Subject Changedclass.changedField (FieldType) #Gadgets Tests PASTA AOTES TOS Tomcat-6a940d StandardContext.path (String) 2.373 2 1/336(20.0) Ix M→Sx Tomcat-ec8dff ContextConfig.context (Context) 28.232 X I→HX M→S× 3 Tomcat-f84800 WsHttpUpgradeHandler.wsSession (WsSession) 3.098 2 X I→H× M→S× Tomcat-a752f3 AsyncContextImpl.request (Request) 14.927 I→H× M→S× 5 Tomcat-c0d4f7 WsHandshakeResponse.headers (CaseInsensitiveKeyMap) 633 1/3167.6m)W I→H× M→S× 6 Tomcat-c0d4f7 PojoMethodMapping.onMessage (List) 38.475 2 1/14(0.3m) I→Hx M→S× 7 Tomcat-dbb784 SenderState.memberStates (ConcurrentHashMap) 37,234 3 1/7(0.2m) I→Hx M→S× 8 Tomcat-a8d16b MemberImpl.msgCount (AtomicInteger) 6.962 2 1/13(0.2m) M→SX 9 Tomcat-358f94 StandardContext.parameters (ConcurrentHashMap) 19.805 2 1/45(1.8m) I→HX M→S× 10 Tomcat-ad012e FutureToSendHandler.result (AtomicReference) 1,468 2 1/4(0.1m)/ IHx M→S× 11 Tomcat-c0d4f7 WsOutputStream.used (boolean) 76 2 I→H0 M-→S 12 Tomcat-c0d4f7 Wswriter.used (boolean) 77 0 I→H@ M-→SQ 13 Tomcat-dbla6e StandardContext.useRelativeRedirects (boolean) 2.413 2 I→H@ M-SO 14 Tomcat-69196d StoreConfigLifecycleListener.oname (ObjectName) 12.783 1/1(0.1m) I→H M→S× 15 Tomcat-2e7c68 Nio2Endpoint.threadGroup (AsynchronousChannelGroup) 3.116 2 1/144(14.0m) Ix M→S× 16 Tomcat-d8ad3c DefaultServlet.showServerInfo (boolean) 1.000 1/3(0.1m) I→HX M→SX Tomcat-5952de WebappServiceLoader.context (Context) 32.325 2 I→H× M→S× 18 Tomcat-6b64bb StandardContext.noPluggabilityListeners (Set) 40.592 1/77(8.7m) I M→Sx 19 Tomcat-6b64bb StandardContext.noPlug...Context (NoPlug .Context) 12.865 I→HX M-→Sx 20 Tomcat-766c9e AprSocketWrapper.endpoint (AprEndpoint) 124.569 2 I→H× M-Sx 21 Tomcat-f4451c SocketProcessor.ka (KeyAttachment) 2.430 3 1/4(0.3m) I→Hx M-→S× 22 Tomcat-4355ed StandardContext.applicationEventListenersList (List) 50.274 3 1/27(1.4m) I→H→RX M-→S× 23 FtpServer-faal53 MinaListener.acceptor (SocketAcceptor) 469 2 1/14(0.3m) I→Hx M→Sx 24 FtpServer-32edob FtpServer.serverContext (FtpServerContext) 1,789 2 1/7(0.2m) I→HX M→Sx 25 FtpServer-1b2ea6 PropertiesUserManager.isConfigured (boolean) 249 2 I→H M→S× 26 FtpServer-afffc8 FileIpRestrictor.file (File) 1.634 2 1/154(2.5m) M→S× Summary 439,868 58 16(61.5%) 2(7.7%) 0(0.0%) For PASTA,"m/n"denotes that (1)the first semantically correct transformer is in the m-th place among all test-passing transformers;(2)there are n test-runs before the first semantically correct transformer is gencrated.The number in the bracket (t)indicates that the first correct transformer is produced after t minutes.For AOTES,"Ix" “I→Hx",and“I→H→Rx"denote failing at mutator generation,execution history synthesis,and trace replay,respectively.For TOS,.“Mx" and“M→S×”denote failing at object matching and transformer synthesis,respectively.denotes that this update requires a human-provided configuration (thus is out of the scope of automatic transformer synthesis).We included these cases in the evaluation for completeness:they are indeed non-trivial and may potentially be addressed in future work (e.g.,via providing a gadget by the upgrade maintainer). (a)y-axis:Running time (s) (b)y-axis:Peak memory (MB) (c)y-axis:Number of transformers X615 X9T51 2.000 x×X/r××xr/x 5,000 6,000 5,000 1.500 4,000 4,000 3,000 1.000 3.000 2,000 2,000 500 1,000 1,000 10 15 20 25 10 15 20 25 10 15 20 25 目Synthesis time☐Testing time ☐Memory consumed ☐Test--failing Testing-passing Fig.7:Statistics of running time,peak memory,and statistics of testing results.The x-axis denotes subject IDs in Table I. test cases provide sufficiently informative traces for a time limit.TOS and AOTES did not time out for all experimental human DSU expert to derive correct field transformers.subjects.All experiments were conducted on a commodity PC We also manually checked each synthesized transformer with a quad-core Intel Core i7-4770 CPU and 32 GB RAM for semantic correctness. running Ubuntu Linux 18.04. 3)For AOTES,as it does not produce any human-readable transformer,we consider it correct if all test cases passed B.Evaluation Results even if it might fail in other online transformations. Overall Results.The major evaluation results are shown We also inspected the synthesized method histories for in Table I.PASTA produced 16/26 (61.5%)correct field manually diagnosing the root cause in case of a failure. transformers,or 15/25 (60.0%)correct object transformers For each field transformation,we set a moderate 30-minute (because Subjects #18 and #19 are from the same class).All succeeded cases are amazingly top-1 hits,i.e.,the most"natural" 9Conversely.AOTES may have a chance to correctly transform the heap in test-passing transformer is semantically correct.This result practice even if it fails on some test cases.We argue that using AOTES in this case is considerably risky because AOTES is a black-box technique that supports our previous claim that the test cases are sufficiently silently transforms the heap. strong.Compared with the best existing techniques TOS(failed
TABLE I: Evaluation results on the real-world updates in Apache Tomcat 8 and Apache FtpServer. ID Subject ChangedClass.changedField (FieldType) #Gadgets #Tests PASTA AOTES TOS 1 Tomcat-6a940d StandardContext.path (String) 2,373 2 1 / 336 (20.0m) X I× MS× 2 Tomcat-ec8dff ContextConfig.context (Context) 28,232 3 × IH× MS× 3 Tomcat-f84800 WsHttpUpgradeHandler.wsSession (WsSession) 3,098 2 × IH× MS× 4 Tomcat-a752f3 AsyncContextImpl.request (Request) 14,927 2 × IH× MS× 5 Tomcat-c0d4f7 WsHandshakeResponse.headers (CaseInsensitiveKeyMap) 633 3 1 / 316 (7.6m) X IH× MS× 6 Tomcat-c0d4f7 PojoMethodMapping.onMessage (List) 38,475 2 1 / 14 (0.3m) X IH× MS× 7 Tomcat-dbb784 SenderState.memberStates (ConcurrentHashMap) 37,234 3 1 / 7 (0.2m) X IH× MS× 8 Tomcat-a8d16b MemberImpl.msgCount (AtomicInteger) 6,962 2 1 / 13 (0.2m) X X MS× 9 Tomcat-358f94 StandardContext.parameters (ConcurrentHashMap) 19,805 2 1 / 45 (1.8m) X IH× MS× 10 Tomcat-ad012e FutureToSendHandler.result (AtomicReference) 1,468 2 1 / 4 (0.1m) X IH× MS× 11 Tomcat-c0d4f7 WsOutputStream.used (boolean) 76 2 IH MS 12 Tomcat-c0d4f7 WsWriter.used (boolean) 77 2 IH MS 13 Tomcat-db1a6e StandardContext.useRelativeRedirects (boolean) 2,413 2 IH MS 14 Tomcat-69196d StoreConfigLifecycleListener.oname (ObjectName) 12,783 2 1 / 1 (0.1m) X IH× MS× 15 Tomcat-2e7c68 Nio2Endpoint.threadGroup (AsynchronousChannelGroup) 3,116 2 1 / 144 (14.0m) X I× MS× 16 Tomcat-d8ad3c DefaultServlet.showServerInfo (boolean) 1,000 2 1 / 3 (0.1m) X IH× MS× 17 Tomcat-5952de WebappServiceLoader.context (Context) 32,325 2 × IH× MS× 18 Tomcat-6b64bb StandardContext.noPluggabilityListeners (Set) 40,592 3 1 / 77 (8.7m) X I× MS× 19 Tomcat-6b64bb StandardContext.noPlug...Context (NoPlug...Context) 12,865 2 × IH× MS× 20 Tomcat-766c9e AprSocketWrapper.endpoint (AprEndpoint) 124,569 2 × IH× MS× 21 Tomcat-f4451c SocketProcessor.ka (KeyAttachment) 2,430 3 1 / 4 (0.3m) X IH× MS× 22 Tomcat-4355ed StandardContext.applicationEventListenersList (List) 50,274 3 1 / 27 (1.4m) X IHR× MS× 23 FtpServer-faa153 MinaListener.acceptor (SocketAcceptor) 469 2 1 / 14 (0.3m) X IH× MS× 24 FtpServer-32ed0b FtpServer.serverContext (FtpServerContext) 1,789 2 1 / 7 (0.2m) X IH× MS× 25 FtpServer-1b2ea6 PropertiesUserManager.isConfigured (boolean) 249 2 × IH× MS× 26 FtpServer-afffc8 FileIpRestrictor.file (File) 1,634 2 1 / 154 (2.5m) X X MS× Summary 439,868 58 16 (61.5%) 2 (7.7%) 0 (0.0%) For PASTA, “m / n” denotes that (1) the first semantically correct transformer is in the m-th place among all test-passing transformers; (2) there are n test-runs before the first semantically correct transformer is generated. The number in the bracket (t m ) indicates that the first correct transformer is produced after t minutes. For AOTES, “I×”, “IH×”, and “IHR×” denote failing at mutator generation, execution history synthesis, and trace replay, respectively. For TOS, “M×” and “MS×” denote failing at object matching and transformer synthesis, respectively. denotes that this update requires a human-provided configuration (thus is out of the scope of automatic transformer synthesis). We included these cases in the evaluation for completeness: they are indeed non-trivial and may potentially be addressed in future work (e.g., via providing a gadget by the upgrade maintainer). 5 10 15 20 25 0 500 1,000 1,500 2,000 X×××XXXXX X ×××XXX×X××XXXX×X (a) y-axis: Running time (s) Synthesis time Testing time 5 10 15 20 25 0 1,000 2,000 3,000 4,000 5,000 X ××× X XXX X X ×× × X X X ×X × ×6,186 X X X X × X (b) y-axis: Peak memory (MB) Memory consumed 5 10 15 20 25 0 1,000 2,000 3,000 4,000 5,000 6,000 X × × × X X X X X X ×× × X X X× X × × X X XX ×9,755 X (c) y-axis: Number of transformers Test-failing Testing-passing Fig. 7: Statistics of running time, peak memory, and statistics of testing results. The x-axis denotes subject IDs in Table I. test cases provide sufficiently informative traces for a human DSU expert to derive correct field transformers. We also manually checked each synthesized transformer for semantic correctness. 3) For AOTES, as it does not produce any human-readable transformer, we consider it correct if all test cases passed even if it might fail in other online transformations9 . We also inspected the synthesized method histories for manually diagnosing the root cause in case of a failure. For each field transformation, we set a moderate 30-minute 9Conversely, AOTES may have a chance to correctly transform the heap in practice even if it fails on some test cases. We argue that using AOTES in this case is considerably risky because AOTES is a black-box technique that silently transforms the heap. time limit. TOS and AOTES did not time out for all experimental subjects. All experiments were conducted on a commodity PC with a quad-core Intel Core i7-4770 CPU and 32 GB RAM running Ubuntu Linux 18.04. B. Evaluation Results Overall Results. The major evaluation results are shown in Table I. PASTA produced 16/26 (61.5%) correct field transformers, or 15/25 (60.0%) correct object transformers (because Subjects #18 and #19 are from the same class). All succeeded cases are amazingly top-1 hits, i.e., the most “natural” test-passing transformer is semantically correct. This result supports our previous claim that the test cases are sufficiently strong. Compared with the best existing techniques TOS (failed
for all cases)and AOTES(2/26,7.7%),PASTA made automatic challenging to assemble gadgets scattered in different parts of non-trivial object transformation in DSU significantly more a program(i.e.,with a relatively low naturalness score),which practical. is required in synthesizing field transformers in these cases Like PASTA,TOS is also a syntax-guided synthesis.However, Nevertheless,PASTA as a prototype implementation points out the TOS DSL contains only a subset of basic Java language a promising research direction that reuses existing code in constructs.Consequently,expressing a practical non-trivial object transformation. object transformer would require an unrealistically large search One may also wonder whether our treatment for pruning depth.For example,the most complex transformer produced irrelevant gadgets for primitive types and String (Section V-C) by PASTA (#23)consists of a field access.an if condition,two should be considered proper.Excluding the three out-of-scope different string literals,two constructor calls,and four instance cases (#11-13).PASTA successfully synthesized 2/3 (#1,16,25) method calls.There is also a transformer (#18)with a branch of the cases.For the only failing case (#25),the semantically in a loop.It is of no surprise that TOS failed on these practical correct transformer sets the field to be true only when two subjects even with the aid of test traces. conditions are simultaneously satisfied.These two conditions AOTES succeeded only for Subjects #8 and #26 because their were correctly identified as gadgets by PASTA,however,it failed inverse histories happened to be relatively simple (AOTES's to find them within the time limit.Therefore,our aggressive inherent requirement on successful history synthesis).The policy for pruning irrelevant gadgets should be considered major failure cause for AOTES is generating an incorrect history proper for primitive types. (20/26,76.9%),which is the major drawback hard to avoid for Considering the resource consumption for conducting object a runtime state transformer. Unlike TOS and AOTES in which a correct transformer may transformation,a 30-minute time limit should be considered be too complex to correctly synthesize,PASTA"shortcuts"the reasonable for production usel.Figure 7(a)and(b)display solution by gluing gadgets using a simpler DSL focused on the statistics of running time and memory consumption, code reuse.For example,AOTES generated a long method call respectively.Among all succeeded cases,9/16(56.2%)returned the first semantically correct transformer within one minute history for Subject #8,whereas PASTA found a constructor call to fulfill the same functionality.The evaluation results and 14/16(87.5%)were within 10 minutes.Both the search algorithm and test validation can also be parallelized to further suggest that the code-reuse in PASTA could become an effective approach to object transformer synthesis. accelerate the implementation.However,these engineering issues are not the major focus of this paper.In terms of memory, Detailed Analysis.Based on the type of code change,a field PASTA used less than 4 GB memory for 22/26(84.6%)of the change can be either of: subjects.Subject #20 consumed the most memory (~6 GB) A field type change with field name unchanged(7 cases; on time out due to its large number of gadgets (124,569).The #5-10 and #26),which was handled best by PASTA.All 7 cases overall results can be considered acceptable for production use. were successfully transformed.A type-changed field usually plays a similar role in both old and new versions.They can Finally,to our surprise,our search algorithm even found a simpler field transformer for field ka of the motivating example likely be used interchangeably somewhere in the source code Thus,there may exist code snippets to retrieve the information in Figure 1 (Subject #21): contained in this field.which can be used in a transformer. class DSUHelper static void transform(SocketProcessor*obj,SocketProcessor stale){ A field value change with field type and name unchanged obj.ka null; (6 cases:#1-4 and 23-24).PASTA succeeded in 3/6(50%)of if (stale.socket !null) them.Value change indicates that the semantics of an object obj.ka stale.socket.getAttachment(false); 6 is changed in the update.Since our approach is semantics- 7] unaware,PASTA has to perform a brute-force search across all The gadget 1.getAttachment(2)in Line 5 was origi- potentially useful transformers. nally used for creating an NioChannel object's corresponding A new field (13 cases;#11-22 and #25).This includes KeyAttachment,which was passed to by a true argument renaming a field with a type change,which cannot be (thus was considered irrelevant in our manual transformer objectively distinguished from adding a new field (e.g.,ka construction).However,its behavior of returning an existing in Figure 1).Excluding the three out-of-scope cases (explained KeyAttachment object is perfectly correct for our expected in the footnote of Table D).PASTA succeeded in 6/10(60%)of object transformation.The transformer in Figure 1 was ranked them.For a similar reason of the"field value change"category, in the 5th position,which also passed the tests within the time PASTA is essentially an exhaustive enumeration for this category limit of field change. For all failing cases excluding the out-of-scope ones,we Implications.Both the empirical study and evaluation results found that our DSL and PASTA's extracted gadgets suffice show that the unavailability of non-trivial object transformers to construct a correct transformer.However,PASTA was not should be recognized as a major obstacle that hinders the able to do sufficiently many explorations to identify them application of DSU in practice.Considering updating Tomcat 8 within the given time limit:the failing cases on average 1Gadget extraction time is less than 5 minute for both Tomcat and FpServer. tested only 2.677 candidate transformers.It is considerably We did not count such pre-processing time
for all cases) and AOTES (2/26, 7.7%), PASTA made automatic non-trivial object transformation in DSU significantly more practical. Like PASTA, TOS is also a syntax-guided synthesis. However, the TOS DSL contains only a subset of basic Java language constructs. Consequently, expressing a practical non-trivial object transformer would require an unrealistically large search depth. For example, the most complex transformer produced by PASTA (#23) consists of a field access, an if condition, two different string literals, two constructor calls, and four instance method calls. There is also a transformer (#18) with a branch in a loop. It is of no surprise that TOS failed on these practical subjects even with the aid of test traces. AOTES succeeded only for Subjects #8 and #26 because their inverse histories happened to be relatively simple (AOTES’s inherent requirement on successful history synthesis). The major failure cause for AOTES is generating an incorrect history (20/26, 76.9%), which is the major drawback hard to avoid for a runtime state transformer. Unlike TOS and AOTES in which a correct transformer may be too complex to correctly synthesize, PASTA “shortcuts” the solution by gluing gadgets using a simpler DSL focused on code reuse. For example, AOTES generated a long method call history for Subject #8, whereas PASTA found a constructor call to fulfill the same functionality. The evaluation results suggest that the code-reuse in PASTA could become an effective approach to object transformer synthesis. Detailed Analysis. Based on the type of code change, a field change can be either of: A field type change with field name unchanged (7 cases; #5–10 and #26), which was handled best by PASTA. All 7 cases were successfully transformed. A type-changed field usually plays a similar role in both old and new versions. They can likely be used interchangeably somewhere in the source code. Thus, there may exist code snippets to retrieve the information contained in this field, which can be used in a transformer. A field value change with field type and name unchanged (6 cases; #1–4 and 23–24). PASTA succeeded in 3/6 (50%) of them. Value change indicates that the semantics of an object is changed in the update. Since our approach is semanticsunaware, PASTA has to perform a brute-force search across all potentially useful transformers. A new field (13 cases; #11–22 and #25). This includes renaming a field with a type change, which cannot be objectively distinguished from adding a new field (e.g., ka in Figure 1). Excluding the three out-of-scope cases (explained in the footnote of Table I), PASTA succeeded in 6/10 (60%) of them. For a similar reason of the “field value change” category, PASTA is essentially an exhaustive enumeration for this category of field change. For all failing cases excluding the out-of-scope ones, we found that our DSL and PASTA’s extracted gadgets suffice to construct a correct transformer. However, PASTA was not able to do sufficiently many explorations to identify them within the given time limit: the failing cases on average tested only 2,677 candidate transformers. It is considerably challenging to assemble gadgets scattered in different parts of a program (i.e., with a relatively low naturalness score), which is required in synthesizing field transformers in these cases. Nevertheless, PASTA as a prototype implementation points out a promising research direction that reuses existing code in object transformation. One may also wonder whether our treatment for pruning irrelevant gadgets for primitive types and String (Section V-C) should be considered proper. Excluding the three out-of-scope cases (#11–13), PASTA successfully synthesized 2/3 (#1,16,25) of the cases. For the only failing case (#25), the semantically correct transformer sets the field to be true only when two conditions are simultaneously satisfied. These two conditions were correctly identified as gadgets by PASTA, however, it failed to find them within the time limit. Therefore, our aggressive policy for pruning irrelevant gadgets should be considered proper for primitive types. Considering the resource consumption for conducting object transformation, a 30-minute time limit should be considered reasonable for production use10. Figure 7 (a) and (b) display the statistics of running time and memory consumption, respectively. Among all succeeded cases, 9/16 (56.2%) returned the first semantically correct transformer within one minute, and 14/16 (87.5%) were within 10 minutes. Both the search algorithm and test validation can also be parallelized to further accelerate the implementation. However, these engineering issues are not the major focus of this paper. In terms of memory, PASTA used less than 4 GB memory for 22/26 (84.6%) of the subjects. Subject #20 consumed the most memory (∼6 GB) on time out due to its large number of gadgets (124,569). The overall results can be considered acceptable for production use. Finally, to our surprise, our search algorithm even found a simpler field transformer for field ka of the motivating example in Figure 1 (Subject #21): 1 class DSUHelper { 2 static void transform(SocketProcessor? obj, SocketProcessor stale) { 3 obj.ka = null; 4 if (stale.socket != null) 5 obj.ka = stale.socket.getAttachment(false); 6 } 7 } The gadget 1.getAttachment(2) in Line 5 was originally used for creating an NioChannel object’s corresponding KeyAttachment, which was passed to by a true argument (thus was considered irrelevant in our manual transformer construction). However, its behavior of returning an existing KeyAttachment object is perfectly correct for our expected object transformation. The transformer in Figure 1 was ranked in the 5th position, which also passed the tests within the time limit. Implications. Both the empirical study and evaluation results show that the unavailability of non-trivial object transformers should be recognized as a major obstacle that hinders the application of DSU in practice. Considering updating Tomcat 8 10Gadget extraction time is less than 5 minute for both Tomcat and FtpServer. We did not count such pre-processing time.
and FtpServer with state-of-the-art techniques before PASTA,flows of the same program during the update,and DSU roughly 18%updates (commits)still require a restart.How-is implemented by process replacement [49-51].A more ever,if all non-trivial cases can be provided with a proper lightweight approach is the intra-process strategy,where transformer,the restart rate would decrease to ~2%. dynamic software update happens in-place [15,17,18]. This paper opens a promising research direction towards Object Transformation in DSU.Both inter-and intra-process automated object transformation in DSU on code reuse.Consid- DSUs require object transformation.Object transformation can ering the distribution of all updates (commits)in our evaluation, be eager [15](all stale objects are transformed at update time) PASTA can reduce the restart rate by 60.5%,compared with the or lazy [17](object is transformed upon access).PASTA works best state-of-the-art technique.Considering the rapid advances for both cases. in the program synthesis and repair community [38-41],we Our empirical study reveals that a default transformer are optimistic that most and more non-trivial transformers can (either copies the old-version value for an unchanged field, be automatically synthesized in the near future. or assigns a default value to a newly-added field)suffices VII.THREATS TO VALIDITY for most dynamic updates.Therefore,it is not a surprise that default transformation is widely adopted in existing DSU A major threat concerns the generalization of our empirical systems [15,17,18,22,46,52].To perform non-trivial object study results because Tomcat 8 is the only investigated subject transformations,existing DSU work [15,17,22,46]suggests Since Tomcat is a mature,actively developed,and widely used subject extensively studied by existing literature [17,28,42- that object transformers should be shipped along with the patch to enable DSU in practice. 45],we should consider that the empirical study results reflect TOS 27]took the first step in the automatic synthesis of real-world software evolution to a large extent. There is a minor chance that we erroneously marked a object transformers.Given paired old/new version objects, transformer synthesis can be regarded as a programming by non-trivial object transformer as trivial in the empirical study example (PBE)problem,in which syntax-guided search is because reasoning about a program's runtime state is labor- intensive.(This is also the major reason that we did not usually adopted.However,TOS adopts a Java-alike DSL for include more evaluation subjects.)Taking such potential error specifying transformers,yielding a huge search space and into consideration,the applicability of DSU in practice may subsequent search failures.In contrast,we kept DSL constructs to a minimal extent and let extracted gadgets to perform the potentially be more challenging.On the other hand,it also suggests that object transformation should draw more serious actual object state computation. research efforts. AOTES [28]took a fundamentally different approach to Another threat to our evaluation results'validity is that the object transformation.Instead of synthesizing a transformer, experimental subjects partially overlap the ones used in the AOTES for each object synthesizes a method invocation history empirical study,and the high success rate may be due to that brings the object to its current state.The synthesized overfitting.We argue that this is not likely the case because: history is then replayed (executed)on the new version code (1)the design of PASTA follows the general principles of to obtain the transformed object.AOTES has the potential to scale out(though its current implementation frequently fails on software systems,and(2)the experiments on Apache FtpServer large systems like Tomcat).However,it is extremely difficult also show significant improvements over existing techniques. Therefore,PASTA should be recognized as useful in conducting for a developer to validate the correctness of a transformation, DSU for similarly long-running server applications,which are leaving it risky to use in practice actually the major focus of dynamic software updating. Finally,some DSU systems [53,54]entirely eliminate object transformations by restricting the update timing,e.g.,updates VIII.RELATED WORK can only be applied when there is no unsafe event [53]or object Dynamic Software Updates.To dynamically update a running state [54].Existing object transformation techniques(PASTA, system,one must determine: TOS,and AOTES)are orthogonal to these systems.Generally What to update,i.e.,specifying the parts of the system to speaking,better transformers allow more update timings,and be dynamically upgraded,e.g.,via a source code patch.This fewer update timings tolerate simpler transformers. paper's focus is the DSU of long-running Java server programs Program Synthesis.Object transformer is a piece of program. over maintenance upgrades. Thus,PASTA belongs to the family of program synthesis [55- When to update,i.e.,monitoring the system execution until an 58].TOS is inspired by the PBE approaches [59,60],which update-safe point is reached [10,31].Existing strategies include focus on spreadsheet data transformations.Foofah [61,62] activeness safety [15,17,46,47],con-freeness safety [47],and adopted PBE to synthesize data transformation programs for transactions version consistency [48].This paper assumes the data analysis tasks.PASTA aims at generating non-trivial object most popular activeness safety criterion. transformers,which is a more challenging task. How to update,i.e.,applying the dynamic patch [27,28].Con- The design of placeholders in PASTA originates from ducting the update includes replacing the changed code [32,33]program sketching [63-65]in which placeholders are filled and updating the stale objects [15,46].Inter-process state with synthesized code snippets.However,program synthesis transformation (multi-versioning)maintains multiple execution for PASTA is quite different in nature compared with program
and FtpServer with state-of-the-art techniques before PASTA, roughly 18% updates (commits) still require a restart. However, if all non-trivial cases can be provided with a proper transformer, the restart rate would decrease to ∼2%. This paper opens a promising research direction towards automated object transformation in DSU on code reuse. Considering the distribution of all updates (commits) in our evaluation, PASTA can reduce the restart rate by 60.5%, compared with the best state-of-the-art technique. Considering the rapid advances in the program synthesis and repair community [38–41], we are optimistic that most and more non-trivial transformers can be automatically synthesized in the near future. VII. THREATS TO VALIDITY A major threat concerns the generalization of our empirical study results because Tomcat 8 is the only investigated subject. Since Tomcat is a mature, actively developed, and widely used subject extensively studied by existing literature [17, 28, 42– 45], we should consider that the empirical study results reflect real-world software evolution to a large extent. There is a minor chance that we erroneously marked a non-trivial object transformer as trivial in the empirical study because reasoning about a program’s runtime state is laborintensive. (This is also the major reason that we did not include more evaluation subjects.) Taking such potential error into consideration, the applicability of DSU in practice may potentially be more challenging. On the other hand, it also suggests that object transformation should draw more serious research efforts. Another threat to our evaluation results’ validity is that the experimental subjects partially overlap the ones used in the empirical study, and the high success rate may be due to overfitting. We argue that this is not likely the case because: (1) the design of PASTA follows the general principles of software systems, and (2) the experiments on Apache FtpServer also show significant improvements over existing techniques. Therefore, PASTA should be recognized as useful in conducting DSU for similarly long-running server applications, which are actually the major focus of dynamic software updating. VIII. RELATED WORK Dynamic Software Updates. To dynamically update a running system, one must determine: What to update, i.e., specifying the parts of the system to be dynamically upgraded, e.g., via a source code patch. This paper’s focus is the DSU of long-running Java server programs over maintenance upgrades. When to update, i.e., monitoring the system execution until an update-safe point is reached [10, 31]. Existing strategies include activeness safety [15, 17, 46, 47], con-freeness safety [47], and transactions version consistency [48]. This paper assumes the most popular activeness safety criterion. How to update, i.e., applying the dynamic patch [27, 28]. Conducting the update includes replacing the changed code [32, 33] and updating the stale objects [15, 46]. Inter-process state transformation (multi-versioning) maintains multiple execution flows of the same program during the update, and DSU is implemented by process replacement [49–51]. A more lightweight approach is the intra-process strategy, where dynamic software update happens in-place [15, 17, 18]. Object Transformation in DSU. Both inter- and intra-process DSUs require object transformation. Object transformation can be eager [15] (all stale objects are transformed at update time) or lazy [17] (object is transformed upon access). PASTA works for both cases. Our empirical study reveals that a default transformer (either copies the old-version value for an unchanged field, or assigns a default value to a newly-added field) suffices for most dynamic updates. Therefore, it is not a surprise that default transformation is widely adopted in existing DSU systems [15, 17, 18, 22, 46, 52]. To perform non-trivial object transformations, existing DSU work [15, 17, 22, 46] suggests that object transformers should be shipped along with the patch to enable DSU in practice. TOS [27] took the first step in the automatic synthesis of object transformers. Given paired old/new version objects, transformer synthesis can be regarded as a programming by example (PBE) problem, in which syntax-guided search is usually adopted. However, TOS adopts a Java-alike DSL for specifying transformers, yielding a huge search space and subsequent search failures. In contrast, we kept DSL constructs to a minimal extent and let extracted gadgets to perform the actual object state computation. AOTES [28] took a fundamentally different approach to object transformation. Instead of synthesizing a transformer, AOTES for each object synthesizes a method invocation history that brings the object to its current state. The synthesized history is then replayed (executed) on the new version code to obtain the transformed object. AOTES has the potential to scale out (though its current implementation frequently fails on large systems like Tomcat). However, it is extremely difficult for a developer to validate the correctness of a transformation, leaving it risky to use in practice. Finally, some DSU systems [53, 54] entirely eliminate object transformations by restricting the update timing, e.g., updates can only be applied when there is no unsafe event [53] or object state [54]. Existing object transformation techniques (PASTA, TOS, and AOTES) are orthogonal to these systems. Generally speaking, better transformers allow more update timings, and fewer update timings tolerate simpler transformers. Program Synthesis. Object transformer is a piece of program. Thus, PASTA belongs to the family of program synthesis [55– 58]. TOS is inspired by the PBE approaches [59, 60], which focus on spreadsheet data transformations. Foofah [61, 62] adopted PBE to synthesize data transformation programs for data analysis tasks. PASTA aims at generating non-trivial object transformers, which is a more challenging task. The design of placeholders in PASTA originates from program sketching [63–65] in which placeholders are filled with synthesized code snippets. However, program synthesis for PASTA is quite different in nature compared with program