Rely-Guarantee-Based Simulation for Compositional Verification of Concurrent Program Transformations HONGJIN LIANG,XINYU FENG and MING FU,University of Science and Technology of China Verifying program transformations usually requires proving that the resulting program(the target)refines or is equivalent to the original one(the source).However,the refinement relation between individual sequen- tial threads cannot be preserved in general with the presence of parallel compositions,due to instruction reordering and the different granularities of atomic operations at the source and the target.On the other hand.the refinement relation defined based on fully abstract semantics of concurrent programs assumes arbitrary parallel environments,which is too strong and cannot be satisfied by many well-known transfor. mations. In this paper,we propose a Rely-Guarantee-based Simulation(RGSim)to verify concurrent program transformations.The relation is parametrized with constraints of the environments that the source and the target programs may compose with.It considers the interference between threads and their environments thus is less permissive than relations over sequential programs.It is compositional w.r.t.parallel composi- tions as long as the constraints are satisfied.Also,RGSim does not require semantics preservation under all environments,and can incorporate the assumptions about environments made by specific program transfor. mations in the form of rely/guarantee conditions.We use RGSim to reason about optimizations and prove atomicity of concurrent objects.We also propose a general garbage collector verification framework based on RGSim,and verify the Boehm et al.concurrent mark-sweep GC. Categories and Subject Descriptors:D.2.4[Software Engineeringl:Software/Program Verification-Cor- rectness proofs,Formal methods;F.3.1 [Logics and Meanings of Programs):Specifying and Verifying and Reasoning about Programs General Terms:Theory,Verification Additional Key Words and Phrases:Concurrency,Program Transformation,Rely-Guarantee Reasoning, Simulation ACM Reference Format: Liang,H.,Feng,X.,and Fu,M.2014.Rely-Guarantee-Based Simulation for Compositional Verification of Concurrent Program Transformations.ACM Trans.Program.Lang.Syst.V,N,Article A (January YYYY), 53 pages. D0I:http/dx.doi.org10.1145/0000000.0000000 This work is supported in part by grants from National Natural Science Foundation of China(NSFC)under Grant Nos.61379039,61229201,61103023 and 91318301,Program for New Century Excellent Talents in Universities(Grant No.NCET-2010-0984).and the Fundamental Research Funds for the Central Universi- ties(Grant No.WK0110000031). Author's addresses:H.Liang,X.Feng and M.Fu,School of Computer Science and Technology,University of Science and Technology of China,Hefei,Anhui 230026,China;Suzhou Institute for Advanced Study, University of Science and Technology of China,Suzhou,Jiangsu 215123,China. Emails:H.Liang,Ihj1018@mail.ustc.edu.cn;X.Feng (corresponding author),xyfeng@ustc.edu.cn;and M.Fu,fuming@ustc.edu.cn Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation.Copyrights for components of this work owned by others than ACM must be honored.Abstracting with credit is per- mitted.To copy otherwise,to republish,to post on servers,to redistribute to lists,or to use any component of this work in other works requires prior specific permission and/or a fee.Permissions may be requested from Publications Dept.,ACM,Inc.,2 Penn Plaza,Suite 701,New York,NY 10121-0701 USA,fax +1(212) 869-0481,or permissions@acm.org YYYY ACM 0164-0925/YYYY/01-ARTA $15.00 D0I:http/dk.doi.org10.1145/0000000.0000000 ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
A Rely-Guarantee-Based Simulation for Compositional Verification of Concurrent Program Transformations HONGJIN LIANG, XINYU FENG and MING FU, University of Science and Technology of China Verifying program transformations usually requires proving that the resulting program (the target) refines or is equivalent to the original one (the source). However, the refinement relation between individual sequential threads cannot be preserved in general with the presence of parallel compositions, due to instruction reordering and the different granularities of atomic operations at the source and the target. On the other hand, the refinement relation defined based on fully abstract semantics of concurrent programs assumes arbitrary parallel environments, which is too strong and cannot be satisfied by many well-known transformations. In this paper, we propose a Rely-Guarantee-based Simulation (RGSim) to verify concurrent program transformations. The relation is parametrized with constraints of the environments that the source and the target programs may compose with. It considers the interference between threads and their environments, thus is less permissive than relations over sequential programs. It is compositional w.r.t. parallel compositions as long as the constraints are satisfied. Also, RGSim does not require semantics preservation under all environments, and can incorporate the assumptions about environments made by specific program transformations in the form of rely/guarantee conditions. We use RGSim to reason about optimizations and prove atomicity of concurrent objects. We also propose a general garbage collector verification framework based on RGSim, and verify the Boehm et al. concurrent mark-sweep GC. Categories and Subject Descriptors: D.2.4 [Software Engineering]: Software/Program Verification – Correctness proofs, Formal methods; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs General Terms: Theory, Verification Additional Key Words and Phrases: Concurrency, Program Transformation, Rely-Guarantee Reasoning, Simulation ACM Reference Format: Liang, H., Feng, X., and Fu, M. 2014. Rely-Guarantee-Based Simulation for Compositional Verification of Concurrent Program Transformations. ACM Trans. Program. Lang. Syst. V, N, Article A (January YYYY), 53 pages. DOI:http://dx.doi.org/10.1145/0000000.0000000 This work is supported in part by grants from National Natural Science Foundation of China (NSFC) under Grant Nos. 61379039, 61229201, 61103023 and 91318301, Program for New Century Excellent Talents in Universities (Grant No. NCET-2010-0984), and the Fundamental Research Funds for the Central Universities (Grant No. WK0110000031). Author’s addresses: H. Liang, X. Feng and M. Fu, School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026, China; Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou, Jiangsu 215123, China. Emails: H. Liang, lhj1018@mail.ustc.edu.cn; X. Feng (corresponding author), xyfeng@ustc.edu.cn; and M. Fu, fuming@ustc.edu.cn. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or permissions@acm.org. ⃝c YYYY ACM 0164-0925/YYYY/01-ARTA $15.00 DOI:http://dx.doi.org/10.1145/0000000.0000000 ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
A:2 Hongjin Liang et al. 1.INTRODUCTION Many verification problems can be reduced to verifying program transformations,i.e., proving the target program of the transformation has no more observable behaviors than the source.Below we give some typical examples in concurrent settings: -Correctness of compilation and optimizations of concurrent programs.In this most natural program transformation verification problem,every compilation phase does a program transformation T,which needs to preserve the semantics of the inputs. Atomicity of concurrent objects.A concurrent object or library provides a set of meth- ods that allow clients to manipulate the shared data structure with abstract atomic behaviors [Herlihy and Shavit 20081.Their correctness can be reduced to the cor- rectness of the transformation from abstract atomic operations to concrete and exe- cutable programs in a concurrent context. Verifying implementations of software transactional memory(STM).Many languages supporting STM provide a high-level atomic block atomic{C),so that programmers can assume the atomicity of the execution of C.Atomic blocks are implemented using some STM protocol (e.g.,TL2 [Dice et al.2006])that allows very fine-grained inter- leavings.Verifying that the fine-grained program respects the semantics of atomic blocks gives us the correctness of the STM implementation. Correctness of concurrent garbage collectors(GCs).High-level garbage-collected lan- guages (e.g.,Java)allow programmers to work at an abstract level without knowledge of the underlying GC algorithm.However,the concrete and executable low-level pro- gram involves interactions between the mutators and the collector.If we view the GC implementation as a transformation from high-level mutators to low-level ones with a concrete GC thread,the GC safety can be reduced naturally to the semantics preservation of the transformation. To verify the correctness of a program transformation T,we follow Leroy's ap- proach [Leroy 2009]and define a refinement relation between the target and the source programs,which says the target has no more observable behaviors than the source.Then we can formalize the correctness of the transformation as follows: Correct(T)≌C,C.C=T(C)→CgC. (1.1) That is,for any source program C acceptable by T,T(C)is a refinement of C.When the source and the target are shared-state concurrent programs,the refinement needs to satisfy the following requirements to support effective proof of Correct(T): -Since the target T(C)may be in a different language from the source,the refinement should be general and independent of the language details. -To verify fine-grained implementations of abstract operations,the refinement should support different views of program states and different granularities of state accesses at the source and the target levels. .When T is syntax-directed (and it is usually the case for parallel compositions,i.e., T(CC)=T(C)T(C)),a compositional refinement is of particular importance for modular verification of T. However,existing refinement (or equivalence)relations cannot satisfy all these re- quirements at the same time.Contextual equivalence,the canonical notion for com- paring program behaviors,fails to handle different languages since the contexts of the source and the target will be different.Simulations and logical relations have been used to verify compilation [Leroy 2009;Benton and Hur 2009;Lochbihler 2010; Hur and Dreyer 2011],but they are usually designed for sequential programs (ex- cept [Lochbihler 2010;Sevcik et al.2011],which we will discuss in Section 8).Since the ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
A:2 Hongjin Liang et al. 1. INTRODUCTION Many verification problems can be reduced to verifying program transformations, i.e., proving the target program of the transformation has no more observable behaviors than the source. Below we give some typical examples in concurrent settings: — Correctness of compilation and optimizations of concurrent programs. In this most natural program transformation verification problem, every compilation phase does a program transformation T, which needs to preserve the semantics of the inputs. — Atomicity of concurrent objects. A concurrent object or library provides a set of methods that allow clients to manipulate the shared data structure with abstract atomic behaviors [Herlihy and Shavit 2008]. Their correctness can be reduced to the correctness of the transformation from abstract atomic operations to concrete and executable programs in a concurrent context. — Verifying implementations of software transactional memory (STM). Many languages supporting STM provide a high-level atomic block atomic{C}, so that programmers can assume the atomicity of the execution of C. Atomic blocks are implemented using some STM protocol (e.g., TL2 [Dice et al. 2006]) that allows very fine-grained interleavings. Verifying that the fine-grained program respects the semantics of atomic blocks gives us the correctness of the STM implementation. — Correctness of concurrent garbage collectors (GCs). High-level garbage-collected languages (e.g., Java) allow programmers to work at an abstract level without knowledge of the underlying GC algorithm. However, the concrete and executable low-level program involves interactions between the mutators and the collector. If we view the GC implementation as a transformation from high-level mutators to low-level ones with a concrete GC thread, the GC safety can be reduced naturally to the semantics preservation of the transformation. To verify the correctness of a program transformation T, we follow Leroy’s approach [Leroy 2009] and define a refinement relation ⊑ between the target and the source programs, which says the target has no more observable behaviors than the source. Then we can formalize the correctness of the transformation as follows: Correct(T) , ∀C, C. C = T(C) =⇒ C ⊑ C . (1.1) That is, for any source program C acceptable by T, T(C) is a refinement of C. When the source and the target are shared-state concurrent programs, the refinement ⊑ needs to satisfy the following requirements to support effective proof of Correct(T): — Since the target T(C) may be in a different language from the source, the refinement should be general and independent of the language details. — To verify fine-grained implementations of abstract operations, the refinement should support different views of program states and different granularities of state accesses at the source and the target levels. — When T is syntax-directed (and it is usually the case for parallel compositions, i.e., T(C∥C ′ ) = T(C)∥T(C ′ )), a compositional refinement is of particular importance for modular verification of T. However, existing refinement (or equivalence) relations cannot satisfy all these requirements at the same time. Contextual equivalence, the canonical notion for comparing program behaviors, fails to handle different languages since the contexts of the source and the target will be different. Simulations and logical relations have been used to verify compilation [Leroy 2009; Benton and Hur 2009; Lochbihler 2010; Hur and Dreyer 2011], but they are usually designed for sequential programs (except [Lochbihler 2010; Sev ˇ cˇ´ık et al. 2011], which we will discuss in Section 8). Since the ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
Rely-Guarantee-Based Simulation A:3 refinement or equivalence relation between sequential threads cannot be preserved in general with parallel compositions,we cannot simply adapt existing work on sequen- tial programs to verify transformations of concurrent programs.Refinement relations based on fully abstract semantics of concurrent programs are compositional,but they assume arbitrary program contexts,which is too strong for many practical transfor- mations.We will explain the challenges in detail in Section 2. In this paper,we propose a Rely-Guarantee-based Simulation(RGSim)for composi- tional verification of concurrent transformations.By addressing the above problems. we make the following contributions: -RGSim parametrizes the simulation between concurrent programs with rely/guarantee conditions [Jones 1983],which specify the interactions between the programs and their environments.This makes the corresponding refinement relation compositional w.rt.parallel compositions,allowing us to decompose refine- ment proofs for multi-threaded programs into proofs for individual threads.On the other hand,the rely/guarantee conditions can incorporate the assumptions about environments made by specific program transformations,so RGSim can be applied to verify many practical transformations Based on the simulation technique,RGSim focuses on comparing externally observ- able behaviors (e.g.,I/O events)only,which gives us considerable leeway in the imple- mentations of related programs.The relation is mostly independent of the language details.It can be used to relate programs in different languages with different views of program states and different granularities of atomic state accesses. -RGSim makes relational reasoning about optimizations possible in parallel contexts. We present a set of relational reasoning rules to characterize and justify common optimizations in a concurrent setting,including hoisting loop invariants,strength reduction and induction variable elimination,dead code elimination,redundancy in- troduction,etc. RGSim gives us a refinement-based proof method to verify fine-grained implemen- tations of abstract algorithms and concurrent objects.We successfully apply RGSim to verify concurrent counters,the concurrent GCD algorithm,Treiber's non-blocking stack and the lock-coupling list. We reduce the problem of verifying concurrent garbage collectors to verifying trans- formations,and present a general GC verification framework,which combines unary Rely-Guarantee-based verification [Jones 1983]with relational proofs based on RGSim. We verify the Boehm et al.concurrent garbage collection algorithm [Boehm et al. 1991]using our framework.As far as we know,it is the first time to formally prove the correctness of this algorithm. We give a mechanized formulation of RGSim,and prove its soundness and composi- tionality in the Coq proof assistant [2010].Both the manual and mechanized proofs are available online' This paper extends the conference paper in POPL 2012 [Liang et al.2012].First,we add more examples,including strength reduction and induction variable elimination, the non-blocking concurrent counter,Treiber's stack algorithm,and the concurrent GCD algorithm.Second,we significantly expand the details for the concurrent GC verification,demonstrating that RGSim is a powerful proof technique for verifying program transformations which involve concurrent run-time systems. In the rest of this paper,we first analyze the challenges for compositional verifica- tion of concurrent program transformations,and explain our approach informally in ihttp://kyhcs.ustcsz.edu.cn/relconcur/rgsim ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
Rely-Guarantee-Based Simulation A:3 refinement or equivalence relation between sequential threads cannot be preserved in general with parallel compositions, we cannot simply adapt existing work on sequential programs to verify transformations of concurrent programs. Refinement relations based on fully abstract semantics of concurrent programs are compositional, but they assume arbitrary program contexts, which is too strong for many practical transformations. We will explain the challenges in detail in Section 2. In this paper, we propose a Rely-Guarantee-based Simulation (RGSim) for compositional verification of concurrent transformations. By addressing the above problems, we make the following contributions: — RGSim parametrizes the simulation between concurrent programs with rely/guarantee conditions [Jones 1983], which specify the interactions between the programs and their environments. This makes the corresponding refinement relation compositional w.r.t. parallel compositions, allowing us to decompose refinement proofs for multi-threaded programs into proofs for individual threads. On the other hand, the rely/guarantee conditions can incorporate the assumptions about environments made by specific program transformations, so RGSim can be applied to verify many practical transformations. — Based on the simulation technique, RGSim focuses on comparing externally observable behaviors (e.g., I/O events) only, which gives us considerable leeway in the implementations of related programs. The relation is mostly independent of the language details. It can be used to relate programs in different languages with different views of program states and different granularities of atomic state accesses. — RGSim makes relational reasoning about optimizations possible in parallel contexts. We present a set of relational reasoning rules to characterize and justify common optimizations in a concurrent setting, including hoisting loop invariants, strength reduction and induction variable elimination, dead code elimination, redundancy introduction, etc. — RGSim gives us a refinement-based proof method to verify fine-grained implementations of abstract algorithms and concurrent objects. We successfully apply RGSim to verify concurrent counters, the concurrent GCD algorithm, Treiber’s non-blocking stack and the lock-coupling list. — We reduce the problem of verifying concurrent garbage collectors to verifying transformations, and present a general GC verification framework, which combines unary Rely-Guarantee-based verification [Jones 1983] with relational proofs based on RGSim. — We verify the Boehm et al. concurrent garbage collection algorithm [Boehm et al. 1991] using our framework. As far as we know, it is the first time to formally prove the correctness of this algorithm. — We give a mechanized formulation of RGSim, and prove its soundness and compositionality in the Coq proof assistant [2010]. Both the manual and mechanized proofs are available online1 . This paper extends the conference paper in POPL 2012 [Liang et al. 2012]. First, we add more examples, including strength reduction and induction variable elimination, the non-blocking concurrent counter, Treiber’s stack algorithm, and the concurrent GCD algorithm. Second, we significantly expand the details for the concurrent GC verification, demonstrating that RGSim is a powerful proof technique for verifying program transformations which involve concurrent run-time systems. In the rest of this paper, we first analyze the challenges for compositional verification of concurrent program transformations, and explain our approach informally in 1http://kyhcs.ustcsz.edu.cn/relconcur/rgsim ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
A:4 Hongjin Liang et al. local ri; local r2; x:=1; y:=1; r1:=y; ‖r2:=x; if (r1 =0)then if (r2 =0)then critical region critical region (a)Dekker's Mutual Exclusion Algorithm x:=x+1;‖x:=x+1: US. local ri; local r2; r1:=x; ‖2:=x; x:=r1+1; X:=r2+1; (b)Different Granularities of Atomic Operations Fig.1.Equivalence Lost after Parallel Composition Section 2.Then we give the basic technical settings in Section 3 and present the formal definition of RGSim in Section 4.We show the use of RGSim to reason about optimiza- tions in Section 5,verify fine-grained algorithms and atomicity of concurrent objects in Section 6,and prove the correctness of concurrent GCs in Section 7.Finally we discuss related work and conclude in Section 8. 2.CHALLENGES AND OUR APPROACH The major challenge we face is to have a compositional refinement relation between concurrent programs,i.e.,we should be able to know T(C1)IT(C2)C1ll C2 if we have T(C1)C1 and T(C2)C2. 2.1.Sequential Refinement Loses Parallel Compositionality Observable behaviors of sequential imperative programs usually refer to their con- trol effects (e.g.,termination and exceptions)and final program states.However,re- finement relations defined correspondingly cannot be preserved after parallel compo- sitions.It has been a well-known fact in the compiler community that sound opti- mizations for sequential programs may change the behaviors of multi-threaded pro- grams [Boehm 2005].The Dekker's algorithm shown in Figure 1(a)has been widely used to demonstrate the problem.Reordering the first two assignment statements of the thread on the left preserves its sequential behaviors,but the whole program can no longer ensure exclusive access to the critical region. In addition to instruction reordering,the different granularities of atomic operations between the source and the target programs can also break the compositionality of program equivalence in a concurrent setting.In Figure 1(b),the target program at the bottom behaves differently from the source at the top (assuming each statement is executed atomically),although the individual threads at the target and the source have the same behaviors. 2.2.Assuming Arbitrary Environments is Too Strong The problem with the refinement for sequential programs is that it does not consider the effects of threads'intermediate state accesses on their parallel environments.Peo- ple have given fully abstract semantics to concurrent programs (e.g.,[Brookes 1996; Abadi and Plotkin 2009]).The semantics of a program is modeled as a set of execu- ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
A:4 Hongjin Liang et al. local r1; x := 1; r1 := y; if (r1 = 0) then critical region ∥ local r2; y := 1; r2 := x; if (r2 = 0) then critical region (a) Dekker’s Mutual Exclusion Algorithm x := x+1; ∥ x := x+1; vs. local r1; r1 := x; x := r1 + 1; ∥ local r2; r2 := x; x := r2 + 1; (b) Different Granularities of Atomic Operations Fig. 1. Equivalence Lost after Parallel Composition Section 2. Then we give the basic technical settings in Section 3 and present the formal definition of RGSim in Section 4. We show the use of RGSim to reason about optimizations in Section 5, verify fine-grained algorithms and atomicity of concurrent objects in Section 6, and prove the correctness of concurrent GCs in Section 7. Finally we discuss related work and conclude in Section 8. 2. CHALLENGES AND OUR APPROACH The major challenge we face is to have a compositional refinement relation ⊑ between concurrent programs, i.e., we should be able to know T(C1) ∥ T(C2) ⊑ C1 ∥ C2 if we have T(C1) ⊑ C1 and T(C2) ⊑ C2. 2.1. Sequential Refinement Loses Parallel Compositionality Observable behaviors of sequential imperative programs usually refer to their control effects (e.g., termination and exceptions) and final program states. However, re- finement relations defined correspondingly cannot be preserved after parallel compositions. It has been a well-known fact in the compiler community that sound optimizations for sequential programs may change the behaviors of multi-threaded programs [Boehm 2005]. The Dekker’s algorithm shown in Figure 1(a) has been widely used to demonstrate the problem. Reordering the first two assignment statements of the thread on the left preserves its sequential behaviors, but the whole program can no longer ensure exclusive access to the critical region. In addition to instruction reordering, the different granularities of atomic operations between the source and the target programs can also break the compositionality of program equivalence in a concurrent setting. In Figure 1(b), the target program at the bottom behaves differently from the source at the top (assuming each statement is executed atomically), although the individual threads at the target and the source have the same behaviors. 2.2. Assuming Arbitrary Environments is Too Strong The problem with the refinement for sequential programs is that it does not consider the effects of threads’ intermediate state accesses on their parallel environments. People have given fully abstract semantics to concurrent programs (e.g., [Brookes 1996; Abadi and Plotkin 2009]). The semantics of a program is modeled as a set of execuACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
Rely-Guarantee-Based Simulation A:5 tion traces.Each trace is an interleaving of state transitions made by the program itself and arbitrary transitions made by the environment.Then the refinement be- tween programs can be defined as the subset relation between the corresponding trace sets.Since it considers all possible environments,the refinement relation has very nice compositionality,but unfortunately is too strong to formulate the correctness of many well-known transformations,including the four classes of transformations mentioned before: -Many concurrent languages(e.g.,C++[Boehm and Adve 20081)do not give seman- tics to programs with data races (like the examples shown in Figure 1).Therefore the compilers only need to guarantee the semantics preservation of data-race-free programs. When we prove that a fine-grained implementation of a concurrent object is a re- finement of an abstract atomic object,we can assume that all accesses to the object are made through the object's methods only,e.g.,a stack object can only be accessed through push and pop methods,and its internal data cannot be arbitrarily updated. Usually the implementation of STM (e.g.,TL2 [Dice et al.2006])ensures the atom- icity of a transaction atomicfC}only when there are no data races.Therefore,the correctness of the transformation from high-level atomic blocks to fine-grained con- current code assumes data-race-freedom in the source. Many garbage-collected languages are type-safe and prohibit operations such as pointer arithmetic.Therefore the garbage collector could make corresponding as- sumptions about the mutators that run in parallel. In all these cases,the transformations of individual threads are allowed to make various assumptions about the environments.They do not have to ensure semantics preservation within all contexts. 2.3.Languages at Source and Target May Be Different The use of different languages at the source and the target levels makes the formu- lation of the transformation correctness more difficult.If the source and the target languages have different views of program states and different atomic primitives,we cannot directly compare the state transitions made by the source and the target pro- grams.This is another reason that makes the aforementioned subset relation between sets of program traces in fully abstract semantics infeasible.For the same reason, many existing techniques for proving refinement or equivalence of programs in the same language cannot be applied either. 2.4.Different Observers Make Different Observations Concurrency introduces tensions between two kinds of observers:human beings(as external observers)and the parallel program contexts.External observers do not care about the implementation details of the source and the target programs.For them, intermediate state accesses(such as memory reads and writes)are silent steps (un- observable),and only external events (such as I/O operations)are observable.On the other hand,state accesses have effects on the parallel program contexts,and are not silent to them. If the refinement relation relates externally observable event traces only,it cannot have parallel compositionality,as we explained in Section 2.1.On the other hand,re- lating all state accesses of programs is too strong.Any reordering of state accesses or change of atomicity would fail the refinement. ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
Rely-Guarantee-Based Simulation A:5 tion traces. Each trace is an interleaving of state transitions made by the program itself and arbitrary transitions made by the environment. Then the refinement between programs can be defined as the subset relation between the corresponding trace sets. Since it considers all possible environments, the refinement relation has very nice compositionality, but unfortunately is too strong to formulate the correctness of many well-known transformations, including the four classes of transformations mentioned before: — Many concurrent languages (e.g., C++ [Boehm and Adve 2008]) do not give semantics to programs with data races (like the examples shown in Figure 1). Therefore the compilers only need to guarantee the semantics preservation of data-race-free programs. — When we prove that a fine-grained implementation of a concurrent object is a re- finement of an abstract atomic object, we can assume that all accesses to the object are made through the object’s methods only, e.g., a stack object can only be accessed through push and pop methods, and its internal data cannot be arbitrarily updated. — Usually the implementation of STM (e.g., TL2 [Dice et al. 2006]) ensures the atomicity of a transaction atomic{C} only when there are no data races. Therefore, the correctness of the transformation from high-level atomic blocks to fine-grained concurrent code assumes data-race-freedom in the source. — Many garbage-collected languages are type-safe and prohibit operations such as pointer arithmetic. Therefore the garbage collector could make corresponding assumptions about the mutators that run in parallel. In all these cases, the transformations of individual threads are allowed to make various assumptions about the environments. They do not have to ensure semantics preservation within all contexts. 2.3. Languages at Source and Target May Be Different The use of different languages at the source and the target levels makes the formulation of the transformation correctness more difficult. If the source and the target languages have different views of program states and different atomic primitives, we cannot directly compare the state transitions made by the source and the target programs. This is another reason that makes the aforementioned subset relation between sets of program traces in fully abstract semantics infeasible. For the same reason, many existing techniques for proving refinement or equivalence of programs in the same language cannot be applied either. 2.4. Different Observers Make Different Observations Concurrency introduces tensions between two kinds of observers: human beings (as external observers) and the parallel program contexts. External observers do not care about the implementation details of the source and the target programs. For them, intermediate state accesses (such as memory reads and writes) are silent steps (unobservable), and only external events (such as I/O operations) are observable. On the other hand, state accesses have effects on the parallel program contexts, and are not silent to them. If the refinement relation relates externally observable event traces only, it cannot have parallel compositionality, as we explained in Section 2.1. On the other hand, relating all state accesses of programs is too strong. Any reordering of state accesses or change of atomicity would fail the refinement. ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
A:6 Hongjin Liang et al. 2.5.Our Approach In this paper we propose a Rely-Guarantee-based Simulation(RGSim)<between the target and the source programs.It establishes a weak simulation,ensuring that for every externally observable event made by the target program there is a corresponding one in the source.We choose to view intermediate state accesses as silent steps,thus we can relate programs with different implementation details.This also makes our simulation independent of language details. To support parallel compositionality,our relation takes into account explicitly the expected interference between threads and their parallel environments.Inspired by the Rely-Guarantee(R-G)verification method [Jones 1983],we specify the interference using rely/guarantee conditions.In Rely-Guarantee reasoning,the rely condition R of a thread specifies the permitted state transitions that its environment may have,and its guarantee G specifies the possible transitions made by the thread itself.To ensure parallel threads can collaborate,we need to check the interference constraint,i.e.,the guarantee of each thread is permitted in the rely of every other.Then we can verify their parallel composition by separately verifying each thread,showing its behaviors under the rely condition indeed satisfy its guarantee.After parallel composition,the threads should be executed under their common environment (i.e.,the intersection of their relies)and guarantee all the possible transitions made by them (i.e.,the union of their guarantees). Parametrized with rely/guarantee conditions for the two levels,our relation (C,R,G)(C,R,G)talks about not only the target C and the source C,but also the interference R and g between C and its target-level environment,and R and G between C and its environment at the source level.Informally,(C,R,g)(C,R,G) says the executions of C under the environment R do not exhibit more observable behaviors than the executions of C under the environment R,and the state transi- tions of C and C satisfy g and G respectively.RGSim is now compositional,as long as the threads are composed with well-behaved environments only.The parallel com- positionality lemma is in the following form.If we know (C1,R1,91)(C1,R1,G1) and(C2,R2,92)(C2,R2,G2),and also the interference constraints are satisfied,i.e., g2≤R1,g1≤R2,G2∈R1 and GICR2,we could get (C1‖C2,R1nR2,G1Ug2)≤(C1‖C2,R1∩R2,G1UG2) The compositionality of RGSim gives us a proof theory for concurrent program trans- formations. Also different from fully abstract semantics for threads,which assumes arbitrary be- haviors of environments,RGSim allows us to instantiate the interference R,C,R and G differently for different assumptions about environments,therefore it can be used to verify the aforementioned four classes of transformations.For instance,if we want to prove that a transformation preserves the behaviors of data-race-free programs,we can specify the data-race-freedom in R and G.Then we are no longer concerned with the examples in Figure 1,both of which have data races. Example.Below we give an example of loop invariant hoisting to illustrate how RGSim works.The formal proofs are shown in Section 5.2.1. Target Code(C1) Source Code(C) local t; local t; t:=x+1; while(i<n){ while(i n){ t:=x+1; i:=i t; i :i +t; 2 ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
A:6 Hongjin Liang et al. 2.5. Our Approach In this paper we propose a Rely-Guarantee-based Simulation (RGSim) ≼ between the target and the source programs. It establishes a weak simulation, ensuring that for every externally observable event made by the target program there is a corresponding one in the source. We choose to view intermediate state accesses as silent steps, thus we can relate programs with different implementation details. This also makes our simulation independent of language details. To support parallel compositionality, our relation takes into account explicitly the expected interference between threads and their parallel environments. Inspired by the Rely-Guarantee (R-G) verification method [Jones 1983], we specify the interference using rely/guarantee conditions. In Rely-Guarantee reasoning, the rely condition R of a thread specifies the permitted state transitions that its environment may have, and its guarantee G specifies the possible transitions made by the thread itself. To ensure parallel threads can collaborate, we need to check the interference constraint, i.e., the guarantee of each thread is permitted in the rely of every other. Then we can verify their parallel composition by separately verifying each thread, showing its behaviors under the rely condition indeed satisfy its guarantee. After parallel composition, the threads should be executed under their common environment (i.e., the intersection of their relies) and guarantee all the possible transitions made by them (i.e., the union of their guarantees). Parametrized with rely/guarantee conditions for the two levels, our relation (C, R, G) ≼ (C, R, G) talks about not only the target C and the source C, but also the interference R and G between C and its target-level environment, and R and G between C and its environment at the source level. Informally, (C, R, G) ≼ (C, R, G) says the executions of C under the environment R do not exhibit more observable behaviors than the executions of C under the environment R, and the state transitions of C and C satisfy G and G respectively. RGSim is now compositional, as long as the threads are composed with well-behaved environments only. The parallel compositionality lemma is in the following form. If we know (C1, R1, G1) ≼ (C1, R1, G1) and (C2, R2, G2) ≼ (C2, R2, G2), and also the interference constraints are satisfied, i.e., G2 ⊆ R1, G1 ⊆ R2, G2 ⊆ R1 and G1 ⊆ R2, we could get (C1 ∥C2, R1 ∩ R2, G1 ∪ G2) ≼ (C1 ∥C2, R1 ∩ R2, G1 ∪ G2). The compositionality of RGSim gives us a proof theory for concurrent program transformations. Also different from fully abstract semantics for threads, which assumes arbitrary behaviors of environments, RGSim allows us to instantiate the interference R, G, R and G differently for different assumptions about environments, therefore it can be used to verify the aforementioned four classes of transformations. For instance, if we want to prove that a transformation preserves the behaviors of data-race-free programs, we can specify the data-race-freedom in R and G. Then we are no longer concerned with the examples in Figure 1, both of which have data races. Example. Below we give an example of loop invariant hoisting to illustrate how RGSim works. The formal proofs are shown in Section 5.2.1. Target Code (C1) local t; t := x + 1; while(i < n) { i := i + t; } ⇐ Source Code (C) local t; while(i < n) { t := x + 1; i := i + t; } ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
Rely-Guarantee-Based Simulation A:7 (Events)e:=.. (Labels)o::=eT (a)Events and Transition Labels (LState)a:=... (LEpr)E∈LState→IntL (LBExp)B∈LState→{true,false}i (LInstr)c LState-P((Labels x LState)Uabort}) (LStmt)C::=skip c C1;C2 if(B)C1 else C2 while(B)C ClC2 (LStep)→reP(LStmt\{skip}×LState)x Labels×(LStmt×LState)U{abort}) (b)The Low-Level Language (HState)∑:=.. (HExpr)E∈HState→Inti (HBExp) B∈HState→{true,false} (HInstr)c E HState-P((Labels x HState)U{abort}) (HStmt)C ::skip c C1:C2 if B then C1 else C2|while B do C CC2 (HStep)→H∈P(HStmt\{skip}×HState)×Labels×(HStmt×HState)U{abort}) (c)The High-Level Language Fig.2.Generic Languages at Target and Source Levels Benton [2004]has proved that the optimized code C preserves the sequential be- haviors of the source C.In a concurrent setting,this optimization is incorrect within arbitrary environments.For instance,if other threads may update x,the final values of i might be different at the two levels.In fact,this optimization works only when the environments R at both levels do not update x nor t.It requires the program- mers to mark x as a volatile variable in practice.The guarantees g of both Ci and C can be specified as arbitrary transitions.Then we can prove the RGSim relation (C1,R,g)(C,R,g)and conclude the correctness of the transformation. 3.BASIC TECHNICAL SETTINGS In this section,we present the source and the target programming languages.Then we define a basic refinement which naturally says the target has no more externally observable event traces than the source.We use as an intuitive formulation of the correctness of transformations.Our RGSim relation,which will be formally defined in Section 4,is proposed as a proof technique for E. 3.1.The Languages Following standard simulation techniques,we model the semantics of target and source programs as labeled transition systems.Before showing the languages,we first define events and labels in Figure 2(a).We leave the set of events unspecified here.It can be instantiated by program verifiers,depending on their interest (e.g.,input/output events).A label that will be associated with a state transition is either an event or r, which means the corresponding transition does not generate any event(i.e.,a silent step). ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
Rely-Guarantee-Based Simulation A:7 (Events) e ::= . . . (Labels) o ::= e | τ (a) Events and Transition Labels (LState) σ ::= . . . (LExpr) E ∈ LState → Int⊥ (LBExp) B ∈ LState → {true,false}⊥ (LInstr) c ∈ LState ⇀ P((Labels × LState) ∪ {abort}) (LStmt) C ::= skip | c | C1; C2 | if (B) C1 else C2 | while (B) C | C1 ∥C2 (LStep) −→L ∈ P((LStmt\{skip} × LState) × Labels × ((LStmt × LState) ∪ {abort})) (b) The Low-Level Language (HState) Σ ::= . . . (HExpr) E ∈ HState → Int⊥ (HBExp) B ∈ HState → {true,false}⊥ (HInstr) c ∈ HState ⇀ P((Labels × HState) ∪ {abort}) (HStmt) C ::= skip | c | C1; ; C2 | if B then C1 else C2 | while B do C | C19C2 (HStep) −→H ∈ P((HStmt\{skip} × HState) × Labels × ((HStmt × HState) ∪ {abort})) (c) The High-Level Language Fig. 2. Generic Languages at Target and Source Levels Benton [2004] has proved that the optimized code C1 preserves the sequential behaviors of the source C. In a concurrent setting, this optimization is incorrect within arbitrary environments. For instance, if other threads may update x, the final values of i might be different at the two levels. In fact, this optimization works only when the environments R at both levels do not update x nor t. It requires the programmers to mark x as a volatile variable in practice. The guarantees G of both C1 and C can be specified as arbitrary transitions. Then we can prove the RGSim relation (C1, R, G) ≼ (C, R, G) and conclude the correctness of the transformation. 3. BASIC TECHNICAL SETTINGS In this section, we present the source and the target programming languages. Then we define a basic refinement ⊑, which naturally says the target has no more externally observable event traces than the source. We use ⊑ as an intuitive formulation of the correctness of transformations. Our RGSim relation, which will be formally defined in Section 4, is proposed as a proof technique for ⊑. 3.1. The Languages Following standard simulation techniques, we model the semantics of target and source programs as labeled transition systems. Before showing the languages, we first define events and labels in Figure 2(a). We leave the set of events unspecified here. It can be instantiated by program verifiers, depending on their interest (e.g., input/output events). A label that will be associated with a state transition is either an event or τ , which means the corresponding transition does not generate any event (i.e., a silent step). ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
A:8 Hongjin Liang et al. The target language,which we also call the low-level language,is shown in Fig- ure 2(b).We abstract away the forms of states,expressions and primitive instructions in the language.An arithmetic expression E is modeled as a function from states to integers lifted with an undefined value L.Boolean expressions Bs are modeled simi- larly.An instruction c is a partial function from states to sets of label and state pairs, describing the state transitions and the events it generates.We use P(-)to denote the power set.Unsafe executions lead to abort.Note that the semantics of an instruction could be non-deterministic.Moreover,it might be undefined on some states,making it possible to model blocking operations such as acquiring a lock. Statements are either primitive instructions or compositions of them.skip is a spe- cial statement used as a flag to show the end of executions.When it is sequentially composed with other statements,it has no computational effects.A single-step execu- tion of statements is modeled as a labeled transition_,which is a triple of an initial program configuration (a pair of statement and state),a label and a resulting configuration.It is undefined when the initial statement is skip.The step aborts if an unsafe instruction is executed. The high-level language(source language)is defined similarly in Figure 2(c),but it is important to note that its states and primitive instructions may be different from those in the low-level language.The compound statements are almost the same as their low-level counterparts.C1;C2 and CiC2 are sequential and parallel composi- tions of Ci and C2 respectively.Note that we choose to use the same set of compound statements in the two languages for simplicity only.This is not required by our simu- lation relation,although the analogous program constructs of the two languages (e.g., parallel compositions Ci C2 and CiC2)make it convenient for us to discuss the compositionality later. (o,)∈c abort∈c ∑tdom(c) (c,)。(skp,') (c,)-abort (c,)→(c,) (C1,)。(C1,') (skipskip,)→(shp,) (CllC2,)。(C"C2,') (C2,)(C2,) (C1,)→abort or(C2,)→abort (C川lC2,)°(C1lC2,Σ') (C1川C2,)→abort Fig.3.Selected Operational Semantics Rules of the High-Level Language Figure 3 shows part of the definition of_,which gives the high-level opera- tional semantics of statements.We often omit the subscript H(or L)in_(or -and the label on top of the arrow when it is T.The semantics is mostly stan- dard.We only show the rules for primitive instructions and parallel compositions here. Note that when a primitive instruction c is blocked at state (i.e.,g dom(c)),we let the program configuration reduce to itself.For example,the instruction lock(1)would be blocked when l is not 0,making it be repeated until 1 becomes 0;whereas unlock(1) simply sets 1 to 0 at any time and would never be blocked.Primitive instructions in the high-level and low-level languages are atomic in the interleaving semantics.Below we use-->*-for zero or multiple-step transitions with no events generated,and__ for multiple-step transitions with only one event e generated. ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
A:8 Hongjin Liang et al. The target language, which we also call the low-level language, is shown in Figure 2(b). We abstract away the forms of states, expressions and primitive instructions in the language. An arithmetic expression E is modeled as a function from states to integers lifted with an undefined value ⊥. Boolean expressions Bs are modeled similarly. An instruction c is a partial function from states to sets of label and state pairs, describing the state transitions and the events it generates. We use P( ) to denote the power set. Unsafe executions lead to abort. Note that the semantics of an instruction could be non-deterministic. Moreover, it might be undefined on some states, making it possible to model blocking operations such as acquiring a lock. Statements are either primitive instructions or compositions of them. skip is a special statement used as a flag to show the end of executions. When it is sequentially composed with other statements, it has no computational effects. A single-step execution of statements is modeled as a labeled transition −→L , which is a triple of an initial program configuration (a pair of statement and state), a label and a resulting configuration. It is undefined when the initial statement is skip. The step aborts if an unsafe instruction is executed. The high-level language (source language) is defined similarly in Figure 2(c), but it is important to note that its states and primitive instructions may be different from those in the low-level language. The compound statements are almost the same as their low-level counterparts. C1; ; C2 and C19C2 are sequential and parallel compositions of C1 and C2 respectively. Note that we choose to use the same set of compound statements in the two languages for simplicity only. This is not required by our simulation relation, although the analogous program constructs of the two languages (e.g., parallel compositions C1 ∥ C2 and C1 9C2) make it convenient for us to discuss the compositionality later. (o, Σ ′ ) ∈ c Σ (c, Σ) o −→ (skip, Σ ′ ) abort ∈ c Σ (c, Σ) −→ abort Σ ̸∈ dom(c) (c, Σ) −→ (c, Σ) (skip9skip, Σ) −→ (skip, Σ) (C1, Σ) o −→ (C ′ 1, Σ ′ ) (C19C2, Σ) o −→ (C ′ 19C2, Σ ′ ) (C2, Σ) o −→ (C ′ 2, Σ ′ ) (C19C2, Σ) o −→ (C19C ′ 2, Σ ′ ) (C1, Σ) −→ abort or (C2, Σ) −→ abort (C19C2, Σ) −→ abort Fig. 3. Selected Operational Semantics Rules of the High-Level Language Figure 3 shows part of the definition of −→H , which gives the high-level operational semantics of statements. We often omit the subscript H (or L) in −→H (or −→L ) and the label on top of the arrow when it is τ . The semantics is mostly standard. We only show the rules for primitive instructions and parallel compositions here. Note that when a primitive instruction c is blocked at state Σ (i.e., Σ ̸∈ dom(c)), we let the program configuration reduce to itself. For example, the instruction lock(l) would be blocked when l is not 0, making it be repeated until l becomes 0; whereas unlock(l) simply sets l to 0 at any time and would never be blocked. Primitive instructions in the high-level and low-level languages are atomic in the interleaving semantics. Below we use −→∗ for zero or multiple-step transitions with no events generated, and e −→∗ for multiple-step transitions with only one event e generated. ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
Rely-Guarantee-Based Simulation A:9 3.2.The Event Trace Refinement Now we can formally define the refinement relation that relates the set of externally observable event traces generated by the target and the source programs.A trace is a sequence of events e,and may end with a termination marker done or a fault marker abort. (EutTrace)&::=e done abort e::8 Definition 3.1 (Event Trace Set).ETrSetn(C,o)represents a set of external event traces produced by C in n steps from the state o: (1)ETrSeto(C,a)e; (2)ETrSetn+1(C,σ)≌ {E|(C,o)-→(C',a)AEEETrSetn(C,a) V(C,a)(C',a')AE'EETrSetn(C',')AE=e::E' V(C,o)-abort A&=abort VC=skip AE=done}. We define ETrSet(C,o)as U ETrSetn(C,o). We overload the notation and use ETrSet(C,>)for the high-level language.Note that we treat abort as a specific behavior instead of undefined arbitrary behaviors.The choices should depend on applications.The ideas in the paper should also apply for the latter setting,though we need to change our refinement and simulation relations defined below. Then we define an event trace refinement as the subset relation between event trace sets,which is similar to Leroy's refinement property [Leroy 2009]. Definition 3.2 (Event Trace Refinement).We say (C,o)is an e-trace refinement of (C,>),i.e.,(C,)(C >)if and only if ETrSet(C,a)C ETrSet(C,E) The refinement is defined for program configurations instead of for code only because the initial states may affect the behaviors of programs.In this case,the transformation T should translate states as well as code.We overload the notation and use T()to represent the state transformation,and use Cr C for o,∑.o=T()→(C,σ)三(C,), then Correct(T)defined in formula(1.1)can be reformulated as Correct(T)≌VC,C.C=T(C)→CgrC, (3.1) From the above e-trace refinement definition,we can derive an e-trace equivalence relation by requiring both directions hold: (C,σ)≈(C,)(C,o)E(C,)A(C,)E(C,o), and use C≈rC for Vo,∑.o=T()→(C,o)≈(C,). 4.THE RGSIM RELATION The e-trace refinement is defined directly over the externally observable behaviors of programs.It is intuitive,and also abstract in that it is independent of language details.However,as we explained before,it is not compositional w.r.t.parallel composi- tions.In this section we propose RGSim,which can be viewed as a compositional proof technique that allows us to derive the simple e-trace refinement and then verify the corresponding transformation T. ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
Rely-Guarantee-Based Simulation A:9 3.2. The Event Trace Refinement Now we can formally define the refinement relation ⊑ that relates the set of externally observable event traces generated by the target and the source programs. A trace is a sequence of events e, and may end with a termination marker done or a fault marker abort. (EvtTrace) E ::= ϵ | done | abort | e::E Definition 3.1 (Event Trace Set). ETrSetn(C, σ) represents a set of external event traces produced by C in n steps from the state σ: (1) ETrSet0(C, σ) , {ϵ} ; (2) ETrSetn+1(C, σ) , {E | (C, σ) −→ (C ′ , σ′ ) ∧ E ∈ETrSetn(C ′ , σ′ ) ∨ (C, σ) e −→ (C ′ , σ′ ) ∧ E′∈ETrSetn(C ′ , σ′ ) ∧ E =e::E ′ ∨ (C, σ) −→ abort ∧ E =abort ∨ C =skip ∧ E =done} . We define ETrSet(C, σ) as ∪ n ETrSetn(C, σ). We overload the notation and use ETrSet(C, Σ) for the high-level language. Note that we treat abort as a specific behavior instead of undefined arbitrary behaviors. The choices should depend on applications. The ideas in the paper should also apply for the latter setting, though we need to change our refinement and simulation relations defined below. Then we define an event trace refinement as the subset relation between event trace sets, which is similar to Leroy’s refinement property [Leroy 2009]. Definition 3.2 (Event Trace Refinement). We say (C, σ) is an e-trace refinement of (C, Σ), i.e., (C, σ) ⊑ (C, Σ), if and only if ETrSet(C, σ) ⊆ ETrSet(C, Σ). The refinement is defined for program configurations instead of for code only because the initial states may affect the behaviors of programs. In this case, the transformation T should translate states as well as code. We overload the notation and use T(Σ) to represent the state transformation, and use C ⊑T C for ∀σ, Σ. σ = T(Σ) =⇒ (C, σ) ⊑ (C, Σ), then Correct(T) defined in formula (1.1) can be reformulated as Correct(T) , ∀C, C. C = T(C) =⇒ C ⊑T C . (3.1) From the above e-trace refinement definition, we can derive an e-trace equivalence relation by requiring both directions hold: (C, σ) ≈ (C, Σ) , (C, σ) ⊑ (C, Σ) ∧ (C, Σ) ⊑ (C, σ), and use C ≈T C for ∀σ, Σ. σ = T(Σ) =⇒ (C, σ) ≈ (C, Σ). 4. THE RGSIM RELATION The e-trace refinement is defined directly over the externally observable behaviors of programs. It is intuitive, and also abstract in that it is independent of language details. However, as we explained before, it is not compositional w.r.t. parallel compositions. In this section we propose RGSim, which can be viewed as a compositional proof technique that allows us to derive the simple e-trace refinement and then verify the corresponding transformation T. ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY
A:10 Hongjin Liang et al. 0 B R RMl (a)a-Related Transitions (b)The Side Condition of TRANS Fig.4.Related Transitions 4.1.The Definition Our co-inductively defined RGSim relation is in the form of (C,o,R,9) (C,E,R,G),which is a simulation between program configurations(C,o)and(C,>)It is parametrized with the rely and guarantee conditions at the low level and the high level,which are binary relations over states: R,gE P(LState x LState),R,G E P(HState x HState). The simulation also takes two additional parameters:the step invariant a and the post- condition y,which are both relations between the low-level and the high-level states. a,Y E P(LState x HState). Before we formally define RGSim in Definition 4.2,we first introduce the a-related transitions as follows. Definition 4.1 (a-Related Transitions). (R,)a{(a,σ),(E,)|(o,σ)∈RA(E,)∈RA(o,)∈aA(σ,)∈a}. (R,R)represents a set of the a-related transitions in R and R,putting together the corresponding transitions in R and R that can be related by a,as illustrated in Fig- ure 4(a).(g,G)is defined in the same way. Definition 4.2 (RGSim).Whenever (C,o,R,g):(C,,R,G),then (o,>)Ea and the following are true: (1)if (C,)>(C,'),then there exist C'and >such that (C,)-→*(C,),(a,o),(②,)∈(g,G)aand(C,o,R,g)≤an(C,y,R,G (2)if (C,)(C,o'),then there exist C'and such that (C,)e,*(C','),(o,o),(E,)∈(G,G)aand(C,o',R,g)≤ay(C,',R,G; (3)if C=skip,then there exists >such that (C,)-→*(skip,),(a,o),(,)∈(g,G*〉a,(a,)∈y andy≤a; (4)if(C,o)-→abort,then(C,)-→*abort; (⑤)if(o,a),(∑,)∈(R,R*)a,then(C,o',R,g)≤a(C,',R,G). Then,(C,R,g):(C,R,G)iff for all o and if (,)then (C.,R,g)(C.,R.G).Here the precondition E P(LState x HState)is used to relate the initial states o and > Informally,(C,a,R,g)(C,R,G)says the low-level configuration (C,o)is sim- ulated by the high-level configuration(C,>)with behaviors g and G respectively,no matter how their environments R and R interfere with them.It requires the following hold for every execution of C: ACM Transactions on Programming Languages and Systems,Vol.V,No.N,Article A,Publication date:January YYYY
A:10 Hongjin Liang et al. σ σ ′ Σ Σ ′ α ❄ R α ❄ R σ σ ′ θ θ ′ Σ Σ ′ α β ❄ R ❄ RM ❄ R ∗ α β (a) α-Related Transitions (b) The Side Condition of TRANS Fig. 4. Related Transitions 4.1. The Definition Our co-inductively defined RGSim relation is in the form of (C, σ, R, G) ≼α;γ (C, Σ, R, G), which is a simulation between program configurations (C, σ) and (C, Σ). It is parametrized with the rely and guarantee conditions at the low level and the high level, which are binary relations over states: R, G ∈ P(LState × LState), R, G ∈ P(HState × HState). The simulation also takes two additional parameters: the step invariant α and the postcondition γ, which are both relations between the low-level and the high-level states. α, γ ∈ P(LState × HState). Before we formally define RGSim in Definition 4.2, we first introduce the α-related transitions as follows. Definition 4.1 (α-Related Transitions). ⟨R, R⟩α , {((σ, σ′ ),(Σ, Σ ′ )) | (σ, σ′ ) ∈ R ∧ (Σ, Σ ′ ) ∈ R ∧ (σ, Σ) ∈ α ∧ (σ ′ , Σ ′ ) ∈ α} . ⟨R, R⟩α represents a set of the α-related transitions in R and R, putting together the corresponding transitions in R and R that can be related by α, as illustrated in Figure 4(a). ⟨G, G⟩α is defined in the same way. Definition 4.2 (RGSim). Whenever (C, σ, R, G) ≼α;γ (C, Σ, R, G), then (σ, Σ) ∈ α and the following are true: (1) if (C, σ) −→ (C ′ , σ′ ), then there exist C ′ and Σ ′ such that (C, Σ) −→∗ (C ′ , Σ ′ ), ((σ, σ′ ),(Σ, Σ ′ )) ∈ ⟨G, G∗ ⟩α and (C ′ , σ′ , R, G) ≼α;γ (C ′ , Σ ′ , R, G); (2) if (C, σ) e −→ (C ′ , σ′ ), then there exist C ′ and Σ ′ such that (C, Σ) e −→∗ (C ′ , Σ ′ ), ((σ, σ′ ),(Σ, Σ ′ )) ∈ ⟨G, G∗ ⟩α and (C ′ , σ′ , R, G) ≼α;γ (C ′ , Σ ′ , R, G); (3) if C = skip, then there exists Σ ′ such that (C, Σ) −→∗ (skip, Σ ′ ), ((σ, σ),(Σ, Σ ′ )) ∈ ⟨G, G∗ ⟩α, (σ, Σ ′ ) ∈ γ and γ ⊆ α; (4) if (C, σ) −→ abort, then (C, Σ) −→∗ abort; (5) if ((σ, σ′ ),(Σ, Σ ′ )) ∈ ⟨R, R ∗ ⟩α, then (C, σ′ , R, G) ≼α;γ (C, Σ ′ , R, G). Then, (C, R, G) ≼α;ζnγ (C, R, G) iff for all σ and Σ, if (σ, Σ) ∈ ζ, then (C, σ, R, G) ≼α;γ (C, Σ, R, G). Here the precondition ζ ∈ P(LState × HState) is used to relate the initial states σ and Σ. Informally, (C, σ, R, G) ≼α;γ (C, Σ, R, G) says the low-level configuration (C, σ) is simulated by the high-level configuration (C, Σ) with behaviors G and G respectively, no matter how their environments R and R interfere with them. It requires the following hold for every execution of C: ACM Transactions on Programming Languages and Systems, Vol. V, No. N, Article A, Publication date: January YYYY