Compositional Verification of Termination-Preserving Refinement of Concurrent Programs Hongjin Liangt Xinyu Fengt Zhong Shaot University of Science and Technology of China +Yale University lhj1018@mail.ustc.edu.cn xyfeng@ustc.edu.cn zhong.shao@yale.edu Abstract should termination of the source be preserved too by the target?If Many verification problems can be reduced to refinement verifica- yes,how to verify such refinement? Preservation of termination is an indispensable requirement tion.However,existing work on verifying refinement of concurrent programs either fails to prove the preservation of termination.al- in many refinement applications.For instance,compilation and lowing a diverging program to trivially refine any programs,or is optimizations are not allowed to transform a terminating source difficult to apply in compositional thread-local reasoning.In this program to a diverging (non-terminating)target.Also,implemen- paper,we first propose a new simulation technique,which estab- tations of concurrent data structures are often expected to have lishes termination-preserving refinement and is a congruence with progress guarantees (e.g.,lock-freedom and wait-freedom)in ad- respect to parallel composition.We then give a proof theory for the dition to linearizability.The requirements are equivalent to some simulation,which is the first Hoare-style concurrent program logic contextual refinements that preserve the termination of client pro- supporting termination-preserving refinement proofs.We show two grams [12]. key applications of our logic,i.e.,verifying linearizability and lock- Most existing approaches for verifying concurrent program freedom together for fine-grained concurrent objects,and verifying refinement,including simulations (e.g.,[11]),logical relations full correctness of optimizations of concurrent algorithms. (e.g..[221),and refinement logics (e.g.,[211).do not reason about the preservation of termination.As a result,a program that Categories and Subject Descriptors D.2.4 [Software Engineer- does an infinite loop without generating any external events,e.g ing]:Software/Program Verification-Correctness proofs,Formal while true do skip,would trivially refine any source program methods;F.3.1 [Logics and Meanings of Programs]:Specifying (just like that it trivially satisfies partial correctness in Hoare logic). and Verifying and Reasoning about Programs Certainly this kind of refinement is not acceptable in the applica- General Terms Theory,Verification tions mentioned above. CompCert [9]addresses the problem by introducing a well- Keywords Concurrency,Refinement,Termination Preservation, founded order in the simulation,but it works only for sequential Rely-Guarantee Reasoning.Simulation programs.It is difficult to apply this idea to do thread-local ver- ification of concurrent program refinement,which enables us to 1.Introduction know Ti Il T2 S1 ll S2 by proving T1 S1 and T2E S2 Verifying refinement between programs is the crux of many ver- In practice,the termination preservation in the refinement proofs of ification problems.For instance.reasoning about compilation or individual threads could be easily broken by the interference from their environments (i.e.,other threads running in parallel).For in- program transformations requires proving that every target pro- gram is a refinement of its source [9].In concurrent settings,re- stance,a method call of a lock-free data structure (e.g.,Treiber cent work [4.12]shows that the correctness of concurrent data stack)may never terminate when other threads call the methods structures and libraries can be characterized via some forms of con- and update the shared memory infinitely often.As we will explain in Sec.2,the key challenge here is to effectively specify the en textual refinements,i.e.,every client that calls the concrete library vironments'effects on the termination preservation of individual methods should refine the client with some abstract atomic oper- ations.Verification of concurrent garbage collectors [11]and OS threads.As far as we know,no previous work can use "composi- kernels [18]can also be reduced to refinement verification. tional"thread-local reasoning to verify termination-preserving re- finement between(whole)concurrent programs. Refinement from the source program S to the target T,written as T S,requires that T have no more observable behaviors In this paper,we first propose novel rely/guarantee conditions than S.Usually observable behaviors include the traces of external which can effectively specify the interference over the termina- tion properties between a thread and its environment.Traditional events such as 1/O operations and runtime errors.The question is rely/guarantee conditions [8]are binary relations of program states and they specify the state updates.We extend them with a boolean tag indicating whether a state update may let the thread or its envi- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed ronment make more moves for profit or commercial advantage and that copies bear this notice and the full citation With the help of our new rely/guarantee conditions,we then on the first page.Copyrights for components of this work owned by others than ACM propose a new simulation RGSim-T,and a new Hoare-style pro- must be honored.Abstracting with credit is permitted.To copy otherwise.or republish gram logic,both of which support compositional verification to post on servers or to redistribute to lists,requires prior specific permission and/or a of termination-preserving refinement of concurrent programs fee.Request permissions from permissions@acm.org. Our work is based on our previous compositional simulation RGSim [11](which unfortunately cannot preserve termination), http:/ldk.doi.org/10.1145/2603088.2603123 and is inspired by Hoffmann et al.'s program logic for lock-
Compositional Verification of Termination-Preserving Refinement of Concurrent Programs Hongjin Liang† Xinyu Feng† Zhong Shao‡ †University of Science and Technology of China ‡Yale University lhj1018@mail.ustc.edu.cn xyfeng@ustc.edu.cn zhong.shao@yale.edu Abstract Many verification problems can be reduced to refinement verification. However, existing work on verifying refinement of concurrent programs either fails to prove the preservation of termination, allowing a diverging program to trivially refine any programs, or is difficult to apply in compositional thread-local reasoning. In this paper, we first propose a new simulation technique, which establishes termination-preserving refinement and is a congruence with respect to parallel composition. We then give a proof theory for the simulation, which is the first Hoare-style concurrent program logic supporting termination-preserving refinement proofs. We show two key applications of our logic, i.e., verifying linearizability and lockfreedom together for fine-grained concurrent objects, and verifying full correctness of optimizations of concurrent algorithms. Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification – Correctness proofs, Formal methods; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs General Terms Theory, Verification Keywords Concurrency, Refinement, Termination Preservation, Rely-Guarantee Reasoning, Simulation 1. Introduction Verifying refinement between programs is the crux of many verification problems. For instance, reasoning about compilation or program transformations requires proving that every target program is a refinement of its source [9]. In concurrent settings, recent work [4, 12] shows that the correctness of concurrent data structures and libraries can be characterized via some forms of contextual refinements, i.e., every client that calls the concrete library methods should refine the client with some abstract atomic operations. Verification of concurrent garbage collectors [11] and OS kernels [18] can also be reduced to refinement verification. Refinement from the source program S to the target T , written as T ⊑ S, requires that T have no more observable behaviors than S. Usually observable behaviors include the traces of external events such as I/O operations and runtime errors. The question is, Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. CSL-LICS 2014, July 14–18, 2014, Vienna, Austria. Copyright c 2014 ACM 978-1-4503-2886-9. . . $15.00. http://dx.doi.org/10.1145/2603088.2603123 should termination of the source be preserved too by the target? If yes, how to verify such refinement? Preservation of termination is an indispensable requirement in many refinement applications. For instance, compilation and optimizations are not allowed to transform a terminating source program to a diverging (non-terminating) target. Also, implementations of concurrent data structures are often expected to have progress guarantees (e.g., lock-freedom and wait-freedom) in addition to linearizability. The requirements are equivalent to some contextual refinements that preserve the termination of client programs [12]. Most existing approaches for verifying concurrent program refinement, including simulations (e.g., [11]), logical relations (e.g., [22]), and refinement logics (e.g., [21]), do not reason about the preservation of termination. As a result, a program that does an infinite loop without generating any external events, e.g. while true do skip, would trivially refine any source program (just like that it trivially satisfies partial correctness in Hoare logic). Certainly this kind of refinement is not acceptable in the applications mentioned above. CompCert [9] addresses the problem by introducing a wellfounded order in the simulation, but it works only for sequential programs. It is difficult to apply this idea to do thread-local verification of concurrent program refinement, which enables us to know T1 k T2 ⊑ S1 k S2 by proving T1 ⊑ S1 and T2 ⊑ S2. In practice, the termination preservation in the refinement proofs of individual threads could be easily broken by the interference from their environments (i.e., other threads running in parallel). For instance, a method call of a lock-free data structure (e.g., Treiber stack) may never terminate when other threads call the methods and update the shared memory infinitely often. As we will explain in Sec. 2, the key challenge here is to effectively specify the environments’ effects on the termination preservation of individual threads. As far as we know, no previous work can use “compositional” thread-local reasoning to verify termination-preserving re- finement between (whole) concurrent programs. In this paper, we first propose novel rely/guarantee conditions which can effectively specify the interference over the termination properties between a thread and its environment. Traditional rely/guarantee conditions [8] are binary relations of program states and they specify the state updates. We extend them with a boolean tag indicating whether a state update may let the thread or its environment make more moves. With the help of our new rely/guarantee conditions, we then propose a new simulation RGSim-T, and a new Hoare-style program logic, both of which support compositional verification of termination-preserving refinement of concurrent programs. Our work is based on our previous compositional simulation RGSim [11] (which unfortunately cannot preserve termination), and is inspired by Hoffmann et al.’s program logic for lock-
(a)source code S: X++; 1 local t,done :false; s--4s S1- →S2 2 while(!done)C 1 local t; 3 t:=x; or 2 done :cas(&x,t,t+1); TI T1→T2→T3→T4 3x:=t+1; 5] (b)target code T (c)target code Te (with T<T) (with T2<Ti|and |Tal T3|) (a)T<S b)R上T<'S Figure 1.Counters. Figure 2.Simulation diagrams freedom [7](which does not support refinement verification and has limitations on local reasoning,as we will explain in Sec.7),but finement and then discuss its problems in termination-preserving makes the following new contributions: concurrent refinement verification. Fig.1(a)shows the source code S that increments x.In a We design a simulation,RGSim-T,to verify termination- sequential setting,it can be implemented as T in Fig.1(b).To preserving refinement of concurrent programs.As an exten- show that T refines S,a natural way is to prove that they satisfy sion of RGSim,it considers the interference between threads the (weak)simulation in Fig.2(a). and the environments by taking our novel rely/guarantee condi- The simulation first establishes some consistency relation be- tions as parameters.RGSim-T is compositional.It allows us to tween the source and the target (note S and T here are whole pro- thread-locally reason about the preservation of whole-program gram configurations consisting of both code and states).Then it termination,but without enforcing the preservation of individ- requires that there is some correspondence between the execution ual threads'termination,thus can be applied to many practical of the target and the source so that the relation is always preserved refinement applications. Every execution step of the target must either correspond to one We propose the first program logic that supports compositional or more steps of the source (the left part of Fig.2(a)),or corre verification of termination-preserving refinement of concurrent spond to zero steps(the right part;Let's ignore the requirement of programs.In addition to a set of compositionality (binary rea- IT<T]for now).! soning)rules,we also provide a set of unary rules (built upon For our example in Fig.1,the simulation requires that x at the the unary program logic LRG [3])that can reason about con- target level have the same value with x in the source.We let line 2 ditional correspondence between the target and the source,a at T correspond to zero steps of S,and line 3 correspond to the usual situation in concurrent refinement (see Sec.2).The logic single step of S. enables compositional verification of nested loops and sup- Such a simulation,however,has two problems for termination- ports programs with infinite nondeterminism.The soundness of preserving concurrent refinement verification.First,it does not the logic ensures RGSim-T between the target and the source, require the target to preserve the termination of the source.Since which implies the termination-preserving refinement. a silent step at the target level may correspond to zero steps at the source (the right part of Fig.2(a)),the target may execute such Our simulation and logic are general.They can be applied steps infinitely many times and never correspond to a step at the to verify linearizability and lock-freedom together for fine- source.For instance,if we insert while true do skip before grained concurrent objects,or to verify the full correctness of line 2 in T.the simulation still holds,but T diverges now.To optimizations of concurrent programs,i.e.,the optimized pro- address this problem.CompCert [9]introduces a metric T over gram preserves behaviors on both functionality and termination the target program configurations,which is equipped with a well- of the original one.We demonstrate the effectiveness of our founded order <If a target step corresponds to no moves of the logic by verifying linearizability and lock-freedom of Treiber source.the metric over the target programs should strictly decrease stack [20],Michael-Scott queue [14]and DGLM queue [2],the (i.e.,the condition T<T in Fig.2(a)).Since the well-founded full correctness of synchronous queue [16]and the equivalence order ensures that there are no infinite decreasing chains,execution between TTAS lock and TAS lock implementations [6]. of the target will finally correspond to at least one step at the source It is important to note that our simulation and logic ensure that Second,it is not compositional w.r.t.parallel compositions. the target preserves the termination/divergence behaviors of the Though T S holds,(T ll T):print()(Sll S);print() source.The target could diverge if the source diverges.Therefore does not hold since the left side may print out 1,which is im- our logic is not for verifying total correctness (i.e.,partial correct- possible for the source on the right.The problem is that when we ness termination).It is actually more powerful and general.We prove TS.T and S are viewed as closed programs and the give more discussions on this point in Secs.2.2 and 5.2. interference from environments is ignored.To get the parallel com- In the rest of this paper,we first analyze the challenges and positionality,we follow the ideas in our previous work RGSim [11] explain our approach informally in Sec.2.Then we formulate and parameterize the simulation with the interference between the termination-preserving refinement in Sec.3.We present our new programs and their environments. simulation RGSim-T in Sec.4 and our new program logic in Sec.5 As shown in Fig.2(b),the new simulation <is parameterized We summarize examples that we verified in Sec.6,and discuss the with the environment interference R,i.e.the set of all possible related work and conclude in Sec.7. transitions of the environments at the target and source levels.Here we use thin arrows for the transitions of the current thread at the source and the target levels (e.g.,from T to T2 and from T3 to T4 2.Informal Development in Fig.2(b)),and thick arrows for the possible environment steps Below we informally explain the challenges and our solutions in (e.g.,from T2 to T3 and from S1 to S2 in the figure).We require our design of the simulation and the logic respectively. the simulation to be preserved by R. 2.1 Simulation INote here we only discuss silent steps (a.k.a.T-steps)which produce no external events.The simulation also requires that every step with an extemal Simulation is a standard technique for refinement verification.We event at the target level must correspond to one step at the source with the start by showing a simple simulation for verifying sequential re- same event plus zero or multiple T-steps
(a) source code S: x++; 1 local t; 2 t := x; 3 x := t + 1; (b) target code Tb 1 local t, done := false; 2 while (! done) { 3 t := x; 4 done := cas(&x, t, t+1); 5 } (c) target code Tc Figure 1. Counters. freedom [7] (which does not support refinement verification and has limitations on local reasoning, as we will explain in Sec. 7), but makes the following new contributions: • We design a simulation, RGSim-T, to verify terminationpreserving refinement of concurrent programs. As an extension of RGSim, it considers the interference between threads and the environments by taking our novel rely/guarantee conditions as parameters. RGSim-T is compositional. It allows us to thread-locally reason about the preservation of whole-program termination, but without enforcing the preservation of individual threads’ termination, thus can be applied to many practical refinement applications. • We propose the first program logic that supports compositional verification of termination-preserving refinement of concurrent programs. In addition to a set of compositionality (binary reasoning) rules, we also provide a set of unary rules (built upon the unary program logic LRG [3]) that can reason about conditional correspondence between the target and the source, a usual situation in concurrent refinement (see Sec. 2). The logic enables compositional verification of nested loops and supports programs with infinite nondeterminism. The soundness of the logic ensures RGSim-T between the target and the source, which implies the termination-preserving refinement. • Our simulation and logic are general. They can be applied to verify linearizability and lock-freedom together for finegrained concurrent objects, or to verify the full correctness of optimizations of concurrent programs, i.e., the optimized program preserves behaviors on both functionality and termination of the original one. We demonstrate the effectiveness of our logic by verifying linearizability and lock-freedom of Treiber stack [20], Michael-Scott queue [14] and DGLM queue [2], the full correctness of synchronous queue [16] and the equivalence between TTAS lock and TAS lock implementations [6]. It is important to note that our simulation and logic ensure that the target preserves the termination/divergence behaviors of the source. The target could diverge if the source diverges. Therefore our logic is not for verifying total correctness (i.e., partial correctness + termination). It is actually more powerful and general. We give more discussions on this point in Secs. 2.2 and 5.2. In the rest of this paper, we first analyze the challenges and explain our approach informally in Sec. 2. Then we formulate termination-preserving refinement in Sec. 3. We present our new simulation RGSim-T in Sec. 4 and our new program logic in Sec. 5. We summarize examples that we verified in Sec. 6, and discuss the related work and conclude in Sec. 7. 2. Informal Development Below we informally explain the challenges and our solutions in our design of the simulation and the logic respectively. 2.1 Simulation Simulation is a standard technique for refinement verification. We start by showing a simple simulation for verifying sequential reS S ′ T T ′ + or S T T ′ (with |T ′ | < |T|) (a) T S S1 T1 T2 S2 T3 T4 ′ ′ R ′ ′ (with |T2| < |T1| and |T4| < |T3|) (b) R ⊢ T ′ S Figure 2. Simulation diagrams. finement and then discuss its problems in termination-preserving concurrent refinement verification. Fig. 1(a) shows the source code S that increments x. In a sequential setting, it can be implemented as Tb in Fig. 1(b). To show that Tb refines S, a natural way is to prove that they satisfy the (weak) simulation in Fig. 2(a). The simulation first establishes some consistency relation between the source and the target (note S and T here are whole program configurations consisting of both code and states). Then it requires that there is some correspondence between the execution of the target and the source so that the relation is always preserved. Every execution step of the target must either correspond to one or more steps of the source (the left part of Fig. 2(a)), or correspond to zero steps (the right part; Let’s ignore the requirement of |T ′ | < |T | for now).1 For our example in Fig. 1, the simulation requires that x at the target level have the same value with x in the source. We let line 2 at Tb correspond to zero steps of S, and line 3 correspond to the single step of S. Such a simulation, however, has two problems for terminationpreserving concurrent refinement verification. First, it does not require the target to preserve the termination of the source. Since a silent step at the target level may correspond to zero steps at the source (the right part of Fig. 2(a)), the target may execute such steps infinitely many times and never correspond to a step at the source. For instance, if we insert while true do skip before line 2 in Tb, the simulation still holds, but Tb diverges now. To address this problem, CompCert [9] introduces a metric |T | over the target program configurations, which is equipped with a wellfounded order <. If a target step corresponds to no moves of the source, the metric over the target programs should strictly decrease (i.e., the condition |T ′ | < |T | in Fig. 2(a)). Since the well-founded order ensures that there are no infinite decreasing chains, execution of the target will finally correspond to at least one step at the source. Second, it is not compositional w.r.t. parallel compositions. Though Tb S holds, (Tb k Tb); print(x) (S k S); print(x) does not hold since the left side may print out 1, which is impossible for the source on the right. The problem is that when we prove Tb S, Tb and S are viewed as closed programs and the interference from environments is ignored. To get the parallel compositionality, we follow the ideas in our previous work RGSim [11] and parameterize the simulation with the interference between the programs and their environments. As shown in Fig. 2(b), the new simulation ′ is parameterized with the environment interference R, i.e. the set of all possible transitions of the environments at the target and source levels. Here we use thin arrows for the transitions of the current thread at the source and the target levels (e.g., from T1 to T2 and from T3 to T4 in Fig. 2(b)), and thick arrows for the possible environment steps (e.g., from T2 to T3 and from S1 to S2 in the figure). We require the simulation ′ to be preserved by R. 1 Note here we only discuss silent steps (a.k.a. τ-steps) which produce no external events. The simulation also requires that every step with an external event at the target level must correspond to one step at the source with the same event plus zero or multiple τ-steps
Then,to prove termination-preserving concurrent refinement,it source level.Our new simulation RGSim-T extends RGSim by in seems natural to combine the two ideas and have a simulation pa corporating the idea of metrics to achieve termination preservation rameterized with environment interference and a metric decreasing It is parameterized with the new rely (and guarantee)conditions so for target steps that correspond to no steps at the source.Therefore that we know how an environment step could affect the metric.The we require T2<T1 and T4<T3 in the case of Fig.2(b).But formal definition of RGSim-T is given in Sec.4. how would the environment steps change the metric? Relationships to lock-freedom,obstruction-freedom and wait- First attempt.Our first attempt to answer this question is to allow freedom.If the source program is just a single atomic opera- environment steps to arbitrarily change the metrics associated with tion (e.g.x++),our new simulation RGSim-T can be viewed as a the target program configurations.Therefore it is possible to have proof technique for lock-freedom of the target,which ensures that T2<Ts in Fig.2(b). there always exists some thread that will complete an operation at The resulting simulation,however,is still not compositional the source level in a finite number of steps.That is,the failure of w.r.t.parallel compositions.For instance,for the following two a thread to finish its operation must be caused by the successful threads in the target program: completion of source operations by its environment. In fact,the simulations of our first and second attempts can whi1e(i==0)i-;‖wh11e(i=0)1+; be viewed as proof techniques for obstruction-freedom and wait- we can prove that this simulation holds between each of them and freedom respectively of concurrent objects.Obstruction-freedom the source program skip,if we view i as local data used only ensures that every thread will complete its operation whenever it is at the target level.We could define the metric as 1 if i=0 executed in isolation (i.e..without interference from other threads). and 0 otherwise.For the left thread.it decreases the metric if it In the simulation of our first attempt,though a thread is allowed to executes the loop body.The increment of i by its environment not make progress under environment interference,it has to com- (the right thread)may change i back to 0,increasing the metric. plete some source operations when its environments do not inter- This is allowed in our simulation.The case for the right thread is fere.Wait-freedom ensures the completion of the operation of any symmetric.However,if we view the parallel composition of the two thread.Correspondingly in the simulation of our second attempt,a threads as a whole program,it may not terminate,thus cannot be a thread has to make progress no matter what the environment does termination-preserving refinement of skipskip 2.2 Program Logic Second attempt.The first attempt is too permissive to have par- The compositionality of our new simulation RGSim-T allows us allel compositionality,because we allow a thread to make more to decompose the refinement for large programs to refinements moves whenever its environment interferes with it.Thus our sec- ond attempt enforces the metric of a thread to decrease or stay un- for small program units,therefore we could derive a set of syn- tactic Hoare-style rules for refinement verification,as we did for changed under environment interference.For the case of Fig.2(b) we require T3<Tl on environment steps. RGSim [11].For instance,a sequential composition rule may be in This simulation is compositional,but it is too strong and can- the following form: not be satisfied by many useful refinements.For instance.T.in RH(P)T3S1(P}RH(P)T3S2{Q} Fig.I(c)uses a compare-and-swap (cas)instruction to atomically RF{P}T1;T2≤S1;S2{Q} update x.It is a correct lock-free implementation of S in concur- rent settings,but the new simulation of our second attempt does Here we use RHPIT<SO}to represent the corresponding not hold between T and S.If an environment step between lines 3 syntactic judgment of RGSim-T.R denotes the environment inter and 4 of T increments x,the cas at line 4 will return false and T ference.P,and P'are relational assertions that relate the pro- needs to execute another round of loop.Therefore such an environ- gram states at the target and the source levels.The rule says if we ment step increases the number of silent steps of Te that correspond could establish refinements (in fact,RGSim-Ts)between T and to no moves of S.However,our new simulation does not allow an S1,and between T2 and S2,we know T1:T2 refines S1:S2.We environment step to increase the metric,so the simulation cannot could give similar rules for parallel composition and other compo- be established. sitional commands However,in many cases the correspondence between program Our solution.Our solution lies in the middle ground of the two units at the target and the source levels cannot be determined failed attempts.We specify explicitly in the parameter R which statically.That is,just by looking at T;T2 and S1;S2,we may environment steps may make the current thread move more (i.e. not know statically that T refines S and T2 refines S2 and then allow the thread's metric to increase in the simulation).Here we apply the above sequential composition rule.To see the problem. distinguish in R the steps that correspond to source level moves we unfold the while-loop of Te in Fig.I and get the following T from those that do not.We allow the metric to be increased by the former (as in our first attempt),but not by the latter (which must local t,done; 4 while (!done){ 2 decrease or preserve the metric,as in our second attempt). t:=x; 5 t:=X; 3 done :cas(&x,t,t+1);6 done :cas(&x,t,t+1); This approach is based on the observation that the failure of 7] cas in Te of Fig.I(c)must be caused by an environment step that successfully increments x,which corresponds to a step at the Clearly T refines S too.However,whether the cas instruction at source level.Although the termination of the current thread Te is line 3 fulfils the operation in S or not depends on whether the com- delayed,the whole system consisting of both the current thread and parison succeeds in runtime.Thus we cannot apply the composi- the environment progresses by making a corresponding step at the tionality rules for RGSim-T to decompose the refinement about T source level.Therefore,the delay of the termination of the current We have to refer to the semantics of the simulation definition to thread should be acceptable,and we should allow such environment prove the refinement,which would be rather ineffective for large steps to increase the metric of the current thread scale programs.Similar issues also show up in our earlier work on In this paper,we follow the idea of rely/guarantee reasoning [8 RGSim [11],and in relational Hoare logic [1]and relational sepa- and use the rely condition to specify environment steps.However ration logic [25]if they are applied to concurrent settings we extend the traditional rely conditions with an extra boolean tag To address this problem,we extend the assertion language to indicating whether an environment step corresponds to a step at the specify as auxiliary state the source code remaining to be refined
Then, to prove termination-preserving concurrent refinement, it seems natural to combine the two ideas and have a simulation parameterized with environment interference and a metric decreasing for target steps that correspond to no steps at the source. Therefore we require |T2| < |T1| and |T4| < |T3| in the case of Fig. 2(b). But how would the environment steps change the metric? First attempt. Our first attempt to answer this question is to allow environment steps to arbitrarily change the metrics associated with the target program configurations. Therefore it is possible to have |T2| < |T3| in Fig. 2(b). The resulting simulation, however, is still not compositional w.r.t. parallel compositions. For instance, for the following two threads in the target program: while(i==0) i--; k while(i==0) i++; we can prove that this simulation holds between each of them and the source program skip, if we view i as local data used only at the target level. We could define the metric as 1 if i = 0 and 0 otherwise. For the left thread, it decreases the metric if it executes the loop body. The increment of i by its environment (the right thread) may change i back to 0, increasing the metric. This is allowed in our simulation. The case for the right thread is symmetric. However, if we view the parallel composition of the two threads as a whole program, it may not terminate, thus cannot be a termination-preserving refinement of skipkskip. Second attempt. The first attempt is too permissive to have parallel compositionality, because we allow a thread to make more moves whenever its environment interferes with it. Thus our second attempt enforces the metric of a thread to decrease or stay unchanged under environment interference. For the case of Fig. 2(b), we require |T3| ≤ |T2| on environment steps. This simulation is compositional, but it is too strong and cannot be satisfied by many useful refinements. For instance, Tc in Fig. 1(c) uses a compare-and-swap (cas) instruction to atomically update x. It is a correct lock-free implementation of S in concurrent settings, but the new simulation of our second attempt does not hold between Tc and S. If an environment step between lines 3 and 4 of Tc increments x, the cas at line 4 will return false and Tc needs to execute another round of loop. Therefore such an environment step increases the number of silent steps of Tc that correspond to no moves of S. However, our new simulation does not allow an environment step to increase the metric, so the simulation cannot be established. Our solution. Our solution lies in the middle ground of the two failed attempts. We specify explicitly in the parameter R which environment steps may make the current thread move more (i.e., allow the thread’s metric to increase in the simulation). Here we distinguish in R the steps that correspond to source level moves from those that do not. We allow the metric to be increased by the former (as in our first attempt), but not by the latter (which must decrease or preserve the metric, as in our second attempt). This approach is based on the observation that the failure of cas in Tc of Fig. 1(c) must be caused by an environment step that successfully increments x, which corresponds to a step at the source level. Although the termination of the current thread Tc is delayed, the whole system consisting of both the current thread and the environment progresses by making a corresponding step at the source level. Therefore, the delay of the termination of the current thread should be acceptable, and we should allow such environment steps to increase the metric of the current thread. In this paper, we follow the idea of rely/guarantee reasoning [8] and use the rely condition to specify environment steps. However, we extend the traditional rely conditions with an extra boolean tag indicating whether an environment step corresponds to a step at the source level. Our new simulation RGSim-T extends RGSim by incorporating the idea of metrics to achieve termination preservation. It is parameterized with the new rely (and guarantee) conditions so that we know how an environment step could affect the metric. The formal definition of RGSim-T is given in Sec. 4. Relationships to lock-freedom, obstruction-freedom and waitfreedom. If the source program is just a single atomic operation (e.g. x++), our new simulation RGSim-T can be viewed as a proof technique for lock-freedom of the target, which ensures that there always exists some thread that will complete an operation at the source level in a finite number of steps. That is, the failure of a thread to finish its operation must be caused by the successful completion of source operations by its environment. In fact, the simulations of our first and second attempts can be viewed as proof techniques for obstruction-freedom and waitfreedom respectively of concurrent objects. Obstruction-freedom ensures that every thread will complete its operation whenever it is executed in isolation (i.e., without interference from other threads). In the simulation of our first attempt, though a thread is allowed to not make progress under environment interference, it has to complete some source operations when its environments do not interfere. Wait-freedom ensures the completion of the operation of any thread. Correspondingly in the simulation of our second attempt, a thread has to make progress no matter what the environment does. 2.2 Program Logic The compositionality of our new simulation RGSim-T allows us to decompose the refinement for large programs to refinements for small program units, therefore we could derive a set of syntactic Hoare-style rules for refinement verification, as we did for RGSim [11]. For instance, a sequential composition rule may be in the following form: R ⊢ {P}T1 S1{P ′} R ⊢ {P ′}T2 S2{Q} R ⊢ {P}T1; T2 S1; S2{Q} Here we use R ⊢ {P}T S{Q} to represent the corresponding syntactic judgment of RGSim-T. R denotes the environment interference. P, Q and P ′ are relational assertions that relate the program states at the target and the source levels. The rule says if we could establish refinements (in fact, RGSim-Ts) between T1 and S1, and between T2 and S2, we know T1; T2 refines S1; S2. We could give similar rules for parallel composition and other compositional commands. However, in many cases the correspondence between program units at the target and the source levels cannot be determined statically. That is, just by looking at T1; T2 and S1; S2, we may not know statically that T1 refines S1 and T2 refines S2 and then apply the above sequential composition rule. To see the problem, we unfold the while-loop of Tc in Fig. 1 and get the following T ′ c: 1 local t, done; 4 while (!done) { 2 t := x; 5 t := x; 3 done := cas(&x,t,t+1); 6 done := cas(&x,t,t+1); 7 } Clearly T ′ c refines S too. However, whether the cas instruction at line 3 fulfils the operation in S or not depends on whether the comparison succeeds in runtime. Thus we cannot apply the compositionality rules for RGSim-T to decompose the refinement about T ′ c . We have to refer to the semantics of the simulation definition to prove the refinement, which would be rather ineffective for large scale programs. Similar issues also show up in our earlier work on RGSim [11], and in relational Hoare logic [1] and relational separation logic [25] if they are applied to concurrent settings. To address this problem, we extend the assertion language to specify as auxiliary state the source code remaining to be refined
In addition to the binary judgment R[P)T0)1--; The key to verifying the preservation of termination is the rule Assume the environment R may arbitrarily update i when x is not for while loops.One may first think of the total correctness rule for 0,but does not change anything when x is 0.We hope to verify C while loops in Hoare-style logics (e.g.,[19]).However,preserving refines skip.We can see that the loop in C must terminate (thus the the termination does not necessarily mean that the code must termi- refinement holds).and the number n of tokens must be no less than nate,and the total correctness rule would not be applicable in many the value of i at the beginning of the loop.But we cannot decide cases.For example,the following T and S'never terminate: the value of n before executing x :=0.This example cannot be T": S': verified if we have to predetermine and specify the metric for the local t; while loops at the very beginning of the whole program. while (true){ To address this issue,we introduce the following hiding rule: while (true) t:=x; x++; RHip)Cig cas(&x,t,t+1); RHIplw}Cilglw) Here pw discards all the knowledge about tokens in p.For the but T<S'holds for our RGSim-T-Every iteration of T above example,we can hide the number of tokens after we verify either corresponds to a step of S',or is interfered by environment the while loop.Then we do not need to specify the number of steps corresponding to source moves. tokens in the precondition of the whole program.We formally Inspired by Hoffmann et al.'s logic for lock-freedom [7],we present the set of logic rules in Sec.5. introduce a counter n(i.e.the number of tokens assigned to the current thread)as a while-specific metric,which means the thread 3. can only run the loop for no more than n rounds before it or its Formal Settings and Termination-Preserving environment fulfils one or more source-level moves.The counter Refinement is treated as an auxiliary state,and decreases at the beginning of In this section,we define the termination-preserving refinement C, every round of loop (i.e..we pay one token for each iteration). which is the proof goal of our RGSim-T and logic. If we reach a step in the loop body that corresponds to source moves,we could reset the counter to increase the number of tokens. 3.1 The Language Tokens could also increase under environment interference if the environment step corresponds to source moves.Correspondingly Fig.3 shows the programming language for both the source and our WHILE rule is in the following form (we give a simplified the target levels.We model the program semantics as a labeled version to demonstrate the idea here.The actual rule is given in transition system.A label t that will be associated with a state Sec.5): transition is either an event e or T.The latter marks a silent step generating no events. PAB P'*wf(1)RH(P)CIP) A state o is a pair of a store and a heap.The store s is a fi- RHP)while (B)CIPA-B) nite partial mapping from program variables to values (e.g.,inte- gers and memory addresses)and a heap h maps memory addresses We use wf(1)to represent one token,and"+"for normal sepa to values.Statements C are either primitive instructions or compo- rating conjunction in separation logic.To verify the loop body C sitions of them.A single-step execution of statements is modeled we use the precondition P',which has one less token than P,show- as a labeled transition:(C,)(C,').We abstract away the ing that one token has been consumed to start this new round of form of an instruction c.It may generate an external event (e.g., loop.During the execution of C,the number of token could be in- print(E)generates an output event).It may be non-deterministic creased if C itself or its environment steps correspond to source (e.g.,x :nondet assigns a random value to x).It may also be moves.As usual,the loop invariant P needs to be re-established at blocked at some states (e.g.,requesting a lock).We assume prim- the end of C. itive instructions are atomic in the semantics.We also provide an
In addition to the binary judgment R ⊢ {P}T S{Q}, we introduce a unary judgment in the form of R ⊢ {P ∧ arem(S)}T {Q ∧ arem(S ′ )} for refinements that cannot be decomposed. Here arem(S) means S is the remaining source to be refined by the target. Then R, G ⊢ {P ∧ arem(S)}T {Q ∧ arem(skip)} says that T refines S, since the postcondition shows at the end of the target T there are no remaining operations from S to be refined. We provide the following rule to derive the binary judgment from the unary one: R ⊢ {P ∧ arem(S)}T{Q ∧ arem(skip)} R ⊢ {P}T S{Q} On the other hand, if the final remaining source is the same as the initial one, we know the execution steps of the target correspond to zero source steps. Then for the T ′ c above, we can give pre- and post-conditions for line 3 as follows: {· · · ∧ arem(S)} done := cas(&x, t, t+1) {· · · ∧ (done ∧ arem(skip) ∨ ¬done ∧ arem(S))} As the postcondition shows, whether the cas instruction refines S or not is now conditional upon the value of done. Thanks to the new assertions arem(S), we can reduce the relational and semantic refinement proofs to unary and syntactic Hoare-style reasoning. The key to verifying the preservation of termination is the rule for while loops. One may first think of the total correctness rule for while loops in Hoare-style logics (e.g., [19]). However, preserving the termination does not necessarily mean that the code must terminate, and the total correctness rule would not be applicable in many cases. For example, the following T ′′ c and S ′ never terminate: T ′′ c : S ′ : local t; while (true){ while (true) t := x; x++; cas(&x, t, t+1); } but T ′′ c S ′ holds for our RGSim-T () — Every iteration of T ′′ c either corresponds to a step of S ′ , or is interfered by environment steps corresponding to source moves. Inspired by Hoffmann et al.’s logic for lock-freedom [7], we introduce a counter n (i.e. the number of tokens assigned to the current thread) as a while-specific metric, which means the thread can only run the loop for no more than n rounds before it or its environment fulfils one or more source-level moves. The counter is treated as an auxiliary state, and decreases at the beginning of every round of loop (i.e., we pay one token for each iteration). If we reach a step in the loop body that corresponds to source moves, we could reset the counter to increase the number of tokens. Tokens could also increase under environment interference if the environment step corresponds to source moves. Correspondingly our WHILE rule is in the following form (we give a simplified version to demonstrate the idea here. The actual rule is given in Sec. 5): P ∧ B ⇒ P ′ ∗ wf(1) R ⊢ {P ′}C{P} R ⊢ {P}while (B) C{P ∧ ¬B} We use wf(1) to represent one token, and “∗” for normal separating conjunction in separation logic. To verify the loop body C, we use the precondition P ′ , which has one less token than P, showing that one token has been consumed to start this new round of loop. During the execution of C, the number of token could be increased if C itself or its environment steps correspond to source moves. As usual, the loop invariant P needs to be re-established at the end of C. (Event) e ::= . . . (Label) ι ::= e | τ (Store) s, s ∈ PVar ⇀ Val (Heap) h, h ∈ Addr ⇀ Val (State) σ, Σ ::= (s, h) (Instr) c, ::= . . . (Expr) E,E ::= x | n | E + E | . . . (BExp) B, B ::= true | false | E = E | !B | . . . (Stmt) C, C ::= skip | c | hCi | C1; C2 | if (B) C1 else C2 | while (B) C | C1 kC2 Figure 3. Generic language at target and source levels. To prove that T ′′ c shown above preserves the termination of S ′ , we set the initial number of tokens to 1. We use up the token at the first iteration, but could gain another token during the iteration (either by self moves or by environment steps) to pay for the next iteration. We can see that the above reasoning with tokens coincides with the direct refinement proof in our simulation RGSim-T. In fact, RGSim-T can serve as the meta-theory of our logic. The use of tokens as an explicit metric for termination reasoning poses another challenge, which is to handle infinite nondeterminism. Consider the following target C. C: x := 0; while(i > 0) i--; Assume the environment R may arbitrarily update i when x is not 0, but does not change anything when x is 0. We hope to verify C refines skip. We can see that the loop in C must terminate (thus the refinement holds), and the number n of tokens must be no less than the value of i at the beginning of the loop. But we cannot decide the value of n before executing x := 0. This example cannot be verified if we have to predetermine and specify the metric for the while loops at the very beginning of the whole program. To address this issue, we introduce the following hiding rule: R ⊢ {p}C{q} R ⊢ {⌊p⌋w}C{⌊q⌋w} Here ⌊p⌋w discards all the knowledge about tokens in p. For the above example, we can hide the number of tokens after we verify the while loop. Then we do not need to specify the number of tokens in the precondition of the whole program. We formally present the set of logic rules in Sec. 5. 3. Formal Settings and Termination-Preserving Refinement In this section, we define the termination-preserving refinement ⊑, which is the proof goal of our RGSim-T and logic. 3.1 The Language Fig. 3 shows the programming language for both the source and the target levels. We model the program semantics as a labeled transition system. A label ι that will be associated with a state transition is either an event e or τ . The latter marks a silent step generating no events. A state σ is a pair of a store and a heap. The store s is a fi- nite partial mapping from program variables to values (e.g., integers and memory addresses) and a heap h maps memory addresses to values. Statements C are either primitive instructions or compositions of them. A single-step execution of statements is modeled as a labeled transition: (C, σ) ι −→ (C ′ , σ′ ). We abstract away the form of an instruction c. It may generate an external event (e.g., print(E) generates an output event). It may be non-deterministic (e.g., x := nondet assigns a random value to x). It may also be blocked at some states (e.g., requesting a lock). We assume primitive instructions are atomic in the semantics. We also provide an
(C,a)-→+abort (C,a)+(C',d)ET(C,d,) (RelAssn)P,Q,I:=B I own(x)I emp IE→E|E片E P*QI PVQp... ETr(C,o,) ETr(C.o,e::E) (FullAssn)p.q :=P I arem(C)I wf(E)p]a Lp]w (C,a)→*(skip,) (C,a)→+(C,a) ETr(C,o',8) p*9 pV q... ETr(C,o,) ETr(C,o,E) (RelAct)R,G ::[P]I PxQ I P&QI R+R I R+.. Figure 4.Co-inductive definition of ETr(C,o,E). Figure 5.Assertion language. atomic block (C)to execute a piece of code C atomically.Then the source levels from which the simulation holds,and Q is about the generic language in Fig.3 is expressive enough for the source the pair of final states when the target and the source terminate.So and the target programs which may have different granularities of before we give our definition of RGSim-T,we first introduce our state accesses.Due to the space limit,the operational semantics and assertion language more details about the language are formally presented in TR [13] 4.1 Conventions.We usually write blackboard bold or capital letters Assertions and New Rely/Guarantee Conditions (s,h,C,E,B and C)for the notations at the source level to We show the syntax of the basic assertion language in Fig.5. distinguish from the target-level ones (s,h,o,c,E,B and C). including the state assertions P and Q,and our new rely/guarantee Below we use_->-for zero or multiple-step transitions with conditions R and G (let's first ignore the assertions p and g,which no events generated,+-for multiple-step transitions without will be explained in Sec.5). events,and+for multiple-step transitions with only one The state assertions P and Q relate the program states o and event e generated. at the target and source levels.They are separation logic assertions over a pair of states.We show their semantics in the top part of 3.2 Termination-Preserving Event Trace Refinement Fig.6.For simplicity,we assume the program variables used in Now we formally define the refinement relation that relates the target code are different from the ones in the source (e.g.,we the observable event traces generated by the source and the target use and X for target and source level variables respectively).B programs.A trace is a finite or infinite sequence of external events holds if it evaluates to true at the disjoint union of the target and the e,and may end with a termination marker or an abortion marker source stores s and s.We treat program variables as resources [15] It is co-inductively defined as follows. and use own()for the ownership of the program variable x. The assertion E E2 specifies a singleton heap of the targer (EvrTrace)::=Ie::&(co-inductive) level with E2 stored at the address E and requires that the stores We use ETr(C,o,E)to say that the trace is produced by contain variables used to evaluate E and E2.Its counterpart for executing C from the state o.It is co-inductively defined in Fig.4 source level heaps is represented as EE2,whose semantics is Here skip plays the role of a flag showing the end of execution (the defined similarly.emp describes empty stores and heaps at both normal termination).Unsafe executions lead to abort.We know levels.Semantics of separating conjunction P+Q is similar as if C diverges at o,then its trace is either of infinite length or in separation logic,except that it is now lifted to assertions over finite but does not end with or For instance,while(true)skip relational states (o,)The union of two disjoint relational states only produces an empty trace e,and while (true)[print(1)only (1,1)and (2,2)is defined in the middle part of Fig.6.We produces an infinite trace of output events. will define the assertion lpl in Sec.5(see Fig.8),which ignores Then we define a refinement (C,o)(C,)saying that ev- the additional information other than the relational states about p. ery event trace generated by (C,o)at the target level can be repro- Our new rely/guarantee assertions R and G specify the transi- duced by (C,>at the source.Since we could distinguish traces of tions over the relational states (o.)and also the effects on termi- diverging executions from those of terminating executions,the re- nation preservation.Their semantics is defined in the bottom part finement definition ensures that if(C,)diverges,so does (C,) of Fig.6.Here we use S for the relational states.A model con- Thus we know the target preserves the termination of the source. sists of the initial relational state S,the resulting state S',and an effect bit b to record whether the target transitions correspond to Definition 1 (Termination-Preserving Refinement). some source steps and can affect the termination preservation of (C,o)g(C,∑)iffE.ETr(C,o,E)=→ET(C,∑,E) the current thread (for R)or other threads(for G). We use [P]for identity transitions with the relational states RGSim-T:A Compositional Simulation with satisfying P.The action Px Q says that the initial relational states satisfy P and the resulting states satisfy Q.For these two kinds Termination Preservation of actions,we do not care whether there is any source step in the Below we propose RGSim-T,a new simulation as a compositional transition satisfying them(the effect bit b in their interpretations proof technique for the above termination-preserving refinement. could either be true or false).We also introduce a new action As we explained in Sec.2,the key to compositionality is to param- P O asserting that one or more steps are made at the source level eterize the simulation with the interferences between the programs (the effect bit b must be true).Following LRG [3],we introduce and their environments.In this paper,we specify the interferences separating conjunction over actions to locally reason about shared using rely/guarantee conditions [8],but extend them to also specify state updates.RI*R2 means that the actions Ri and R2 start from the effects on the termination preservation of individual threads. disjoint relational states and the resulting states are also disjoint Our simulation relation between C and C is in the form of But here we also require consistency over the effect bits for the two R,G,I (P)C3 C(Q).It takes R.G.I,P and Q as pa- disjoint state transitions.We use R for the transitive closure of rameters.R and G are rely and guarantee conditions specifying the R,where the effect bits in consecutive transitions are accumulated interference between the current thread and its environment.The The syntactic sugars ld,Emp and True represent arbitrary identity assertion I specifies the consistency relation between states at the transitions,empty transitions and arbitrary transitions respectively target and the source levels,which needs to be preserved during Since we logically split states into local and shared parts as in the execution.P specifies the pair of initial states at the target and LRG [3],we need a precise invariant I to fence actions over shared
(C, σ) −→+ abort ETr(C, σ, ) (C, σ) e −→+ (C′ , σ′ ) ETr(C′ , σ′ , E) ETr(C, σ, e ::E) (C, σ) −→∗ (skip, σ′ ) ETr(C, σ, ⇓) (C, σ) −→+ (C′ , σ′ ) ETr(C′ , σ′ , E) ETr(C, σ, E) Figure 4. Co-inductive definition of ETr(C, σ, E). atomic block hCi to execute a piece of code C atomically. Then the generic language in Fig. 3 is expressive enough for the source and the target programs which may have different granularities of state accesses. Due to the space limit, the operational semantics and more details about the language are formally presented in TR [13]. Conventions. We usually write blackboard bold or capital letters (s, h, Σ, , E, B and C) for the notations at the source level to distinguish from the target-level ones (s, h, σ, c, E, B and C). Below we use −→∗ for zero or multiple-step transitions with no events generated, −→+ for multiple-step transitions without events, and e −→ + for multiple-step transitions with only one event e generated. 3.2 Termination-Preserving Event Trace Refinement Now we formally define the refinement relation ⊑ that relates the observable event traces generated by the source and the target programs. A trace E is a finite or infinite sequence of external events e, and may end with a termination marker ⇓ or an abortion marker . It is co-inductively defined as follows. (EvtTrace) E ::= ⇓ | | ǫ | e ::E (co-inductive) We use ETr(C, σ, E) to say that the trace E is produced by executing C from the state σ. It is co-inductively defined in Fig. 4. Here skip plays the role of a flag showing the end of execution (the normal termination). Unsafe executions lead to abort. We know if C diverges at σ, then its trace E is either of infinite length or finite but does not end with ⇓ or . For instance, while (true) skip only produces an empty trace ǫ, and while (true) {print(1)} only produces an infinite trace of output events. Then we define a refinement (C, σ) ⊑ (C, Σ), saying that every event trace generated by (C, σ) at the target level can be reproduced by (C, Σ) at the source. Since we could distinguish traces of diverging executions from those of terminating executions, the re- finement definition ensures that if (C, σ) diverges, so does (C, Σ). Thus we know the target preserves the termination of the source. Definition 1 (Termination-Preserving Refinement). (C, σ) ⊑ (C, Σ) iff ∀E. ETr(C, σ, E) =⇒ ETr(C, Σ, E). 4. RGSim-T: A Compositional Simulation with Termination Preservation Below we propose RGSim-T, a new simulation as a compositional proof technique for the above termination-preserving refinement. As we explained in Sec. 2, the key to compositionality is to parameterize the simulation with the interferences between the programs and their environments. In this paper, we specify the interferences using rely/guarantee conditions [8], but extend them to also specify the effects on the termination preservation of individual threads. Our simulation relation between C and C is in the form of R, G, I |= {P}C C{Q}. It takes R, G, I, P and Q as parameters. R and G are rely and guarantee conditions specifying the interference between the current thread and its environment. The assertion I specifies the consistency relation between states at the target and the source levels, which needs to be preserved during the execution. P specifies the pair of initial states at the target and (RelAssn) P, Q, I ::= B | own(x) | emp | E 7→ E | E Z⇒ E | P ∗ Q | P ∨ Q | TpU | . . . (FullAssn) p, q ::= P | arem(C) | wf(E) | ⌊p⌋a | ⌊p⌋w | p ∗ q | p ∨ q | . . . (RelAct) R, G ::= [P] | P ⋉Q | P ∝ Q | R∗R | R+ | . . . Figure 5. Assertion language. the source levels from which the simulation holds, and Q is about the pair of final states when the target and the source terminate. So before we give our definition of RGSim-T, we first introduce our assertion language. 4.1 Assertions and New Rely/Guarantee Conditions We show the syntax of the basic assertion language in Fig. 5, including the state assertions P and Q, and our new rely/guarantee conditions R and G (let’s first ignore the assertions p and q, which will be explained in Sec. 5). The state assertions P and Q relate the program states σ and Σ at the target and source levels. They are separation logic assertions over a pair of states. We show their semantics in the top part of Fig. 6. For simplicity, we assume the program variables used in the target code are different from the ones in the source (e.g., we use x and X for target and source level variables respectively). B holds if it evaluates to true at the disjoint union of the target and the source stores s and s. We treat program variables as resources [15] and use own(x) for the ownership of the program variable x. The assertion E1 7→ E2 specifies a singleton heap of the target level with E2 stored at the address E1 and requires that the stores contain variables used to evaluate E1 and E2. Its counterpart for source level heaps is represented as E1 Z⇒ E2, whose semantics is defined similarly. emp describes empty stores and heaps at both levels. Semantics of separating conjunction P ∗ Q is similar as in separation logic, except that it is now lifted to assertions over relational states (σ, Σ). The union of two disjoint relational states (σ1, Σ1) and (σ2, Σ2) is defined in the middle part of Fig. 6. We will define the assertion TpU in Sec. 5 (see Fig. 8), which ignores the additional information other than the relational states about p. Our new rely/guarantee assertions R and G specify the transitions over the relational states (σ, Σ) and also the effects on termination preservation. Their semantics is defined in the bottom part of Fig. 6. Here we use S for the relational states. A model consists of the initial relational state S, the resulting state S ′ , and an effect bit b to record whether the target transitions correspond to some source steps and can affect the termination preservation of the current thread (for R) or other threads (for G). We use [P] for identity transitions with the relational states satisfying P. The action P ⋉ Q says that the initial relational states satisfy P and the resulting states satisfy Q. For these two kinds of actions, we do not care whether there is any source step in the transition satisfying them (the effect bit b in their interpretations could either be true or false). We also introduce a new action P ∝ Q asserting that one or more steps are made at the source level (the effect bit b must be true). Following LRG [3], we introduce separating conjunction over actions to locally reason about shared state updates. R1 ∗R2 means that the actions R1 and R2 start from disjoint relational states and the resulting states are also disjoint. But here we also require consistency over the effect bits for the two disjoint state transitions. We use R + for the transitive closure of R, where the effect bits in consecutive transitions are accumulated. The syntactic sugars Id, Emp and True represent arbitrary identity transitions, empty transitions and arbitrary transitions respectively. Since we logically split states into local and shared parts as in LRG [3], we need a precise invariant I to fence actions over shared
(s,h),(s,h)=B iff B]sw=true 2.for anye,C',o”,oF and oF,if(Ca世or)e(C',aand ((s,h),(sh))own()iff dom(ss)={} ∑l∑r,hen there exist o',M',Cand'such that (a)o”=σ'出oF, (s,h),(s,h)卡E1→E2ifh={[E1]sw=IE2lsw=} b)(C,Σ出F)S+(C,y'出F). ((s,h),(sh))emp iff s=h===0 (c)R,G,IF (C',o',M')(C,'),and fLf2 iff (dom(f1)ndom(f2)=0)ff2fuf2,if fLf2 (d)((a,)(a',>),true)G*True. 3.for any b.a'and 'if ((a )('),b)R*ld,then (s1,h1)L(s2,h2)iff(s1⊥s2)A(h1⊥h2) there exists M'such that (s1,h1)(s2,h2)(s1Us2,hiUh2),if (s1:h1)L(s2,h2) (a)R,G,IF(C,o',M')Q(C,').and (a1,1)u(a2,∑2)兰(1w2,B1y22),ifa1⊥2and1⊥2 (b)if b false,we need M'=M. 4.ifC=skip,then for any∑such that∑⊥∑F,there exist n S=(a,) and X'such that (S,S',6)[P]iff (S P)A(S=S') (a)(C,世∑r)→"(skip,'世F), (S,S,b)上P×Qif(S上P)A(S上Q) b)(o,∑)=Q, (S,S',b)乍PQiff(SFP)A(S'乍Q)A(b=true) (c)if n >0,then ((,)(a,')true)G++True. 5.for any og and∑F,if(C,oyoF)→abort and∑⊥∑F,then (S,S,b)片R1*R2f3S1,S2,S1,S%·(S=S1出S2)A (C,∑世∑F)→+abort. (S=S1世S%)A(S1,S1,b)=R1)∧(S2,S,b)=R2} (S,S',b)R+iff ((S,S,b)R)v (3s",U.8. (S,S,)=)A(S,S',b)=R+)A(6=HV") The simulation R.G,I=(C,o,M)o (C,>relates the ld兰[true]Emp兰emp x emp True些true xtrue executions of the target configuration (C,)(with its metric M) to the source (C.)under the interferences with the environment I R iff([I]→R)A(R→IKI)A Precise(I) specified by R and G.It first requires that the relational state (o, Sta(P,R)ifS,S,b.(SFP)A(S,S',b)上R)→(S=P) satisfy I*true,I for the shared part and true for the local part, establishing a consistency relation between the states at the two Figure 6.Semantics of assertions (part D) levels.For every silent step of (C,a)(condition 1,let's first ignore the frame states op and which will be discussed later),the source could make n steps (n>0)correspondingly (1(b)),and the states,which is a state assertion like P and Q.We define the fence simulation is preserved afterwards with a new metric M(1(c)). IDR in a similar way as in our previous work [10]and LRG [3] Here we use--"-to represent n-step silent transitions.If which says that I precisely determines the boundaries of the states n=0 in 1(b)(i.e..the source does not move).the metric must of the transitions in R(see Fig.6).The formal definition of the decrease along the associated well-founded order (M'F at the target and source levels to repre- 1.for any C,",oF and∑F,if(C,o出oF)-→(C',o")and sent the remaining parts of the states owned by other threads in the ∑L∑F,then there exist o',n,M',b,Cand∑'such that system.The commands C and C must not change the frame states (a)g"=o'世oF, during their executions (see,e.g.,conditions 1(a)and 1(b)).These (b)(C,∑世∑F)→n(C','∑F), oF and >F quantifications in RGSim-T are crucial to admit the (c)R,G,I=(C,o,M')so(C',). parallel compositionality and the frame rules (the B-FRAME rule in Fig.7 and the FRAME rule in Fig.9). (d)((o,>)(a',>),b)G*True,and We then define R,G,I =PC<CIQ by hiding the initial (e)if n=0,we need M'<M and b=false,otherwise b=true. states via the precondition P and hiding the metric M
((s, h), (s, h)) |= B iff JBKs⊎s = true ((s, h), (s, h)) |= own(x) iff dom(s ⊎ s) = {x} ((s, h), (s, h)) |= E1 7→ E2 iff h = {JE1Ks⊎s ❀ JE2Ks⊎s} ((s, h), (s, h)) |= emp iff s = h = s = h = ∅ f1⊥f2 iff (dom(f1)∩dom(f2)=∅) f1⊎f2 def = f1∪f2 , if f1⊥f2 (s1, h1)⊥(s2, h2) iff (s1⊥s2) ∧ (h1⊥h2) (s1, h1) ⊎ (s2, h2) def = (s1 ∪ s2, h1 ∪ h2) , if (s1, h1)⊥(s2, h2) (σ1, Σ1) ⊎ (σ2, Σ2) def = (σ1 ⊎ σ2, Σ1 ⊎ Σ2) , if σ1⊥σ2 and Σ1⊥Σ2 S ::= (σ, Σ) (S, S ′ , b) |= [P] iff (S |= P) ∧ (S = S ′ ) (S, S ′ , b) |= P ⋉ Q iff (S |= P) ∧ (S ′ |= Q) (S, S ′ , b) |= P ∝ Q iff (S |= P) ∧ (S ′ |= Q) ∧ (b=true) (S, S ′ , b) |= R1 ∗ R2 iff ∃S1, S2, S ′ 1 , S ′ 2 . (S = S1 ⊎ S2) ∧ (S ′ = S ′ 1 ⊎ S′ 2 ) ∧ ((S1, S ′ 1 , b) |= R1) ∧ ((S2, S ′ 2 , b) |= R2) (S, S ′ , b) |= R+ iff ((S, S ′ , b) |= R) ∨ (∃S′′, b′ , b′′ . ((S, S ′′, b′ ) |= R) ∧ ((S ′′ , S ′ , b′′) |= R+) ∧ (b = b ′ ∨ b ′′)) Id def = [true] Emp def = emp ⋉ emp True def = true ⋉ true I ⊲ R iff ([I] ⇒ R) ∧ (R ⇒ I ⋉ I) ∧ Precise(I) Sta(P, R) iff ∀S, S ′ , b. (S |=P) ∧ ((S, S ′ , b) |=R) =⇒ (S ′ |=P) Figure 6. Semantics of assertions (part I). states, which is a state assertion like P and Q. We define the fence I ⊲ R in a similar way as in our previous work [10] and LRG [3], which says that I precisely determines the boundaries of the states of the transitions in R (see Fig. 6). The formal definition of the precise requirement Precise(I) is given in TR [13], which follows its usual meaning as in separation logic but is now interpreted over relational states. 4.2 Definition of RGSim-T Our simulation RGSim-T is parameterized over the rely/guarantee conditions R and G to specify the interferences between threads and their environments, and a precise invariant I to logically determine the boundaries of the shared states and fence R and G. The simulation also takes a metric M, which was referred to as |T | in our previous informal explanations in Sec. 2. We leave its type unspecified here, which can be instantiated by program verifiers, as long as it is equipped with a well-founded order 0, then ((σ, Σ),(σ, Σ ′ ), true) |= G + ∗ True. 5. for any σF and ΣF , if (C, σ⊎σF ) −→ abort and Σ⊥ΣF , then (C, Σ ⊎ ΣF ) −→+ abort. The simulation R, G, I |= (C, σ, M) Q (C, Σ) relates the executions of the target configuration (C, σ) (with its metric M) to the source (C, Σ), under the interferences with the environment specified by R and G. It first requires that the relational state (σ, Σ) satisfy I ∗ true, I for the shared part and true for the local part, establishing a consistency relation between the states at the two levels. For every silent step of (C, σ) (condition 1, let’s first ignore the frame states σF and ΣF which will be discussed later), the source could make n steps (n ≥ 0) correspondingly (1(b)), and the simulation is preserved afterwards with a new metric M′ (1(c)). Here we use −→ n to represent n-step silent transitions. If n = 0 in 1(b) (i.e., the source does not move), the metric must decrease along the associated well-founded order (M′ < M in 1(e)), otherwise we do not have any restrictions over M′ . We also require that the related steps at the two levels satisfy the guarantee condition G + ∗True (1(d)), the transitive closure G + for the shared part and True for the private. If the target step corresponds to no source moves (n = 0), we use false as the corresponding effect bit, otherwise the bit should be true (1(e)). If a target step produces an event e, the requirements (condition 2) are similar to those in condition 1, except that we know for sure that target step corresponds to one or more source steps that produce the same e. The simulation should be preserved after environment transitions satisfying R + ∗ Id, R + for the shared part and Id for the private (condition 3). If the corresponding effect bit of the environment transition is true, we know there are one or more source moves, therefore there are no restrictions over the metric M′ for the resulting code (which could be larger than M). Otherwise, the metric should be unaffected under the environment interference (i.e., M′ = M in 3(b)). If C terminates (condition 4), the corresponding C must also terminate and the resulting states satisfy the postcondition Q. Finally, if C is unsafe, then C must be unsafe too (condition 5). Inspired by Vafeiadis [24], we directly embed the framing aspect of separation logic in Def. 2. At each condition, we introduce the frame states σF and ΣF at the target and source levels to represent the remaining parts of the states owned by other threads in the system. The commands C and C must not change the frame states during their executions (see, e.g., conditions 1(a) and 1(b)). These σF and ΣF quantifications in RGSim-T are crucial to admit the parallel compositionality and the frame rules (the B-FRAME rule in Fig. 7 and the FRAME rule in Fig. 9). We then define R, G, I |= {P}C C{Q} by hiding the initial states via the precondition P and hiding the metric M
RV G2,G1,H[P*P)C13C1{Q1*Q}RVG1,G2,IH{P2+P)C23C2[Q2+Q2}PVQIVQ2I IDR R,GIVG2,IH(P+P2*P)C1lC23C1C2{Q1+Q2*(Q1AQ2)} (B-PAR) P÷(B÷B)*IR,G,IF{PAB}C)respectively (the formal definition elided here).How- Termination-Preserving Refinement ever,it is worth noting the definition of disjoint union over the quadruple states,which is shown in the middle part of Fig.8.The The binary inference rules in Fig.7 allow us to decompose the disjoint union of the numbers of tokens w and w2 is simply the refinement verification of large programs into the refinement units' sum of them.The disjoint union of Di and D2 is defined only if
R ∨ G2, G1, I ⊢ {P1 ∗ P}C1 C1{Q1 ∗ Q′ 1 } R ∨ G1, G2, I ⊢ {P2 ∗ P}C2 C2{Q2 ∗ Q′ 2 } P ∨Q′ 1∨Q′ 2 ⇒ I I ⊲ R R, G1 ∨ G2, I ⊢ {P1 ∗ P2 ∗ P}C1 kC2 C1 kC2{Q1 ∗ Q2 ∗ (Q′ 1 ∧ Q′ 2 )} (B-PAR) P ⇒ (B ⇔ B) ∗ I R, G, I ⊢ {P ∧ B}C C{P} R, G, I ⊢ {P}while (B) C while (B) C{P ∧ ¬B} (B-WHILE) P ⇒ (E = E) ∗ I Sta(P, R ∗ Id) I ⊲ {R, G} R, G, I ⊢ {P}print(E)print(E){P} (B-PRT) R, G, I ⊢ {P}C C{Q} Sta(P ′ , R′ ∗ Id) I ′ ⊲ {R′ , G′} P ′ ⇒ I ′ ∗ true G+ ⇒ G R ∗ R′ , G ∗ G′ , I ∗ I ′ ⊢ {P ∗ P ′}C C{Q ∗ P ′} (B-FRAME) Figure 7. Selected binary inference rules (compositionality of RGSim-T). Adequacy. RGSim-T ensures the termination-preserving refinement by using the metric with a well-founded order. The proof of the following adequacy theorem is in TR [13]. Theorem 3 (Adequacy of RGSim-T). If there exist R, G, I, Q and a metric M (with a well-founded order <) such that R, G, I |= (C, σ, M)Q (C, Σ), then (C, σ) ⊑ (C, Σ). 4.3 Compositionality Rules RGSim-T is compositional. We show some of the compositionality rules in Fig. 7. Here we use R, G, I ⊢ {P}C C{Q} for the judgment to emphasize syntactic reasoning, whose semantics is RGSim-T (Def. 2). The rules can be viewed as the binary version of those in a traditional rely-guarantee-style logic (e.g., LRG [3] and RGSep [23]). The B-PAR rule shows the compositionality w.r.t. parallel compositions. To verify C1 k C2 is a refinement of C1 k C2, we verify the refinement of each thread separately. The rely condition of each thread captures the interference from both the overall environment (R) and its sibling thread (G1 or G2). The related steps of C1 k C2 and C1 k C2 should satisfy either thread’s guarantee. As in LRG [3], P1 and P2 specify the private (relational) states of the threads C1/C1 and C2/C2 respectively. The states P are shared by them. When both threads have terminated, their private states satisfy Q1 and Q2, and the shared states satisfy both Q ′ 1 and Q ′ 2. We require that the shared states are well-formed (P, Q ′ 1 and Q ′ 2 imply I) and the overall environment transitions are fenced (I ⊲ R). The B-WHILE rule requires the boolean conditions of both sides to be evaluated to the same value. The resources needed to evaluate them should be available in the private part of P. The B-FRAME rule supports local reasoning. The frame P ′ may contain shared and private parts, so it should be stable w.r.t. R ′ ∗ Id and imply I ′ ∗ true, where I ′ is the fence for R ′ and G ′ (see Fig. 6 for the definitions of fences and stability). We also require G to be closed over transitivity. This rule is almost identical to the one in LRG [3]. Details are elided here. We provide a few binary rules to reason about the basic program units when they are almost identical at both sides. For instance, the B-PRT rule relates a target print command to a source one, requiring that they always print out the same value. For more general refinement units, as we explained in Sec. 2, we reduce relational verification to unary reasoning (using the U2B rule in Fig. 9, which we will explain in the next section). Our TR [13] contains more rules and the full soundness proofs. The soundness theorem is shown below. Theorem 4 (Soundness of Binary Rules). If R, G, I ⊢ {P}C C{Q}, then R, G, I |={P}C C{Q}. 5. A Rely-Guarantee-Style Logic for Termination-Preserving Refinement The binary inference rules in Fig. 7 allow us to decompose the refinement verification of large programs into the refinement units’ w ∈ Nat D ::= C | • (σ, w, D, Σ) |= P iff (σ, Σ) |= P (σ, w, D, Σ) |= arem(C ′ ) iff D = C ′ ((s, h), w, D, Σ) |= wf(E) iff ∃n. (JEKs = n) ∧ (n ≤ w) (σ, w, D, Σ) |= ⌊p⌋a iff ∃D ′ . (σ, w, D ′ , Σ) |= p (σ, w, D, Σ) |= ⌊p⌋w iff ∃w′ . (σ, w′ , D, Σ) |= p (σ, Σ) |= TpU iff ∃w, D. (σ, w, D, Σ) |= p D1⊥D2 iff (D1 = •) ∨ (D2 = •) D1 ⊎ D2 def = D2 if D1 = • D1 if D2 = • (σ1, w1, D1, Σ1) ⊎ (σ2, w2, D2, Σ2) def = (σ1⊎σ2, w1+w2, D1⊎D2, Σ1⊎Σ2) , if σ1⊥σ2, D1⊥D2 and Σ1⊥Σ2 Sta(p, R) iff ∀σ, w, D, Σ, σ′ , Σ′ , b. ((σ, w, D, Σ) |= p) ∧ (((σ, Σ), (σ ′ , Σ′ ), b) |= R) =⇒ ∃w′ . (σ ′ , w′ , D, Σ′ ) |= p ∧ (b = false =⇒ w′ = w) p ⇛0 q iff p ⇒ q p ⇛+ q iff ∀σ, w, D, Σ, ΣF . ((σ, w, D, Σ) |= p) ∧ (Σ⊥ΣF ) =⇒ ∃w′ ,C′ ,Σ′ . (D, Σ⊎ΣF ) −→+ (C′ , Σ′⊎ΣF ) ∧ ((σ, w′ , C′ , Σ′ ) |= q) Figure 8. Semantics of assertions (part II). verification. In this section, we explain the unary rules for verifying refinement units. All the binary and unary rules constitute our novel rely-guarantee-style logic for modular verification of terminationpreserving refinement. 5.1 Assertions on Source Code and Number of Tokens We first explain the new assertions p and q used in the unary rules that can specify the source code and metrics in addition to states. We define their syntax in Fig. 5, and their semantics in Fig. 8. A full state assertion p is interpreted on (σ, w, D, Σ). Here besides the states σ and Σ at the target and source levels, we introduce some auxiliary data w and D. w is the number of tokens needed for loops (see Sec. 2). D is either some source code C, or a special sign • serving as a unit for defining semantics of p ∗ q below. In Fig. 8 we lift the relational assertion P as a full state assertion to specify the states. The new assertion arem(C) says that the remaining source code is C at the current program point. wf(E) states that the number of tokens at the current target code is no less than E. We can see wf(0) always holds, and for any n, wf(n + 1) implies wf(n). We use ⌊p⌋a and ⌊p⌋w to ignore the descriptions in p about the source code and the number of tokens respectively. TpU lifts p back to a relational state assertion. Separating conjunction p ∗ q has the standard meaning as in separation logic, which says p and q hold over disjoint parts of (σ, w, D, Σ) respectively (the formal definition elided here). However, it is worth noting the definition of disjoint union over the quadruple states, which is shown in the middle part of Fig. 8. The disjoint union of the numbers of tokens w1 and w2 is simply the sum of them. The disjoint union of D1 and D2 is defined only if
R,G,I{P A arem(C)}C{Q A arem(skip)} (U2B) R,G,I上{P}C0A wf(i)arem(skip) 2 while (true){ 2 while (i>0){ {x=xA arem(S)} if (x =t) 3 t :x; x=X=t A arem(S) (i>0A wf(1-1)A arem(skip)} 3 x=X=t A arem(S')V x=x=tAarem(X+;S)】 {i0A wf(i)A arem(skip)} x=tA arem(S)A wf(1) x:=t+1; 4 cas(&x,t,t+1); x=X=t+1A arem(S)A wf(1) x=X A arem(S)A wf(1) (b)local termination: (a)looping a counter:I (x=X) R=G兽(IaxI)V☑ I些empR=G些Emp Figure 10.Proofs for two small examples. at least one of them is the special sign.,which has no knowledge The U2B rule,as explained in Sec.2.turns unary proofs to about the remaining source code C.Therefore we know the follow- binary ones.It says that if the remaining source code is C at the ing holds (for any P and C): beginning of the target C.and it becomes skip at the end of C. (P A arem(C)A wf(1))*(wf(1)A emp)(P A arem(C)A wf(2)) then we know C is simulated by C. The ATOM rule allows us to reason sequentially about the target One may think a more natural definition of the disjoint union is code in the atomic block.We use Hs [p]C[g]to represent the total to require the two Ds be the same.But this would break the FRAME correctness of C in sequential separation logic.The corresponding rule (see Fig.9).For example,we can prove: rules are mostly standard and elided here.Note that C only accesses Emp,Emp,emp{=XA arem(X++)x++{x=X A arem(skip) the target state o,therefore in our sequential rules we require the source state and the auxiliary data w and D in p should With the FRAME rule and the separating conjunction based on remain unchanged in g.We can lift C's total correctness to the the alternative definition of disjoint union,we would derive the concurrent setting as long as its overall transition over the shared following: states satisfies the guarantee G.Here we assume the environment Emp,Emp,emp {(x=XA arem(X++))arem(X++)} is identity transitions.To allow general environment behaviors,we x++(=X A arem(skip))arem(++)} can apply the ATOM-R rule later,which requires that R be fenced which is reduced to an invalid judgment: by I and the pre-and post-conditions be stable w.r.t.R. The ATOM rule is similar to the ATOM rule,except that it Emp,Emp,emp {x =X A arem(X++)}x++{false executes the source code simultaneously with the target atomic We require in p*g that either p or g should not specify the source step.We use pg for the multi-step executions from the source code,therefore in this example the precondition after applying the code specified by p to the code specified by g,which is defined frame rule is invalid (thus the whole judgment is valid). in the bottom part of Fig.8.We also write pg for the usual The stability of p w.r.t.an action R,defined at the bottom part of implication p=g.Then,the AToM rule says,we can execute the Fig.8.specifies how the number of tokens of a program(specified source code before or after the steps of C,as long as the overall by p)could change under R's interferences.As a simple example transition (including the source steps and the target steps)with the for the following p,Ri and R2,Sta(p,R1)holds while Sta(p,R2) effect bit true satisfies G for the shared parts. does not hold: The WHILE rule is the key to proving the preservation of termi- p些(10→0*20台0)V(10→1*20片0)Awf(1) nation.As we informally explained in Sec.2,we should be able to decrease the number of tokens at the beginning of each loop itera- R1兽(10→0*20片0)x(10→1*20片0) tion.And we should re-establish the invariant p between the states R2兰(10→0*20÷0)×(10→1*20÷0) and the number of tokens at the end of each iteration.Below we 5.2 Unary Inference Rules give two examples,each of which shows a typical application of the WHILE rule. The judgment for unary reasoning is in the form of R,G,I pCg.We present some of the rules in Fig.9
R, G, I ⊢ {P ∧ arem(C)}C{Q ∧ arem(skip)} R, G, I ⊢ {P}C C{Q} (U2B) ⊢SL [p]C[q] (TpU ⋉ TqU) ⇒ G ∗ True I ⊲ G p ∨ q ⇒ I ∗ true [I], G, I ⊢ {p}hCi{q} (ATOM) p ⇛a p ′ ⊢SL [p ′ ]C[q ′ ] q ′ ⇛b q + ∈ {a, b} (TpU ∝ TqU) ⇒ G ∗ True I ⊲ G p ∨ q ⇒ I ∗ true [I], G, I ⊢ {p}hCi{q} (ATOM+) [I], G, I ⊢ {p}hCi{q} Sta({p, q}, R ∗ Id) I ⊲ R R, G, I ⊢ {p}hCi{q} (ATOM-R) p ⇒ (B = B) ∗ I p ∧ B ⇒ p ′ ∗ (wf(1) ∧ emp) R, G, I ⊢ {p ′}C{p} R, G, I ⊢ {p}while (B) C{p ∧ ¬B} (WHILE) R, G, I ⊢ {p}C{q} R, G, I ⊢ {⌊p⌋w}C{⌊q⌋w} (HIDE-W) R, G, I ⊢ {p}C{q} Sta(p ′ , R′ ∗ Id) I ′ ⊲ {R′ , G′} p ′ ⇒ I ′ ∗ true G+ ⇒ G R ∗ R′ , G ∗ G′ , I ∗ I ′ ⊢ {p ∗ p ′}C{q ∗ p ′} (FRAME) Figure 9. Selected unary inference rules. 1 local t; x = X ∧ arem(S ′ ) ∧ wf(1) 2 while (true) { x = X ∧ arem(S ′ ) 3 x = X = t ∧ arem(S ′ ) ∨ x = X 6= t ∧ arem(S ′ ) ∧ wf(1) 4 cas(&x, t, t+1); x = X ∧ arem(S ′ ) ∧ wf(1) 5 } // unfolding cas (a) looping a counter: I def = (x = X) R = G def = (I ∝ I) ∨ [I] 1 local i := 100; i ≥ 0 ∧ wf(i) ∧ arem(skip) 2 while (i > 0) { i > 0 ∧ wf(i-1) ∧ arem(skip) 3 i--; i ≥ 0 ∧ wf(i) ∧ arem(skip) 4 } (b) local termination: I def = emp R = G def = Emp Figure 10. Proofs for two small examples. at least one of them is the special sign •, which has no knowledge about the remaining source code C. Therefore we know the following holds (for any P and C): (P ∧ arem(C) ∧ wf(1)) ∗ (wf(1) ∧ emp) ⇔ (P ∧ arem(C) ∧ wf(2)) One may think a more natural definition of the disjoint union is to require the two Ds be the same. But this would break the FRAME rule (see Fig. 9). For example, we can prove: Emp, Emp, emp ⊢ {x = X ∧ arem(X++)} x++ {x = X ∧ arem(skip)} With the FRAME rule and the separating conjunction based on the alternative definition of disjoint union, we would derive the following: Emp, Emp, emp ⊢ {(x = X ∧ arem(X++)) ∗ arem(X++)} x++ {(x = X ∧ arem(skip)) ∗ arem(X++)} which is reduced to an invalid judgment: Emp, Emp, emp ⊢ {x = X ∧ arem(X++)} x++ {false} We require in p ∗ q that either p or q should not specify the source code, therefore in this example the precondition after applying the frame rule is invalid (thus the whole judgment is valid). The stability of p w.r.t. an action R, defined at the bottom part of Fig. 8, specifies how the number of tokens of a program (specified by p) could change under R’s interferences. As a simple example, for the following p, R1 and R2, Sta(p, R1) holds while Sta(p, R2) does not hold: p def = (10 7→ 0 ∗ 20 Z⇒ 0) ∨ ((10 7→ 1 ∗ 20 Z⇒ 0) ∧ wf(1)) R1 def = (10 7→ 0 ∗ 20 Z⇒ 0) ∝ (10 7→ 1 ∗ 20 Z⇒ 0) R2 def = (10 7→ 0 ∗ 20 Z⇒ 0) ⋉ (10 7→ 1 ∗ 20 Z⇒ 0) 5.2 Unary Inference Rules The judgment for unary reasoning is in the form of R, G, I ⊢ {p}C{q}. We present some of the rules in Fig. 9. The U2B rule, as explained in Sec. 2, turns unary proofs to binary ones. It says that if the remaining source code is C at the beginning of the target C, and it becomes skip at the end of C, then we know C is simulated by C. The ATOM rule allows us to reason sequentially about the target code in the atomic block. We use ⊢SL [p]C[q] to represent the total correctness of C in sequential separation logic. The corresponding rules are mostly standard and elided here. Note that C only accesses the target state σ, therefore in our sequential rules we require the source state Σ and the auxiliary data w and D in p should remain unchanged in q. We can lift C’s total correctness to the concurrent setting as long as its overall transition over the shared states satisfies the guarantee G. Here we assume the environment is identity transitions. To allow general environment behaviors, we can apply the ATOM-R rule later, which requires that R be fenced by I and the pre- and post-conditions be stable w.r.t. R. The ATOM+ rule is similar to the ATOM rule, except that it executes the source code simultaneously with the target atomic step. We use p ⇛+ q for the multi-step executions from the source code specified by p to the code specified by q, which is defined in the bottom part of Fig. 8. We also write p ⇛0 q for the usual implication p ⇒ q. Then, the ATOM+ rule says, we can execute the source code before or after the steps of C, as long as the overall transition (including the source steps and the target steps) with the effect bit true satisfies G for the shared parts. The WHILE rule is the key to proving the preservation of termination. As we informally explained in Sec. 2, we should be able to decrease the number of tokens at the beginning of each loop iteration. And we should re-establish the invariant p between the states and the number of tokens at the end of each iteration. Below we give two examples, each of which shows a typical application of the WHILE rule
Examples.The first example is the T"and S'in Sec.2.We show Counter and its variants its proof in our logic in Fig.10(a)(for simplicity,below we always Linearizability Lock-Freedom Treiber stack [20] assume the ownership of variables).We use X for the counter at Michael-Scott lock-free queue [14] the source,and the rely/guarantee conditions say that the counters DGLM lock-free queue [2] Non-Atomic Object Correctness Synchronous queue [16] at the two levels can be updated simultaneously with the effect bit Correctness of Optimized Algo Counter vs.its variants true.The loop invariant above line 2 says that we should have at (Equivalence) TAS lock vs.TTAS lock6可 least one token to execute the loop.The loop body is verified with zero tokens,and should finally restore the invariant token number Figure 11.Verified examples using our logic. 1.The gaining of the token may be due to a successful cas at line 4 that corresponds to source steps,or caused by the environment interferences.More specifically,the assertion following line 3 says Proving linearizability and lock-freedom together for concurrent that we can gain a token if the counters have been updated.If the objects.It has been shown [12]that the verification of lineariz- counters are not updated before the cas at line 4,the cas succeeds ability and lock-freedom together can be reduced to verifying a and we show the detailed proof at the right part of Fig.10(a),in contextual refinement that preserves the termination of any client which we execute one iteration of the source code and gain a token programs.That is,for any client as the context,the termination- (applying the ATOM rule). preserving refinement CC should hold.Here we use C This example shows the most straightforward understanding of for the concrete implementation of the object,and C for the corre- the WHILE rule:we pay a token at the beginning of an iteration sponding abstract atomic operations.C(or C)denotes the and should be able to gain another token during the execution of whole program where the client accesses the object via method the iteration.The next example is more subtle (though simpler). calls to C (or C). As shown in Fig.10(b),it is a locally-terminating while loop (i.e. The compositionality rules of our logic (Fig.7)allow us to ver- a loop that terminates regardless of environment interferences) ify the above contextual refinement by proving R,G,I[P)C We prove it refines skip under the environment Emp.The loop C(Q).Then we apply the U2B rule and turn the relational ver- invariant above line 2 says that the number of tokens equals the ification to unary reasoning.As in a normal linearizability proof value of i.If the loop condition (i>o)is satisfied,we pay one (e.g.,[10,23]),we need to find a single step of C(i.e.,the lin- token.In the proof of the loop body,we do not (and are not able earization point)that corresponds to the atomic step of C.Here we to)gain more tokens.Instead,the value of i will be decreased in also have to prove lock-freedom:the failure to make progress (i.e., the iteration,enabling us to restore the equality between the number finish an abstract operation)of a thread must be caused by success- of tokens and i. ful progress of its environment,which can be ensured by the WHILE rule (in Fig.9)in our logic. Other rules and discussions.Another important rule is the We have used the above approach to verify several lineariz- HIDE-W rule in Fig.9.It shows that tokens are just an auxiliary able and lock-free objects,including Treiber stack [20].Michael- tool,which could be safely discarded (by using_w)when the Scott lock-free queue [14]and DGLM queue [2].We can further termination-preservation of a command C(say,a while loop)is extend the logic in this paper with the techniques [10]for verify- already established.As we mentioned in Sec.2.the HIDE-W rule ing linearizability of algorithms with non-fixed linearization points, is crucial to handle infinite nondeterminism.It is also important for to support more sophisticated examples such as HSY elimination- local reasoning,so that when we verify a thread,we do not have based stack and Harris-Michael lock-free list. to calculate and specify in the precondition the number of tokens needed by all the while loops.For nested loops,we could use the Verifying concurrent objects whose abstract operations are not HIDE-W rule to hide the tokens needed by the inner loop,and use atomic.Sometimes we cannot define single atomic operations as the FRAME rule to add back the tokens needed for the outer loop the abstract specification of a concurrent object.For objects that later when we compose the inner loop with other parts of the outer implement synchronization between threads.we may have to ex- loop body plicitly take into account the interferences from other threads when The unary FRAME rule in Fig.9 is similar to the binary one in defining the abstract behaviors of the current thread.For exam- Fig.7.Other rules can be found in our TR [13],which are very ple,the synchronous queue [16]is a concurrent transfer channel in similar to those in LRG [3],but we give different interpretations to which each producer presenting an item must wait for a consumer assertions and actions. to take this item,and vice versa.The corresponding abstract opera- The binary rules(in Fig.7)and the unary rules(in Fig.9)gives tions are no longer atomic.We used our logic to prove the contex- us a full proof theory for termination-preserving refinement.We tual refinement between the concrete implementation (from [16] want to remind the readers that the logic does not ensure termina- used in Java 6)and a more abstract synchronous queue.The refine- tion of programs,therefore it is not a logic for total correctness.On ment ensures that if a producer (or a consumer)is blocked at the the other hand,if we restrict the source code to skip (which always concrete level,it must also be blocked at the source level. terminates),then our unary rules can be viewed as a proof theory for the total correctness of concurrent programs. Proving equivalence between optimized algorithms and original ones. We also use our logic to show variants of concurrent algo- Also note that the use of a natural number w as the while- specific metric is to simplify the presentation only.It is easy to rithms are correct optimizations of the original implementations extend our work to support other types of the while-specific metrics In this case,we show equivalence (in fact,contextual equivalence) i.e..refinements of both directions. for more complicated examples. For instance,we proved the TTAS lock implementation is equivalent to the TAS lock implementation [6]for any client using 6.More Examples the locks.The former tests the lock bit in a nested while loop until it appears to be free,and then uses the atomic getAndSet instruction We have seen a few small examples that illustrate the use of our to update the bit;while the latter directly tries getAndSet until logic,in particular,the WHILE rule.In this section,we discuss other success.The equivalence result between these two lock implemen examples that we have proved,which are summarized in Fig.11 tations shows that no client may observe their differences,includ- Their proofs are in TR [13]. ing the differences on their termination behaviors (e.g.,whether a
Examples. The first example is the T ′′ c and S ′ in Sec. 2. We show its proof in our logic in Fig. 10(a) (for simplicity, below we always assume the ownership of variables). We use X for the counter at the source, and the rely/guarantee conditions say that the counters at the two levels can be updated simultaneously with the effect bit true. The loop invariant above line 2 says that we should have at least one token to execute the loop. The loop body is verified with zero tokens, and should finally restore the invariant token number 1. The gaining of the token may be due to a successful cas at line 4 that corresponds to source steps, or caused by the environment interferences. More specifically, the assertion following line 3 says that we can gain a token if the counters have been updated. If the counters are not updated before the cas at line 4, the cas succeeds and we show the detailed proof at the right part of Fig. 10(a), in which we execute one iteration of the source code and gain a token (applying the ATOM+ rule). This example shows the most straightforward understanding of the WHILE rule: we pay a token at the beginning of an iteration and should be able to gain another token during the execution of the iteration. The next example is more subtle (though simpler). As shown in Fig. 10(b), it is a locally-terminating while loop (i.e., a loop that terminates regardless of environment interferences). We prove it refines skip under the environment Emp. The loop invariant above line 2 says that the number of tokens equals the value of i. If the loop condition (i>0) is satisfied, we pay one token. In the proof of the loop body, we do not (and are not able to) gain more tokens. Instead, the value of i will be decreased in the iteration, enabling us to restore the equality between the number of tokens and i. Other rules and discussions. Another important rule is the HIDE-W rule in Fig. 9. It shows that tokens are just an auxiliary tool, which could be safely discarded (by using ⌊ ⌋w) when the termination-preservation of a command C (say, a while loop) is already established. As we mentioned in Sec. 2, the HIDE-W rule is crucial to handle infinite nondeterminism. It is also important for local reasoning, so that when we verify a thread, we do not have to calculate and specify in the precondition the number of tokens needed by all the while loops. For nested loops, we could use the HIDE-W rule to hide the tokens needed by the inner loop, and use the FRAME rule to add back the tokens needed for the outer loop later when we compose the inner loop with other parts of the outer loop body. The unary FRAME rule in Fig. 9 is similar to the binary one in Fig. 7. Other rules can be found in our TR [13], which are very similar to those in LRG [3], but we give different interpretations to assertions and actions. The binary rules (in Fig. 7) and the unary rules (in Fig. 9) gives us a full proof theory for termination-preserving refinement. We want to remind the readers that the logic does not ensure termination of programs, therefore it is not a logic for total correctness. On the other hand, if we restrict the source code to skip (which always terminates), then our unary rules can be viewed as a proof theory for the total correctness of concurrent programs. Also note that the use of a natural number w as the whilespecific metric is to simplify the presentation only. It is easy to extend our work to support other types of the while-specific metrics for more complicated examples. 6. More Examples We have seen a few small examples that illustrate the use of our logic, in particular, the WHILE rule. In this section, we discuss other examples that we have proved, which are summarized in Fig. 11. Their proofs are in TR [13]. Linearizability & Lock-Freedom Counter and its variants Treiber stack [20] Michael-Scott lock-free queue [14] DGLM lock-free queue [2] Non-Atomic Object Correctness Synchronous queue [16] Correctness of Optimized Algo Counter vs. its variants (Equivalence) TAS lock vs. TTAS lock [6] Figure 11. Verified examples using our logic. Proving linearizability and lock-freedom together for concurrent objects. It has been shown [12] that the verification of linearizability and lock-freedom together can be reduced to verifying a contextual refinement that preserves the termination of any client programs. That is, for any client as the context C, the terminationpreserving refinement C[C] ⊑ C[C] should hold. Here we use C for the concrete implementation of the object, and C for the corresponding abstract atomic operations. C[C] (or C[C]) denotes the whole program where the client accesses the object via method calls to C (or C). The compositionality rules of our logic (Fig. 7) allow us to verify the above contextual refinement by proving R, G, I ⊢ {P}C C{Q}. Then we apply the U2B rule and turn the relational verification to unary reasoning. As in a normal linearizability proof (e.g., [10, 23]), we need to find a single step of C (i.e., the linearization point) that corresponds to the atomic step of C. Here we also have to prove lock-freedom: the failure to make progress (i.e., finish an abstract operation) of a thread must be caused by successful progress of its environment, which can be ensured by the WHILE rule (in Fig. 9) in our logic. We have used the above approach to verify several linearizable and lock-free objects, including Treiber stack [20], MichaelScott lock-free queue [14] and DGLM queue [2]. We can further extend the logic in this paper with the techniques [10] for verifying linearizability of algorithms with non-fixed linearization points, to support more sophisticated examples such as HSY eliminationbased stack and Harris-Michael lock-free list. Verifying concurrent objects whose abstract operations are not atomic. Sometimes we cannot define single atomic operations as the abstract specification of a concurrent object. For objects that implement synchronization between threads, we may have to explicitly take into account the interferences from other threads when defining the abstract behaviors of the current thread. For example, the synchronous queue [16] is a concurrent transfer channel in which each producer presenting an item must wait for a consumer to take this item, and vice versa. The corresponding abstract operations are no longer atomic. We used our logic to prove the contextual refinement between the concrete implementation (from [16], used in Java 6) and a more abstract synchronous queue. The refinement ensures that if a producer (or a consumer) is blocked at the concrete level, it must also be blocked at the source level. Proving equivalence between optimized algorithms and original ones. We also use our logic to show variants of concurrent algorithms are correct optimizations of the original implementations. In this case, we show equivalence (in fact, contextual equivalence), i.e., refinements of both directions. For instance, we proved the TTAS lock implementation is equivalent to the TAS lock implementation [6] for any client using the locks. The former tests the lock bit in a nested while loop until it appears to be free, and then uses the atomic getAndSet instruction to update the bit; while the latter directly tries getAndSet until success. The equivalence result between these two lock implementations shows that no client may observe their differences, including the differences on their termination behaviors (e.g., whether a
client thread may acquire the lock).It gives us the full correctness References of the TTAS lock.As an optimization of TAS lock,it preserves the [1]N.Benton.Simple relational correctness proofs for static analyses and behaviors on both functionality and termination of the latter. program transformations.In POPL,pages 14-25,2004. [2]S.Doherty,L.Groves,V.Luchangco,and M.Moir.Formal verifica- 7.Related Work and Conclusion tion of a practical lock-free queue algorithm.In FORTE,pages 97- Hoffmann et al.[7]propose a program logic to verify lock-freedom 114.2004. of concurrent objects.They reason about termination quantitatively [3]X.Feng.Local rely-guarantee reasoning.In POPL,pages 315-327. by introducing tokens,and model the environment's interference 2009. over the current thread's termination in terms of token transfer.The [4]I.Filipovic.P.O'Hearn,N.Rinetzky.and H.Yang.Abstraction idea is simple and natural,but their logic has very limited support for concurrent objects.Theor.Comput.Sci.,411(51-52):4379-4398. of local reasoning.One needs to know the total number of tokens 2010. needed by each thread (which may have multiple while loops)and [5]A.Gotsman.B.Cook.M.J.Parkinson,and V.Vafeiadis.Proving that the (fixed)number of threads,to calculate the number of tokens for non-blocking algorithms don't block.In POPL,pages 16-28,2009. a thread to lose or initially own.This requirement also disallows [6]M.Herlihy and N.Shavit.The Art of Multiprocessor Programming. their logic to reason about programs with infinite nondeterminism. Morgan Kaufmann,2008. Here we allow a thread to set its effect bit in R/G without knowing [7]J.Hoffmann,M.Marmar,and Z.Shao.Quantitative reasoning for the details of other threads;and other threads can determine by proving lock-freedom.In L/CS.pages 124-133.2013. themselves how many tokens they gain.We also introduce the [8]C.B.Jones.Tentative steps toward a development method for inter- HIDE-W rule to hide the number of tokens and to support infinite fering programs.ACM Trans.Program.Lang.Syst.,5(4):596-619. nondeterminism.Another key difference is that our logic supports 1983. verification of refinement,which is not supported by their logic. [9]X.Leroy.A formally verified compiler back-end.J.Autom.Reason., Gotsman et al.[5]propose program logic and tools to verify 43:363-446.December2009. lock-freedom.Their approach is more heavyweight in that they [10]H.Liang and X.Feng.Modular verification of linearizability with need temporal assertions in the rely/guarantee conditions to spec- non-fixed linearization points.In PLDI.pages 459-470,2013. ify interference between threads,and the rely/guarantee conditions [11]H.Liang,X.Feng,and M.Fu.A rely-guarantee-based simulation for need to be specified iteratively in multiple rounds to break circu- verifying concurrent program transformations.In POPL,pages 455- lar reliance on progress.Moreover,their work relies on third-party 468.2012. tools to check termination of individual threads as closed sequential [12]H.Liang.J.Hoffmann,X.Feng,and Z.Shao.Characterizing progress programs.Therefore they do not have a set of self-contained pro- gram logic rules and a coherent meta-theory as we do.Like Hoff- properties of concurrent objects via contextual refinements.In CON. CUR,pages227-241,2013. mann et al.[7].they do not support refinement verification either. [13]H.Liang,X.Feng,and Z.Shao.Compositional verification of As we explained in Sec.1,none of recent work on general termination-preserving refinement of concurrent programs(extended refinement verification of concurrent programs [11,21,22]and version).Technical report,Univ.of Science and Technology of China on verifying linearizability of concurrent objects [10,23](which May 2014.http://kyhcs.ustcsz.edu.cn/relconcur/rgsimt. can be viewed as a specialized refinement problem)preserves [14]M.M.Michael and M.L.Scott.Simple,fast.and practical non- termination.Seveik et al.equipped their simulation proofs for blocking and blocking concurrent queue algorithms.In PODC,pages CompCertTSO [17]with a well-founded order,following the 267-275.1996. CompCert approach.Their approach is similar to our second at- [15]M.Parkinson,R.Bornat,and C.Calcagno.Variables as resource in tempt explained in Sec.2.thus cannot be applied to prove lock- Hoare logics.In LICS.pages 137-146.2006. freedom of concurrent objects. [16]W.N.Scherer III.D.Lea.and M.L.Scott.Scalable synchronous Conclusion and future work.We propose a new compositional queues.In PPoPP,pages 147-156,2006. simulation RGSim-T to verify termination-preserving refinement [17]J.Sevefk.V.Vafeiadis.F.Z.Nardelli,S.Jagannathan,and P.Sewell between concurrent programs.We also give a rely/guarantee pro- CompCertTSO:A verified compiler for relaxed-memory concurrency gram logic as a proof theory for the simulation.Our logic is the first J.ACM.60(3):22.2013. to support compositional verification of termination-preserving re- [18]T.A.L.Sewell,M.O.Myreen,and G.Klein.Translation validation finement.The simulation and logic are general.They can be used for a verified os kernel.In PLDI,pages 471-482,2013. to verify both correctness of optimizations(where the source may [19]K.Stolen.A method for the development of totally correct shared- not necessarily terminate)and lock-freedom of concurrent objects. state parallel programs.In CONCUR.pages 510-525,1991. As future work,we would like to further extend them with the tech [20]R.K.Treiber.System programming:coping with parallelism.Tech- niques of pending thread pools and speculations [10]to verify ob nical Report RJ 5118,IBM Almaden Research Center,1986. jects with non-fixed linearization points.We also hope to explore [21]A.Turon,D.Dreyer,and L.Birkedal.Unifying refinement and hoare- the possibility of building tools to automate the verification. style reasoning in a logic for higher-order concurrency.In /CFP,pages 377-390.2013. Acknowledgments [22]A.Turon,J.Thamsborg,A.Ahmed,L.Birkedal,and D.Dreyer. We thank anonymous referees for their suggestions and com- Logical relations for fine-grained concurrency.In POPL,pages 343- ments.This work is supported in part by China Scholarship Coun- 356.2013. cil,National Natural Science Foundation of China (NSFC)under [23]V.Vafeiadis.Modular fine-grained concurrency verification.PhD Grant Nos.61229201,61379039 and 91318301.and the National thesis,University of Cambridge,Computer Laboratory,2008. Hi-Tech Research and Development Program of China(Grant [24]V.Vafeiadis.Concurrent separation logic and operational semantics No.2012AA010901).It is also supported in part by DARPA In MFPS,pages 335-351,2011. grants FA8750-10-2-0254 and FA8750-12-2-0293,ONR grant [25]H.Yang.Relational separation logic.Theoretical Computer Science N000141210478,and NSF grants 0915888 and 1065451.Any 375:308-334.2007. opinions,findings,and conclusions contained in this document are those of the authors and do not reflect the views of these agencies
client thread may acquire the lock). It gives us the full correctness of the TTAS lock. As an optimization of TAS lock, it preserves the behaviors on both functionality and termination of the latter. 7. Related Work and Conclusion Hoffmann et al. [7] propose a program logic to verify lock-freedom of concurrent objects. They reason about termination quantitatively by introducing tokens, and model the environment’s interference over the current thread’s termination in terms of token transfer. The idea is simple and natural, but their logic has very limited support of local reasoning. One needs to know the total number of tokens needed by each thread (which may have multiple while loops) and the (fixed) number of threads, to calculate the number of tokens for a thread to lose or initially own. This requirement also disallows their logic to reason about programs with infinite nondeterminism. Here we allow a thread to set its effect bit in R/G without knowing the details of other threads; and other threads can determine by themselves how many tokens they gain. We also introduce the HIDE-W rule to hide the number of tokens and to support infinite nondeterminism. Another key difference is that our logic supports verification of refinement, which is not supported by their logic. Gotsman et al. [5] propose program logic and tools to verify lock-freedom. Their approach is more heavyweight in that they need temporal assertions in the rely/guarantee conditions to specify interference between threads, and the rely/guarantee conditions need to be specified iteratively in multiple rounds to break circular reliance on progress. Moreover, their work relies on third-party tools to check termination of individual threads as closed sequential programs. Therefore they do not have a set of self-contained program logic rules and a coherent meta-theory as we do. Like Hoffmann et al. [7], they do not support refinement verification either. As we explained in Sec. 1, none of recent work on general refinement verification of concurrent programs [11, 21, 22] and on verifying linearizability of concurrent objects [10, 23] (which can be viewed as a specialized refinement problem) preserves termination. Sevˇc´ık et al. equipped their simulation proofs for ˇ CompCertTSO [17] with a well-founded order, following the CompCert approach. Their approach is similar to our second attempt explained in Sec. 2, thus cannot be applied to prove lockfreedom of concurrent objects. Conclusion and future work. We propose a new compositional simulation RGSim-T to verify termination-preserving refinement between concurrent programs. We also give a rely/guarantee program logic as a proof theory for the simulation. Our logic is the first to support compositional verification of termination-preserving re- finement. The simulation and logic are general. They can be used to verify both correctness of optimizations (where the source may not necessarily terminate) and lock-freedom of concurrent objects. As future work, we would like to further extend them with the techniques of pending thread pools and speculations [10] to verify objects with non-fixed linearization points. We also hope to explore the possibility of building tools to automate the verification. Acknowledgments We thank anonymous referees for their suggestions and comments. This work is supported in part by China Scholarship Council, National Natural Science Foundation of China (NSFC) under Grant Nos. 61229201, 61379039 and 91318301, and the National Hi-Tech Research and Development Program of China (Grant No. 2012AA010901). It is also supported in part by DARPA grants FA8750-10-2-0254 and FA8750-12-2-0293, ONR grant N000141210478, and NSF grants 0915888 and 1065451. Any opinions, findings, and conclusions contained in this document are those of the authors and do not reflect the views of these agencies. References [1] N. Benton. Simple relational correctness proofs for static analyses and program transformations. In POPL, pages 14–25, 2004. [2] S. Doherty, L. Groves, V. Luchangco, and M. Moir. Formal verification of a practical lock-free queue algorithm. In FORTE, pages 97– 114, 2004. [3] X. Feng. Local rely-guarantee reasoning. In POPL, pages 315–327, 2009. [4] I. Filipovic, P. O’Hearn, N. Rinetzky, and H. Yang. Abstraction for concurrent objects. Theor. Comput. Sci., 411(51-52):4379–4398, 2010. [5] A. Gotsman, B. Cook, M. J. Parkinson, and V. Vafeiadis. Proving that non-blocking algorithms don’t block. In POPL, pages 16–28, 2009. [6] M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann, 2008. [7] J. Hoffmann, M. Marmar, and Z. Shao. Quantitative reasoning for proving lock-freedom. In LICS, pages 124–133, 2013. [8] C. B. Jones. Tentative steps toward a development method for interfering programs. ACM Trans. Program. Lang. Syst., 5(4):596–619, 1983. [9] X. Leroy. A formally verified compiler back-end. J. Autom. Reason., 43:363–446, December 2009. [10] H. Liang and X. Feng. Modular verification of linearizability with non-fixed linearization points. In PLDI, pages 459–470, 2013. [11] H. Liang, X. Feng, and M. Fu. A rely-guarantee-based simulation for verifying concurrent program transformations. In POPL, pages 455– 468, 2012. [12] H. Liang, J. Hoffmann, X. Feng, and Z. Shao. Characterizing progress properties of concurrent objects via contextual refinements. In CONCUR, pages 227–241, 2013. [13] H. Liang, X. Feng, and Z. Shao. Compositional verification of termination-preserving refinement of concurrent programs (extended version). Technical report, Univ. of Science and Technology of China, May 2014. http://kyhcs.ustcsz.edu.cn/relconcur/rgsimt. [14] M. M. Michael and M. L. Scott. Simple, fast, and practical nonblocking and blocking concurrent queue algorithms. In PODC, pages 267–275, 1996. [15] M. Parkinson, R. Bornat, and C. Calcagno. Variables as resource in Hoare logics. In LICS, pages 137–146, 2006. [16] W. N. Scherer III, D. Lea, and M. L. Scott. Scalable synchronous queues. In PPoPP, pages 147–156, 2006. [17] J. Sevˇc´ık, V. Vafeiadis, F. Z. Nardelli, S. Jagannathan, and P. Sewell. ˇ CompCertTSO: A verified compiler for relaxed-memory concurrency. J. ACM, 60(3):22, 2013. [18] T. A. L. Sewell, M. O. Myreen, and G. Klein. Translation validation for a verified os kernel. In PLDI, pages 471–482, 2013. [19] K. Stølen. A method for the development of totally correct sharedstate parallel programs. In CONCUR, pages 510–525, 1991. [20] R. K. Treiber. System programming: coping with parallelism. Technical Report RJ 5118, IBM Almaden Research Center, 1986. [21] A. Turon, D. Dreyer, and L. Birkedal. Unifying refinement and hoarestyle reasoning in a logic for higher-order concurrency. In ICFP, pages 377–390, 2013. [22] A. Turon, J. Thamsborg, A. Ahmed, L. Birkedal, and D. Dreyer. Logical relations for fine-grained concurrency. In POPL, pages 343– 356, 2013. [23] V. Vafeiadis. Modular fine-grained concurrency verification. PhD thesis, University of Cambridge, Computer Laboratory, 2008. [24] V. Vafeiadis. Concurrent separation logic and operational semantics. In MFPS, pages 335–351, 2011. [25] H. Yang. Relational separation logic. Theoretical Computer Science, 375:308–334, 2007