Examples.The first example is the T"and S'in Sec.2.We show Counter and its variants its proof in our logic in Fig.10(a)(for simplicity,below we always Linearizability Lock-Freedom Treiber stack [20] assume the ownership of variables).We use X for the counter at Michael-Scott lock-free queue [14] the source,and the rely/guarantee conditions say that the counters DGLM lock-free queue [2] Non-Atomic Object Correctness Synchronous queue [16] at the two levels can be updated simultaneously with the effect bit Correctness of Optimized Algo Counter vs.its variants true.The loop invariant above line 2 says that we should have at (Equivalence) TAS lock vs.TTAS lock6可 least one token to execute the loop.The loop body is verified with zero tokens,and should finally restore the invariant token number Figure 11.Verified examples using our logic. 1.The gaining of the token may be due to a successful cas at line 4 that corresponds to source steps,or caused by the environment interferences.More specifically,the assertion following line 3 says Proving linearizability and lock-freedom together for concurrent that we can gain a token if the counters have been updated.If the objects.It has been shown [12]that the verification of lineariz- counters are not updated before the cas at line 4,the cas succeeds ability and lock-freedom together can be reduced to verifying a and we show the detailed proof at the right part of Fig.10(a),in contextual refinement that preserves the termination of any client which we execute one iteration of the source code and gain a token programs.That is,for any client as the context,the termination- (applying the ATOM rule). preserving refinement CC should hold.Here we use C This example shows the most straightforward understanding of for the concrete implementation of the object,and C for the corre- the WHILE rule:we pay a token at the beginning of an iteration sponding abstract atomic operations.C(or C)denotes the and should be able to gain another token during the execution of whole program where the client accesses the object via method the iteration.The next example is more subtle (though simpler). calls to C (or C). As shown in Fig.10(b),it is a locally-terminating while loop (i.e. The compositionality rules of our logic (Fig.7)allow us to ver- a loop that terminates regardless of environment interferences) ify the above contextual refinement by proving R,G,I[P)C We prove it refines skip under the environment Emp.The loop C(Q).Then we apply the U2B rule and turn the relational ver- invariant above line 2 says that the number of tokens equals the ification to unary reasoning.As in a normal linearizability proof value of i.If the loop condition (i>o)is satisfied,we pay one (e.g.,[10,23]),we need to find a single step of C(i.e.,the lin- token.In the proof of the loop body,we do not (and are not able earization point)that corresponds to the atomic step of C.Here we to)gain more tokens.Instead,the value of i will be decreased in also have to prove lock-freedom:the failure to make progress (i.e., the iteration,enabling us to restore the equality between the number finish an abstract operation)of a thread must be caused by success- of tokens and i. ful progress of its environment,which can be ensured by the WHILE rule (in Fig.9)in our logic. Other rules and discussions.Another important rule is the We have used the above approach to verify several lineariz- HIDE-W rule in Fig.9.It shows that tokens are just an auxiliary able and lock-free objects,including Treiber stack [20].Michael- tool,which could be safely discarded (by using_w)when the Scott lock-free queue [14]and DGLM queue [2].We can further termination-preservation of a command C(say,a while loop)is extend the logic in this paper with the techniques [10]for verify- already established.As we mentioned in Sec.2.the HIDE-W rule ing linearizability of algorithms with non-fixed linearization points, is crucial to handle infinite nondeterminism.It is also important for to support more sophisticated examples such as HSY elimination- local reasoning,so that when we verify a thread,we do not have based stack and Harris-Michael lock-free list. to calculate and specify in the precondition the number of tokens needed by all the while loops.For nested loops,we could use the Verifying concurrent objects whose abstract operations are not HIDE-W rule to hide the tokens needed by the inner loop,and use atomic.Sometimes we cannot define single atomic operations as the FRAME rule to add back the tokens needed for the outer loop the abstract specification of a concurrent object.For objects that later when we compose the inner loop with other parts of the outer implement synchronization between threads.we may have to ex- loop body plicitly take into account the interferences from other threads when The unary FRAME rule in Fig.9 is similar to the binary one in defining the abstract behaviors of the current thread.For exam- Fig.7.Other rules can be found in our TR [13],which are very ple,the synchronous queue [16]is a concurrent transfer channel in similar to those in LRG [3],but we give different interpretations to which each producer presenting an item must wait for a consumer assertions and actions. to take this item,and vice versa.The corresponding abstract opera- The binary rules(in Fig.7)and the unary rules(in Fig.9)gives tions are no longer atomic.We used our logic to prove the contex- us a full proof theory for termination-preserving refinement.We tual refinement between the concrete implementation (from [16] want to remind the readers that the logic does not ensure termina- used in Java 6)and a more abstract synchronous queue.The refine- tion of programs,therefore it is not a logic for total correctness.On ment ensures that if a producer (or a consumer)is blocked at the the other hand,if we restrict the source code to skip (which always concrete level,it must also be blocked at the source level. terminates),then our unary rules can be viewed as a proof theory for the total correctness of concurrent programs. Proving equivalence between optimized algorithms and original ones. We also use our logic to show variants of concurrent algo- Also note that the use of a natural number w as the while- specific metric is to simplify the presentation only.It is easy to rithms are correct optimizations of the original implementations extend our work to support other types of the while-specific metrics In this case,we show equivalence (in fact,contextual equivalence) i.e..refinements of both directions. for more complicated examples. For instance,we proved the TTAS lock implementation is equivalent to the TAS lock implementation [6]for any client using 6.More Examples the locks.The former tests the lock bit in a nested while loop until it appears to be free,and then uses the atomic getAndSet instruction We have seen a few small examples that illustrate the use of our to update the bit;while the latter directly tries getAndSet until logic,in particular,the WHILE rule.In this section,we discuss other success.The equivalence result between these two lock implemen examples that we have proved,which are summarized in Fig.11 tations shows that no client may observe their differences,includ- Their proofs are in TR [13]. ing the differences on their termination behaviors (e.g.,whether aExamples. The first example is the T ′′ c and S ′ in Sec. 2. We show its proof in our logic in Fig. 10(a) (for simplicity, below we always assume the ownership of variables). We use X for the counter at the source, and the rely/guarantee conditions say that the counters at the two levels can be updated simultaneously with the effect bit true. The loop invariant above line 2 says that we should have at least one token to execute the loop. The loop body is verified with zero tokens, and should finally restore the invariant token number 1. The gaining of the token may be due to a successful cas at line 4 that corresponds to source steps, or caused by the environment interferences. More specifically, the assertion following line 3 says that we can gain a token if the counters have been updated. If the counters are not updated before the cas at line 4, the cas succeeds and we show the detailed proof at the right part of Fig. 10(a), in which we execute one iteration of the source code and gain a token (applying the ATOM+ rule). This example shows the most straightforward understanding of the WHILE rule: we pay a token at the beginning of an iteration and should be able to gain another token during the execution of the iteration. The next example is more subtle (though simpler). As shown in Fig. 10(b), it is a locally-terminating while loop (i.e., a loop that terminates regardless of environment interferences). We prove it refines skip under the environment Emp. The loop invariant above line 2 says that the number of tokens equals the value of i. If the loop condition (i>0) is satisfied, we pay one token. In the proof of the loop body, we do not (and are not able to) gain more tokens. Instead, the value of i will be decreased in the iteration, enabling us to restore the equality between the number of tokens and i. Other rules and discussions. Another important rule is the HIDE-W rule in Fig. 9. It shows that tokens are just an auxiliary tool, which could be safely discarded (by using ⌊ ⌋w) when the termination-preservation of a command C (say, a while loop) is already established. As we mentioned in Sec. 2, the HIDE-W rule is crucial to handle infinite nondeterminism. It is also important for local reasoning, so that when we verify a thread, we do not have to calculate and specify in the precondition the number of tokens needed by all the while loops. For nested loops, we could use the HIDE-W rule to hide the tokens needed by the inner loop, and use the FRAME rule to add back the tokens needed for the outer loop later when we compose the inner loop with other parts of the outer loop body. The unary FRAME rule in Fig. 9 is similar to the binary one in Fig. 7. Other rules can be found in our TR [13], which are very similar to those in LRG [3], but we give different interpretations to assertions and actions. The binary rules (in Fig. 7) and the unary rules (in Fig. 9) gives us a full proof theory for termination-preserving refinement. We want to remind the readers that the logic does not ensure termination of programs, therefore it is not a logic for total correctness. On the other hand, if we restrict the source code to skip (which always terminates), then our unary rules can be viewed as a proof theory for the total correctness of concurrent programs. Also note that the use of a natural number w as the whilespecific metric is to simplify the presentation only. It is easy to extend our work to support other types of the while-specific metrics for more complicated examples. 6. More Examples We have seen a few small examples that illustrate the use of our logic, in particular, the WHILE rule. In this section, we discuss other examples that we have proved, which are summarized in Fig. 11. Their proofs are in TR [13]. Linearizability & Lock-Freedom Counter and its variants Treiber stack [20] Michael-Scott lock-free queue [14] DGLM lock-free queue [2] Non-Atomic Object Correctness Synchronous queue [16] Correctness of Optimized Algo Counter vs. its variants (Equivalence) TAS lock vs. TTAS lock [6] Figure 11. Verified examples using our logic. Proving linearizability and lock-freedom together for concurrent objects. It has been shown [12] that the verification of linearizability and lock-freedom together can be reduced to verifying a contextual refinement that preserves the termination of any client programs. That is, for any client as the context C, the terminationpreserving refinement C[C] ⊑ C[C] should hold. Here we use C for the concrete implementation of the object, and C for the corresponding abstract atomic operations. C[C] (or C[C]) denotes the whole program where the client accesses the object via method calls to C (or C). The compositionality rules of our logic (Fig. 7) allow us to verify the above contextual refinement by proving R, G, I ⊢ {P}C C{Q}. Then we apply the U2B rule and turn the relational verification to unary reasoning. As in a normal linearizability proof (e.g., [10, 23]), we need to find a single step of C (i.e., the linearization point) that corresponds to the atomic step of C. Here we also have to prove lock-freedom: the failure to make progress (i.e., finish an abstract operation) of a thread must be caused by successful progress of its environment, which can be ensured by the WHILE rule (in Fig. 9) in our logic. We have used the above approach to verify several linearizable and lock-free objects, including Treiber stack [20], MichaelScott lock-free queue [14] and DGLM queue [2]. We can further extend the logic in this paper with the techniques [10] for verifying linearizability of algorithms with non-fixed linearization points, to support more sophisticated examples such as HSY eliminationbased stack and Harris-Michael lock-free list. Verifying concurrent objects whose abstract operations are not atomic. Sometimes we cannot define single atomic operations as the abstract specification of a concurrent object. For objects that implement synchronization between threads, we may have to explicitly take into account the interferences from other threads when defining the abstract behaviors of the current thread. For example, the synchronous queue [16] is a concurrent transfer channel in which each producer presenting an item must wait for a consumer to take this item, and vice versa. The corresponding abstract operations are no longer atomic. We used our logic to prove the contextual refinement between the concrete implementation (from [16], used in Java 6) and a more abstract synchronous queue. The refinement ensures that if a producer (or a consumer) is blocked at the concrete level, it must also be blocked at the source level. Proving equivalence between optimized algorithms and original ones. We also use our logic to show variants of concurrent algorithms are correct optimizations of the original implementations. In this case, we show equivalence (in fact, contextual equivalence), i.e., refinements of both directions. For instance, we proved the TTAS lock implementation is equivalent to the TAS lock implementation [6] for any client using the locks. The former tests the lock bit in a nested while loop until it appears to be free, and then uses the atomic getAndSet instruction to update the bit; while the latter directly tries getAndSet until success. The equivalence result between these two lock implementations shows that no client may observe their differences, including the differences on their termination behaviors (e.g., whether a