A Program Logic for Concurrent Objects under Fair Scheduling Hongjin Liang Xinyu Feng School of Computer Science and Technology Suzhou Institute for Advanced Study University of Science and Technology of China Ihj1018@ustc.edu.cn xyfengOustc.edu.cn Abstract In addition to linearizability,a safety property,object imple- Existing work on verifying concurrent objects is mostly concerned mentations are expected to also satisfy progress properties.The with safety only,e.g.,partial correctness or linearizability.Although non-blocking progress properties,such as wait-freedom and lock- there has been recent work verifying lock-freedom of non-blocking freedom which have been studied a lot (e.g..[5,10,16,24]).guar- antee the termination of the method calls independently of how the objects,much less efforts are focused on deadlock-freedom and threads are scheduled.Unfortunately these properties are too strong starvation-freedom,progress properties of blocking objects.These properties are more challenging to verify than lock-freedom because to be satisfied by algorithms with blocking synchronization.For clients using lock-based objects,a delay of a thread holding a lock they allow the progress of one thread to depend on the progress of will block other threads requesting the lock.Thus their progress another,assuming fair scheduling. relies on the assumption that every thread holding the lock will We propose LiLi,a new rely-guarantee style program logic for eventually be scheduled to release it.This assumption requires fair verifying linearizability and progress together for concurrent objects under fair scheduling.The rely-guarantee style logic unifies thread- scheduling,i.e.,every thread gets eventually executed.As summa- modular reasoning about both starvation-freedom and deadlock rized by Herlihy and Shavit [14].there are two desirable progress freedom in one framework.It also establishes progress-aware properties for blocking algorithms,both assuming fair scheduling: abstraction for concurrent objects,which can be applied when Deadlock-freedom:In each fair execution,there always exists verifying safety and liveness of client code.We have successfully some method call that can finish.It disallows the situation in applied the logic to verify starvation-freedom or deadlock-freedom which multiple threads requesting locks are waiting for each of representative algorithms such as ticket locks,queue locks,lock- other to release the locks in hand.It ensures the absence of coupling lists,optimistic lists and lazy lists. livelock.but not starvation. Categories and Subject Descriptors D.2.4 [Software Engineer- Starvation-freedom:Every method call should finish in fair ing]:Software/Program Verification-Correctness proofs,Formal executions.It requires that every thread attempting to acquire methods;F.3.1 [Logics and Meanings of Programs]:Specifying a lock should eventually succeed and in the end release the and Verifying and Reasoning about Programs lock.Starvation-freedom is stronger than deadlock-freedom. Nevertheless it can often be achieved by using fair locks [13]. General Terms Theory,Verification Recent program logics for verifying concurrent objects either Keywords Concurrency,Progress,Program Logic,Rely-Guarantee prove only linearizability and ignore the issue of termination (e.g.,[6, Reasoning,Refinement 21,30,311),or aim for non-blocking progress properties(e.g.,[5. 10,16,24]),which cannot be applied to blocking algorithms that progress only under fair scheduling.The fairness assumption intro- 1.Introduction duces complicated interdependencies among progress properties of A concurrent object or library provides a set of methods that al- threads,making it incredibly more challenging to verify the lock- low multiple client threads to manipulate the shared data structure. based algorithms than their non-blocking counterparts.We will Blocking synchronization (i.e..mutual exclusion locks),as a straight explain the challenges in detail in Sec.2. forward technique to ensure exclusive accesses and to control the It is important to note that,although there has been much work interference,has been widely-used in object implementations to on deadlock detection or deadlock-freedom verification (e.g.,[4,20. achieve linearizability,which ensures the object methods behave as 32]),deadlock-freedom is often defined as a safety property,which atomic operations in a concurrent setting. ensures the lack of circular waiting for locks.It does not prevent live- lock or non-termination inside the critical section.Another limitation of this kind of work is that it often assumes built-in lock primitives, and lacks support of ad-hoc synchronization (e.g.,mutual exclusion Permission to make digital or hard copies of all or part of this work for classroom use is granted without fee wided that are not made or distribut achieved using spin-locks implemented by the programmers).The for profit or commercial advantage and that copic sbear this notice and the full citati deadlock-freedom we discuss in this paper is a liveness property and on the first page.Copyrights for components of this work owned by others than AC must be honored.Abstracting with credit is permitted.To copy otherwise,or republish. its definition does not rely on built-in lock primitives.We discuss to post on servers or to redistribute to lists,requires prior specific permission and/or a more related work on liveness verification in Sec.8. fee.Request permissions from Permissions@acm.org. In this paper we propose LiLi,a new rely-guarantee style logic POPL'/6,January 20-22,2016,St.Petersburg,FL.USA for concurrent objects under fair scheduling.LiLi is the first program logic that unifies verification of linearizability,starvation-freedom and deadlock-freedom in one framework (the name LiLi stands for 385
A Program Logic for Concurrent Objects under Fair Scheduling Hongjin Liang Xinyu Feng School of Computer Science and Technology & Suzhou Institute for Advanced Study University of Science and Technology of China lhj1018@ustc.edu.cn xyfeng@ustc.edu.cn Abstract Existing work on verifying concurrent objects is mostly concerned with safety only, e.g., partial correctness or linearizability. Although there has been recent work verifying lock-freedom of non-blocking objects, much less efforts are focused on deadlock-freedom and starvation-freedom, progress properties of blocking objects. These properties are more challenging to verify than lock-freedom because they allow the progress of one thread to depend on the progress of another, assuming fair scheduling. We propose LiLi, a new rely-guarantee style program logic for verifying linearizability and progress together for concurrent objects under fair scheduling. The rely-guarantee style logic unifies threadmodular reasoning about both starvation-freedom and deadlockfreedom in one framework. It also establishes progress-aware abstraction for concurrent objects, which can be applied when verifying safety and liveness of client code. We have successfully applied the logic to verify starvation-freedom or deadlock-freedom of representative algorithms such as ticket locks, queue locks, lockcoupling lists, optimistic lists and lazy lists. Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification – Correctness proofs, Formal methods; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs General Terms Theory, Verification Keywords Concurrency, Progress, Program Logic, Rely-Guarantee Reasoning, Refinement 1. Introduction A concurrent object or library provides a set of methods that allow multiple client threads to manipulate the shared data structure. Blocking synchronization (i.e., mutual exclusion locks), as a straightforward technique to ensure exclusive accesses and to control the interference, has been widely-used in object implementations to achieve linearizability, which ensures the object methods behave as atomic operations in a concurrent setting. In addition to linearizability, a safety property, object implementations are expected to also satisfy progress properties. The non-blocking progress properties, such as wait-freedom and lockfreedom which have been studied a lot (e.g., [5, 10, 16, 24]), guarantee the termination of the method calls independently of how the threads are scheduled. Unfortunately these properties are too strong to be satisfied by algorithms with blocking synchronization. For clients using lock-based objects, a delay of a thread holding a lock will block other threads requesting the lock. Thus their progress relies on the assumption that every thread holding the lock will eventually be scheduled to release it. This assumption requires fair scheduling, i.e., every thread gets eventually executed. As summarized by Herlihy and Shavit [14], there are two desirable progress properties for blocking algorithms, both assuming fair scheduling: • Deadlock-freedom: In each fair execution, there always exists some method call that can finish. It disallows the situation in which multiple threads requesting locks are waiting for each other to release the locks in hand. It ensures the absence of livelock, but not starvation. • Starvation-freedom: Every method call should finish in fair executions. It requires that every thread attempting to acquire a lock should eventually succeed and in the end release the lock. Starvation-freedom is stronger than deadlock-freedom. Nevertheless it can often be achieved by using fair locks [13]. Recent program logics for verifying concurrent objects either prove only linearizability and ignore the issue of termination (e.g., [6, 21, 30, 31]), or aim for non-blocking progress properties (e.g., [5, 10, 16, 24]), which cannot be applied to blocking algorithms that progress only under fair scheduling. The fairness assumption introduces complicated interdependencies among progress properties of threads, making it incredibly more challenging to verify the lockbased algorithms than their non-blocking counterparts. We will explain the challenges in detail in Sec. 2. It is important to note that, although there has been much work on deadlock detection or deadlock-freedom verification (e.g., [4, 20, 32]), deadlock-freedom is often defined as a safety property, which ensures the lack of circular waiting for locks. It does not prevent livelock or non-termination inside the critical section. Another limitation of this kind of work is that it often assumes built-in lock primitives, and lacks support of ad-hoc synchronization (e.g., mutual exclusion achieved using spin-locks implemented by the programmers). The deadlock-freedom we discuss in this paper is a liveness property and its definition does not rely on built-in lock primitives. We discuss more related work on liveness verification in Sec. 8. In this paper we propose LiLi, a new rely-guarantee style logic for concurrent objects under fair scheduling. LiLi is the first program logic that unifies verification of linearizability, starvation-freedom and deadlock-freedom in one framework (the name LiLi stands for Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org. POPL’16, January 20–22, 2016, St. Petersburg, FL, USA c 2016 ACM. 978-1-4503-3549-2/16/01...$15.00 http://dx.doi.org/10.1145/2837614.2837635 385
Linearizability and Liveness).It supports verification of both mutex- (a)abstract operation INC:; based pessimistic algorithms(including fine-grained ones such as dfInc: lock-coupling lists)and optimistic ones such as optimistic lists and 1 local b :false,ri lazy lists.The unified approach allows us to prove in the same logic 2 while (!b){b :cas(&L,0,cid);/lock L for instance,the lock-coupling list algorithm is starvation-free if we 3 r :x;x :=r 1;/critical section use fair locks,e.g.,ticket locks [25],and is deadlock-free if regular 4 L :0;/unlock L test-and-set (TAS)based spin locks are used.Our work is based on (b)deadlock-free implementation df Inc using a test-and-set lock earlier work on concurrency verification,but we make the following new contributions: sfInc: 11oca11,0,r: 2 i:=getAndInc(next); We divide environment interference that affects progress of a 3 o :owner;while (i !o)o :owner; thread into two classes.namely blocking and delay.We show 4 r=x;x:=r+1; /critical section different occurrences of them correspond to the classification of owner :i+1; progress into wait-freedom,lock-freedom,starvation-freedom (c)starvation-free implementation sf Inc using a ticket lock and deadlock-freedom (see Sec.2.2.1 and Sec.6).Recognizing the two classes of interference allows us to come up with Figure 1.Counters different mechanisms in our program logic to reason about them separately.Our logic also provides parameterized specifications which can be instantiated to choose different combinations of objects too in Sec.6.Finally,we summarize the examples we have the mechanisms.This gives us a unified program logic that can verified in Sec.7.and discuss related work and conclude in Sec.8. verify different progress properties using the same set of rules. We propose two novel mechanisms.definite actions and strat- 2.Informal Development ified tokens,to reason about blocking and delay,respectively Below we first give an overview of the traditional rely-guarantee They are also our key techniques to avoid circularity in rely. logic for safety proofs [18],and the way to encode linearizability guarantee style liveness reasoning.A definite action character- verification in the logic.Then we explain the challenges and our izes a thread's progress that does not rely on the progress of the ideas in supporting liveness verification under fair scheduling. environment.Each blocked thread waits for a queue of definite actions.Starvation-freedom requires the length of the queue be 2.1 Background strictly decreasing,while deadlock-freedom allows disciplined queue jumps based on the token-transfer ideas [16.24].To avoid Rely-guarantee reasoning.In rely-guarantee reasoning [18]. circular delay,we further generalize the token-transfer ideas by each thread is verified in isolation under some assumptions on stratifying tokens into multiple levels,which enables us to verify its environment (i.e.,the other threads in the system).The judgment complex algorithms that involve both nested locks and rollbacks is in the form of R,GPC,where the pre-and post- (e.g.,the optimistic list algorithm). conditions P and Q specify the initial and final states respectively. The rely condition R specifies the assumptions on the environ- By verifying linearizability and progress together,we can pro- ment,which are the permitted state transitions that the environment vide progress-aware abstractions for concurrent objects (see threads may have.The guarantee condition G specifies the possible Sec.5).Our logic is based on termination-preserving simula- transitions made by the thread itself.To ensure that parallel threads tions as the meta-theory,which establish contextual refinements can collaborate,the guarantee of each thread needs to satisfy the that assume fair scheduling at both the concrete and the abstract rely of every other thread. levels.We prove the contextual refinements are equivalent to linearizability and starvation-freedom/deadlock-freedom.The Encoding linearizability verification.As a working example. refinements allow us to replace object implementations with Fig.1(b)shows a counter object dfInc implemented with a test- progress-aware abstract specifications when the client code is and-set (TAS)lock L.Verifying linearizability of dfInc requires verified.As far as we know,our abstraction for deadlock-free us to prove it has the same abstract behaviors as INC in Fig.1(a), (and linearizable)objects has never been proposed before. which increments the counter x atomically. We have applied our logic to verify simple objects with coarse- Following previous work [21,24,31],one can extend a rely- grained synchronization using TAS locks,ticket locks [25]and guarantee logic to verify linearizability.We use an assertion various queue locks (including Anderson array-based locks. arem(C)to specify as an auxiliary state the abstract operation C to CLH locks and MCS locks)[13].For examples with more be fulfilled,and logically execute C at the linearization point (LP) permissive locking schemes,we have successfully verified the of the concrete implementation.For dfInc,we prove a judgment in two-lock queues,and various fine-grained and optimistic list the form of R.GP A arem(INC)dfIncfQ A arem(skip) algorithms.To the best of our knowledge,we are the first to Here R and G specify the object's actions (i.e.,lock acquire and formally verify the starvation-freedom/deadlock-freedom of release,and the counter updates at both the concrete and the abstract lock-coupling lists,optimistic lists and lazy lists. sides)made by the environment and the current thread respectively. P and O are relational assertions specifying the consistency relation Notice that with the assumption of fair scheduling,wait-freedom between the program states at the concrete and the abstract sides. and lock-freedom are equivalent to starvation-freedom and deadlock- The postcondition arem(skip)shows that at the end of df Inc there freedom,respectively.Therefore our logic can also be applied to is no abstract operation to fulfill. verify wait-free and lock-free algorithms.We discuss this in Sec.6. 2.2 In the rest of this paper,we first analyze the challenges and Challenges of Progress Verification explain our approach informally in Sec.2.Then we give the basic Progress properties of objects such as deadlock-freedom and technical setting in Sec.3,and present our logic in Sec.4,whose starvation-freedom have various termination requirements of ob- soundness theorem,together with the abstraction theorem,is given ject methods.They must be satisfied with interference from other in Sec.5.We discuss how our logic supports wait-free and lock-free threads considered,which makes the verification challenging. 386
Linearizability and Liveness). It supports verification of both mutexbased pessimistic algorithms (including fine-grained ones such as lock-coupling lists) and optimistic ones such as optimistic lists and lazy lists. The unified approach allows us to prove in the same logic, for instance, the lock-coupling list algorithm is starvation-free if we use fair locks, e.g., ticket locks [25], and is deadlock-free if regular test-and-set (TAS) based spin locks are used. Our work is based on earlier work on concurrency verification, but we make the following new contributions: • We divide environment interference that affects progress of a thread into two classes, namely blocking and delay. We show different occurrences of them correspond to the classification of progress into wait-freedom, lock-freedom, starvation-freedom and deadlock-freedom (see Sec. 2.2.1 and Sec. 6). Recognizing the two classes of interference allows us to come up with different mechanisms in our program logic to reason about them separately. Our logic also provides parameterized specifications, which can be instantiated to choose different combinations of the mechanisms. This gives us a unified program logic that can verify different progress properties using the same set of rules. • We propose two novel mechanisms, definite actions and stratified tokens, to reason about blocking and delay, respectively. They are also our key techniques to avoid circularity in relyguarantee style liveness reasoning. A definite action characterizes a thread’s progress that does not rely on the progress of the environment. Each blocked thread waits for a queue of definite actions. Starvation-freedom requires the length of the queue be strictly decreasing, while deadlock-freedom allows disciplined queue jumps based on the token-transfer ideas [16, 24]. To avoid circular delay, we further generalize the token-transfer ideas by stratifying tokens into multiple levels, which enables us to verify complex algorithms that involve both nested locks and rollbacks (e.g., the optimistic list algorithm). • By verifying linearizability and progress together, we can provide progress-aware abstractions for concurrent objects (see Sec. 5). Our logic is based on termination-preserving simulations as the meta-theory, which establish contextual refinements that assume fair scheduling at both the concrete and the abstract levels. We prove the contextual refinements are equivalent to linearizability and starvation-freedom/deadlock-freedom. The refinements allow us to replace object implementations with progress-aware abstract specifications when the client code is verified. As far as we know, our abstraction for deadlock-free (and linearizable) objects has never been proposed before. • We have applied our logic to verify simple objects with coarsegrained synchronization using TAS locks, ticket locks [25] and various queue locks (including Anderson array-based locks, CLH locks and MCS locks) [13]. For examples with more permissive locking schemes, we have successfully verified the two-lock queues, and various fine-grained and optimistic list algorithms. To the best of our knowledge, we are the first to formally verify the starvation-freedom/deadlock-freedom of lock-coupling lists, optimistic lists and lazy lists. Notice that with the assumption of fair scheduling, wait-freedom and lock-freedom are equivalent to starvation-freedom and deadlockfreedom, respectively. Therefore our logic can also be applied to verify wait-free and lock-free algorithms. We discuss this in Sec. 6. In the rest of this paper, we first analyze the challenges and explain our approach informally in Sec. 2. Then we give the basic technical setting in Sec. 3, and present our logic in Sec. 4, whose soundness theorem, together with the abstraction theorem, is given in Sec. 5. We discuss how our logic supports wait-free and lock-free (a) abstract operation INC: ; dfInc : 1 local b := false, r; 2 while (!b) { b := cas(&L, 0, cid); } // lock L 3 r := x; x := r + 1; // critical section 4 L := 0; // unlock L (b) deadlock-free implementation dfInc using a test-and-set lock sfInc : 1 local i, o, r; 2 i := getAndInc(next); 3 o := owner; while (i != o) { o := owner; } 4 r := x; x := r + 1; // critical section 5 owner := i + 1; (c) starvation-free implementation sfInc using a ticket lock Figure 1. Counters. objects too in Sec. 6. Finally, we summarize the examples we have verified in Sec. 7, and discuss related work and conclude in Sec. 8. 2. Informal Development Below we first give an overview of the traditional rely-guarantee logic for safety proofs [18], and the way to encode linearizability verification in the logic. Then we explain the challenges and our ideas in supporting liveness verification under fair scheduling. 2.1 Background Rely-guarantee reasoning. In rely-guarantee reasoning [18], each thread is verified in isolation under some assumptions on its environment (i.e., the other threads in the system). The judgment is in the form of R, G ` {P}C{Q}, where the pre- and postconditions P and Q specify the initial and final states respectively. The rely condition R specifies the assumptions on the environment, which are the permitted state transitions that the environment threads may have. The guarantee condition G specifies the possible transitions made by the thread itself. To ensure that parallel threads can collaborate, the guarantee of each thread needs to satisfy the rely of every other thread. Encoding linearizability verification. As a working example, Fig. 1(b) shows a counter object dfInc implemented with a testand-set (TAS) lock L. Verifying linearizability of dfInc requires us to prove it has the same abstract behaviors as INC in Fig. 1(a), which increments the counter x atomically. Following previous work [21, 24, 31], one can extend a relyguarantee logic to verify linearizability. We use an assertion arem(C) to specify as an auxiliary state the abstract operation C to be fulfilled, and logically execute C at the linearization point (LP) of the concrete implementation. For dfInc, we prove a judgment in the form of R, G ` {P ∧ arem(INC)}dfInc{Q ∧ arem(skip)}. Here R and G specify the object’s actions (i.e., lock acquire and release, and the counter updates at both the concrete and the abstract sides) made by the environment and the current thread respectively. P and Q are relational assertions specifying the consistency relation between the program states at the concrete and the abstract sides. The postcondition arem(skip) shows that at the end of dfInc there is no abstract operation to fulfill. 2.2 Challenges of Progress Verification Progress properties of objects such as deadlock-freedom and starvation-freedom have various termination requirements of object methods. They must be satisfied with interference from other threads considered, which makes the verification challenging. 386
2.2.1 Non-Termination Caused by Interference (line 5)and testing L2 (line 6).Suppose another thread t'does the In a concurrent setting,an object method may fail to terminate opposite:repeatedly acquiring L2 and testing L1.In this example the due to interference from its environment.Below we point out there acquirement of L2 by t'may cause t to fail its test of the availability are two different kinds of interference that may cause thread non- of L2.The test could have succeeded if t'did not interfere,so t' termination,namely blocking and delay.Let's first see a classic delays t.Conversely,the acquirement of Li by t may delay t'.Then deadlocking example the two threads can cause each other to continually roll back,and DL-12: DL-21: neither method progresses. lock L1;lock L2; lock L2;lock L1; Usually when delay is allowed,we need to make sure that the unlock L2;unlock L1; unlock L1;unlock L2; action delaying other threads is a"good"one in that it makes the The methods DL-12 and DL-21 may fail to terminate because executing thread progress (e.g.,a step towards termination).This is the case with the "benign delays"in the dfInc example and the of the circular dependency of locks.This non-termination is caused optimistic list example.But how do we tell if an action is good or by permanent blocking.That is,when DL-12 tries to acquire L2,it not?The acquirements of locks in Fig.2(b)do seem to be good could be blocked if the lock has been acquired by DL-21. For a second example,the call of the dfInc method (in Fig.1(b)) because they make the threads progress towards termination.How by the left thread below may never terminate. do we prevent such lock acquirements from delaying others,which may cause circular delay? dfInc();while (true)dfInc(); When the left thread tries to acquire the lock,even if the lock is 2.2.3 Ad-Hoc Synchronization and Dynamic Locks available at that time,the thread could be preempted by the right One may argue that the circularity can be avoided by simply thread,who gets the lock ahead of the left.Then the left thread enforcing certain orders of lock acquirements,which has been would fail at the cas command in the code of dfInc and have to a standard way to avoid"deadlock cycles"(note this is a safety loop at least one more round before termination.This may happen property,as we explained in Sec.1).Although lock orders can help infinitely many times,causing non-termination of the df Inc method liveness reasoning,it has many limitations in practice. on the left.In this case we say the progress of the left method is First,the approach cannot apply for ad-hoc synchronization.For delayed by its environment's successful acquirement of the lock. instance.there are no locks in the following deadlocking program The key difference between blocking and delay is that blocking is caused by the absence of certain environment actions,e.g. x:=1; y:=1: releasing a lock,while delay is caused by the occurrence of certain while (y =1)skip; while (x=1)skip; x:=0; y=0; environment actions,e.g.,acquiring the lock needed by the current thread (even if the lock is subsequently released).In other words.a Moreover,sometimes we need to look into the lock implementa blocked thread can progress only if its environment progresses first, tion to prove starvation-freedom.For instance,the df Inc in Fig.1(b) while a delayed thread can progress if we suspend the execution of using a TAS lock is deadlock-free but not starvation-free.If we re- its environment. place the TAS lock with a ticket lock,as in sfInc in Fig.1(c),the Lock-free algorithms disallow blocking (thus they do not rely on counter becomes starvation-free.Again,there are actually no locks fair scheduling),although delay is common,especially in optimistic in the programs if we have to work at a low abstraction level to look algorithms.Starvation-free algorithms allow (limited)blocking,but into lock implementations. not delay.As the dfInc example shows,delay from non-terminating Second,it can be difficult to enforce the ordering for fine-grained clients may cause starvation.Deadlock-free algorithms allow both algorithms on dynamic data structures (e.g.,lock-coupling list). (with restrictions).As the optimistic list in Fig.2(a)(explained in Since the data structure is changing dynamically,the set of locks Sec.2.3.4)shows,blocking and delay can be intertwined by the associated with the nodes is dynamic too.To allow a thread to combined use of blocking-based synchronization and optimistic con- determine dynamically the order of locks,we have to ensure its view currency,which makes the reasoning significantly more challenging of ordering is consistent with all the other threads in the system, than reasoning about lock-free algorithms. a challenge for thread-modular verification.Although dynamic How do we come up with general principles to allow blocking locks are supported in some previous work treating deadlock- and/or delay,but on the other hand to guarantee starvation-freedom freedom as a safety property (e.g.,[4.19)),it is unclear how to or deadlock-freedom? apply the techniques for general progress reasoning,with possible combination of locks,ad-hoc synchronization and rollbacks. 2.2.2 Avoid Circular Reasoning Rely-guarantee style logics provide the power of thread-modular 2.3 Our Approaches verification by circular reasoning.When proving the behaviors To address these problems,our logic enforces the following princi- of a thread t guarantee G,we assume that the behaviors of the ples to permit restricted forms of blocking and delay,but prevent environment thread t'satisfy R.Conversely,the proof of thread t' circular reasoning and non-termination. relies on the assumptions on the behaviors of thread t. First.if a thread is blocked.the events it waits for must eventually However,circular reasoning is usually unsound in liveness occur.To avoid circular reasoning,we find"definite actions"of verification [1].For instance,we could prove termination of each each thread,which under fair scheduling will definitely happen thread in the deadlocking example above,under the assumption that once enabled,regardless of the interference from the environment. each environment thread eventually releases the lock it owns.How Then each blocked thread needs to show it waits for only a finite do we avoid the circular reasoning without sacrificing rely-guarantee number of definite actions from the environment threads.They form style thread-modular reasoning? an acyclic queue,and there is always at least one of them enabled. The deadlocking example shows that we should avoid circular This is what we call"definite progress",which is crucial for proving reasoning to rule out circular dependency caused by blocking.Delay starvation-freedom may also cause circular dependency too.Figure 2(b)shows a thread Second,actions of a thread can delay others only if they are t using two locks.It first acquires L1 (line 1)and then tests whether making the executing object method to move towards termination L2 is available (line 2).If the test fails,the thread rolls back.It Each object method can only execute a finite number of such releases L1 (line 4),and then repeats the process of acquiring L1 delaying actions to avoid indefinite delay.This is enforced by 387
2.2.1 Non-Termination Caused by Interference In a concurrent setting, an object method may fail to terminate due to interference from its environment. Below we point out there are two different kinds of interference that may cause thread nontermination, namely blocking and delay. Let’s first see a classic deadlocking example. DL-12 : DL-21 : lock L1; lock L2; lock L2; lock L1; unlock L2; unlock L1; unlock L1; unlock L2; The methods DL-12 and DL-21 may fail to terminate because of the circular dependency of locks. This non-termination is caused by permanent blocking. That is, when DL-12 tries to acquire L2, it could be blocked if the lock has been acquired by DL-21. For a second example, the call of the dfInc method (in Fig. 1(b)) by the left thread below may never terminate. dfInc(); while (true) dfInc(); When the left thread tries to acquire the lock, even if the lock is available at that time, the thread could be preempted by the right thread, who gets the lock ahead of the left. Then the left thread would fail at the cas command in the code of dfInc and have to loop at least one more round before termination. This may happen infinitely many times, causing non-termination of the dfInc method on the left. In this case we say the progress of the left method is delayed by its environment’s successful acquirement of the lock. The key difference between blocking and delay is that blocking is caused by the absence of certain environment actions, e.g., releasing a lock, while delay is caused by the occurrence of certain environment actions, e.g., acquiring the lock needed by the current thread (even if the lock is subsequently released). In other words, a blocked thread can progress only if its environment progresses first, while a delayed thread can progress if we suspend the execution of its environment. Lock-free algorithms disallow blocking (thus they do not rely on fair scheduling), although delay is common, especially in optimistic algorithms. Starvation-free algorithms allow (limited) blocking, but not delay. As the dfInc example shows, delay from non-terminating clients may cause starvation. Deadlock-free algorithms allow both (with restrictions). As the optimistic list in Fig. 2(a) (explained in Sec. 2.3.4) shows, blocking and delay can be intertwined by the combined use of blocking-based synchronization and optimistic concurrency, which makes the reasoning significantly more challenging than reasoning about lock-free algorithms. How do we come up with general principles to allow blocking and/or delay, but on the other hand to guarantee starvation-freedom or deadlock-freedom? 2.2.2 Avoid Circular Reasoning Rely-guarantee style logics provide the power of thread-modular verification by circular reasoning. When proving the behaviors of a thread t guarantee G, we assume that the behaviors of the environment thread t 0 satisfy R. Conversely, the proof of thread t 0 relies on the assumptions on the behaviors of thread t. However, circular reasoning is usually unsound in liveness verification [1]. For instance, we could prove termination of each thread in the deadlocking example above, under the assumption that each environment thread eventually releases the lock it owns. How do we avoid the circular reasoning without sacrificing rely-guarantee style thread-modular reasoning? The deadlocking example shows that we should avoid circular reasoning to rule out circular dependency caused by blocking. Delay may also cause circular dependency too. Figure 2(b) shows a thread t using two locks. It first acquires L1 (line 1) and then tests whether L2 is available (line 2). If the test fails, the thread rolls back. It releases L1 (line 4), and then repeats the process of acquiring L1 (line 5) and testing L2 (line 6). Suppose another thread t 0 does the opposite: repeatedly acquiring L2 and testing L1. In this example the acquirement of L2 by t 0 may cause t to fail its test of the availability of L2. The test could have succeeded if t 0 did not interfere, so t 0 delays t. Conversely, the acquirement of L1 by t may delay t 0 . Then the two threads can cause each other to continually roll back, and neither method progresses. Usually when delay is allowed, we need to make sure that the action delaying other threads is a “good” one in that it makes the executing thread progress (e.g., a step towards termination). This is the case with the “benign delays” in the dfInc example and the optimistic list example. But how do we tell if an action is good or not? The acquirements of locks in Fig. 2(b) do seem to be good because they make the threads progress towards termination. How do we prevent such lock acquirements from delaying others, which may cause circular delay? 2.2.3 Ad-Hoc Synchronization and Dynamic Locks One may argue that the circularity can be avoided by simply enforcing certain orders of lock acquirements, which has been a standard way to avoid “deadlock cycles” (note this is a safety property, as we explained in Sec. 1). Although lock orders can help liveness reasoning, it has many limitations in practice. First, the approach cannot apply for ad-hoc synchronization. For instance, there are no locks in the following deadlocking program. x := 1; while (y = 1) skip; x := 0; y := 1; while (x = 1) skip; y := 0; Moreover, sometimes we need to look into the lock implementation to prove starvation-freedom. For instance, the dfInc in Fig. 1(b) using a TAS lock is deadlock-free but not starvation-free. If we replace the TAS lock with a ticket lock, as in sfInc in Fig. 1(c), the counter becomes starvation-free. Again, there are actually no locks in the programs if we have to work at a low abstraction level to look into lock implementations. Second, it can be difficult to enforce the ordering for fine-grained algorithms on dynamic data structures (e.g., lock-coupling list). Since the data structure is changing dynamically, the set of locks associated with the nodes is dynamic too. To allow a thread to determine dynamically the order of locks, we have to ensure its view of ordering is consistent with all the other threads in the system, a challenge for thread-modular verification. Although dynamic locks are supported in some previous work treating deadlockfreedom as a safety property (e.g., [4, 19]), it is unclear how to apply the techniques for general progress reasoning, with possible combination of locks, ad-hoc synchronization and rollbacks. 2.3 Our Approaches To address these problems, our logic enforces the following principles to permit restricted forms of blocking and delay, but prevent circular reasoning and non-termination. First, if a thread is blocked, the events it waits for must eventually occur. To avoid circular reasoning, we find “definite actions” of each thread, which under fair scheduling will definitely happen once enabled, regardless of the interference from the environment. Then each blocked thread needs to show it waits for only a finite number of definite actions from the environment threads. They form an acyclic queue, and there is always at least one of them enabled. This is what we call “definite progress”, which is crucial for proving starvation-freedom. Second, actions of a thread can delay others only if they are making the executing object method to move towards termination. Each object method can only execute a finite number of such delaying actions to avoid indefinite delay. This is enforced by 387
assigning a finite number of tokens to each method.A token must Definite actions. First,we introduce a novel notion called a be paid to execute a delaying action. "definite action"D.which models a thread action that,once enabled. Third,we divide actions of a thread into normal ones (which do must be eventually finished regardless of what the environment not delay others)and delaying ones,and further stratify delaying does.In detail,D is in the form of PaQd.It requires in every actions into multiple levels.When a thread is delayed by a level-k execution that Qa should eventually hold if Pa holds,and Pa should action from its environment,it is allowed to execute not only more be preserved (by both the current thread and the environment)until normal actions,but also more delaying actions at lower levels Qa holds.For sfInc,the definite action Pa d of a thread can Allowing one delaying action to trigger more steps of other delaying be defined as follows.Pa says that owner equals the thread's ticket actions is necessary for verifying algorithms with nested locks and number i,and Qa says that owner has been increased to i+1.Tha rollbacks,such as the optimistic lists in Fig.2(a).The stratification is,a thread definitely releases the lock when acquiring it.Of course prevents the circular delay in the example of Fig.2(b) we have to ensure in our logic that D is indeed definite.We will Fourth,our delaying actions and definite actions are all seman explain in detail the logic rule that enforces it in Sec.4.2.2 tically specified as part of object specifications,therefore we can support ad-hoc synchronizaiton and do not rely on built-in synchro- Definite progress.Then we use definite actions to prove termina- nization primitives to enforce ordering of events.Moreover,since tion of loops.We need to first find an assertion specifying the the specifications are all parametrized over states,they are expres- condition when the thread t can progress on its own,i.e.,it is not sive enough to support dynamic locks as in lock-coupling lists.Also blocked.Then we enforce the following principles: our "definite progress"condition allows each blocked thread to de- cide locally and dynamically a queue of definite actions it waits for. 1.If is continuously true,we need to prove the loop terminates There is no need to enforce a global ordering of blocking depen- following the idea of the TERM rule; dencies agreed by every thread.This also provides thread-modular 2.If O is false,the following must always be true: support of dynamic locks. Below we give more details about some of these key ideas. (a)There is a finite queue of definite actions of other threads that the thread t is waiting for,among which there is at least one (from a certain thread t')enabled.The length of the 2.3.1 Using Tokens to Prevent Infinite Loops queue is E. The key to ensuring termination is to require each loop to terminate (b)E decreases whenever one of these definite actions is fin- Earlier work [16,24]requires each round of the loop to consume ished; resources called tokens.The rule for loops is in the following form: (c)The expression E is never increased by any threads (no PAB→P*OR,GF{PCP matter whether O holds or not);and it is non-negative. (TERM) R,GH(P)while (B)CIPA-B) We can see E serves as a well-founded metric.By induction over E Here◇represents one token,and“*”is the normal separating we know eventually O holds,which implies the termination of the ohe one eken tain t ane loop by the above condition 1. These conditions are enforced in our new inference rule for token needs to be consumed to start this new round of loop.Since loops,which extends the TERM rule (in Sec.2.3.1)and is presented the number of tokens strictly decreases,we know the loop must in Sec.4.2.2.The condition 2 shows the use of definite actions terminate when the thread has no token. in our reasoning about progress.We call it the"definite progress" We use this simple idea to enforce termination of loops,and condition. extend it to handle blocking and delay in a concurrent setting The reasoning above implicitly makes use of the fairness as- sumption.The fair scheduling ensures that the environment thread t 2.3.2 Definite Actions and Definite Progress mentioned in the condition 2(a)is scheduled infinitely often.there- fore its definite action will definitely happen.By conditions 2(b) Our approach to cut the blocking-caused circular dependency is and 2(c)we know E will become smaller.In this way E keeps inspired by the implementation of ticket locks,which is used to decreasing until Q holds eventually. implement the starvation-free counter sfInc in Fig.1(c).It uses For sfInc,O is defined as (i owner)and the metric E is the shared variables owner and next to guarantee the first-come- first-served property of the lock.Initially owner equals next.To (i-owner).Whenever an environment thread t'finishes a definite action by releasing the lock,it increases owner,so E decreases. acquire the lock,a thread atomically increments next and reads When E is decreased to 0,the current thread is unblocked.Its loop its old value to a variable i (line 2).The value of i becomes the terminates and it succeeds in acquiring the lock thread's ticket.The thread waits until owner equals its ticket value i (line 3).Finally the lock is released by incrementing owner (line 5) such that the next waiting thread (the thread with ticket i+1.if 2.3.3 Allowing Queue Jumps for Deadlock-Free Objects there is one)can now enter the critical section. The method dfInc in Fig.1(b)implements a deadlock-free counter We can see sfInc is not concerned with the circular dependency using the TAS lock.If the current thread t waits for the lock,we problem.Intuitively the ticket lock algorithm ensures that the threads know the queue of definite actions it waits for is of length one requesting the lock always constitute a queue ti,t2,...,tn.The because it is possible for the thread to acquire the lock immediately head thread,t1.gets the ticket number which equals owner and after the lock is released.However,as we explain in Sec.2.2.1 can immediately acquire the lock.Once it releases the lock (by another thread t'may preempt t and do a successful cas.Then increasing owner),ti is dequeued.Moreover,for any thread t in thread t is blocked and waits for a queue of definite actions again this queue,the number of threads ahead of t never increases.Thus t This delay caused by thread t'can be viewed as a queue jump in our must eventually become the head of the queue and acquire the lock definite-progress-based reasoning.Actually dfInc cannot satisfy Here the dependencies among progress of the threads are in concert the definite progress requirement because we cannot find a strictly with the queue decreasing queue size E.It is not starvation-free. Following this queue principle,we explicitly specify the queue However,the queue jump here is acceptable when verifying of progress dependencies in our logic to avoid circular reasoning deadlock-freedom.This is because thread t'delays t only if t 388
assigning a finite number of tokens to each method. A token must be paid to execute a delaying action. Third, we divide actions of a thread into normal ones (which do not delay others) and delaying ones, and further stratify delaying actions into multiple levels. When a thread is delayed by a level-k action from its environment, it is allowed to execute not only more normal actions, but also more delaying actions at lower levels. Allowing one delaying action to trigger more steps of other delaying actions is necessary for verifying algorithms with nested locks and rollbacks, such as the optimistic lists in Fig. 2(a). The stratification prevents the circular delay in the example of Fig. 2(b). Fourth, our delaying actions and definite actions are all semantically specified as part of object specifications, therefore we can support ad-hoc synchronizaiton and do not rely on built-in synchronization primitives to enforce ordering of events. Moreover, since the specifications are all parametrized over states, they are expressive enough to support dynamic locks as in lock-coupling lists. Also our “definite progress” condition allows each blocked thread to decide locally and dynamically a queue of definite actions it waits for. There is no need to enforce a global ordering of blocking dependencies agreed by every thread. This also provides thread-modular support of dynamic locks. Below we give more details about some of these key ideas. 2.3.1 Using Tokens to Prevent Infinite Loops The key to ensuring termination is to require each loop to terminate. Earlier work [16, 24] requires each round of the loop to consume resources called tokens. The rule for loops is in the following form: P ∧ B ⇒ P 0 ∗ ♦ R, G ` {P 0}C{P} R, G ` {P}while (B) C{P ∧ ¬B} (TERM) Here ♦ represents one token, and “∗” is the normal separating conjunction in separation logic. The premise says the precondition P 0 of the loop body C has one less token than P, showing that one token needs to be consumed to start this new round of loop. Since the number of tokens strictly decreases, we know the loop must terminate when the thread has no token. We use this simple idea to enforce termination of loops, and extend it to handle blocking and delay in a concurrent setting. 2.3.2 Definite Actions and Definite Progress Our approach to cut the blocking-caused circular dependency is inspired by the implementation of ticket locks, which is used to implement the starvation-free counter sfInc in Fig. 1(c). It uses the shared variables owner and next to guarantee the first-comefirst-served property of the lock. Initially owner equals next. To acquire the lock, a thread atomically increments next and reads its old value to a variable i (line 2). The value of i becomes the thread’s ticket. The thread waits until owner equals its ticket value i (line 3). Finally the lock is released by incrementing owner (line 5) such that the next waiting thread (the thread with ticket i + 1, if there is one) can now enter the critical section. We can see sfInc is not concerned with the circular dependency problem. Intuitively the ticket lock algorithm ensures that the threads requesting the lock always constitute a queue t1,t2, . . . ,tn. The head thread, t1, gets the ticket number which equals owner and can immediately acquire the lock. Once it releases the lock (by increasing owner), t1 is dequeued. Moreover, for any thread t in this queue, the number of threads ahead of t never increases. Thus t must eventually become the head of the queue and acquire the lock. Here the dependencies among progress of the threads are in concert with the queue. Following this queue principle, we explicitly specify the queue of progress dependencies in our logic to avoid circular reasoning. Definite actions. First, we introduce a novel notion called a “definite action” D, which models a thread action that, once enabled, must be eventually finished regardless of what the environment does. In detail, D is in the form of Pd ❀ Qd. It requires in every execution that Qd should eventually hold if Pd holds, and Pd should be preserved (by both the current thread and the environment) until Qd holds. For sfInc, the definite action Pd ❀ Qd of a thread can be defined as follows. Pd says that owner equals the thread’s ticket number i, and Qd says that owner has been increased to i+ 1. That is, a thread definitely releases the lock when acquiring it. Of course we have to ensure in our logic that D is indeed definite. We will explain in detail the logic rule that enforces it in Sec. 4.2.2. Definite progress. Then we use definite actions to prove termination of loops. We need to first find an assertion Q specifying the condition when the thread t can progress on its own, i.e., it is not blocked. Then we enforce the following principles: 1. If Q is continuously true, we need to prove the loop terminates following the idea of the TERM rule; 2. If Q is false, the following must always be true: (a) There is a finite queue of definite actions of other threads that the thread t is waiting for, among which there is at least one (from a certain thread t 0 ) enabled. The length of the queue is E. (b) E decreases whenever one of these definite actions is finished; (c) The expression E is never increased by any threads (no matter whether Q holds or not); and it is non-negative. We can see E serves as a well-founded metric. By induction over E we know eventually Q holds, which implies the termination of the loop by the above condition 1. These conditions are enforced in our new inference rule for loops, which extends the TERM rule (in Sec. 2.3.1) and is presented in Sec. 4.2.2. The condition 2 shows the use of definite actions in our reasoning about progress. We call it the “definite progress” condition. The reasoning above implicitly makes use of the fairness assumption. The fair scheduling ensures that the environment thread t 0 mentioned in the condition 2(a) is scheduled infinitely often, therefore its definite action will definitely happen. By conditions 2(b) and 2(c) we know E will become smaller. In this way E keeps decreasing until Q holds eventually. For sfInc, Q is defined as (i = owner) and the metric E is (i − owner). Whenever an environment thread t 0 finishes a definite action by releasing the lock, it increases owner, so E decreases. When E is decreased to 0, the current thread is unblocked. Its loop terminates and it succeeds in acquiring the lock. 2.3.3 Allowing Queue Jumps for Deadlock-Free Objects The method dfInc in Fig. 1(b) implements a deadlock-free counter using the TAS lock. If the current thread t waits for the lock, we know the queue of definite actions it waits for is of length one because it is possible for the thread to acquire the lock immediately after the lock is released. However, as we explain in Sec. 2.2.1, another thread t 0 may preempt t and do a successful cas. Then thread t is blocked and waits for a queue of definite actions again. This delay caused by thread t 0 can be viewed as a queue jump in our definite-progress-based reasoning. Actually dfInc cannot satisfy the definite progress requirement because we cannot find a strictly decreasing queue size E. It is not starvation-free. However, the queue jump here is acceptable when verifying deadlock-freedom. This is because thread t 0 delays t only if t 0 388
1 local b :false,p,ci 1 lock L1; (MName)f∈String (ThrdID)t E Nat while (!b){ 2 local r :L2; (p,c):=find(e); 心 h11e(r1=0)[ (Epr)E:=x|n|E+E| lock p;lock c; unlock Li; (BExp)B ::true false E=E!B... 5 b :validate(p,c); 5 lock L1; (Instr)c:=:=E:=[E][E]:=E I print(E)1... 6 if (!b){ 6 x:=L2; 1 unlock c;unlock p; > () C ::=skip l c lx :f(E)I return E I (C) 8 心 lock L2; C;C if(B)C else C while(B)C} update(p,c,e); 9 unlock L2; (Prog)W ::skip let II in C...C 10 unlock c;unlock p; 10 unlock L1; (ODecl)Π,T={fh(r1,C1),.,fn(rn,Cn)} (a)optimistic list (b)rollback (S1ore)s,s∈ar⊥nt (Heap)h,h∈Nat一lmt Figure 2.Examples with multiple locks. (Mem)o,2:=(s,h) (PState)S::=(c;0o;...) successfully acquires the lock,which allows it to eventually finish Figure 3.The language syntax and state model. the dfInc method.Thus the system as a whole progresses Nevertheless,as explained in Sec.2.2.2,we have to make sure the queue jump(which is a special form of delay)is a"good"one. levels constitute a tuple (mk,...,m2,m).It is strictly descending We follow the token-transfer ideas [16.24]to support disci- along the dictionary order. plined queue jumps.We explicitly specify in the rely/guarantee The stratified -tokens naturally prohibit the circular delay conditions which steps could delay the progress of other threads problem discussed in Sec.2.2.2 with the object in Fig.2(b).The (jump their queues).To prohibit unlimited queue jumps without deadlock-free optimistic list in Fig.2(a)can now be verified.We making progress,we assign a finite number m of -tokens to an classify its delaying actions into two levels.Actions at level 2(the object method,and require that a thread can do at most m delaying highest level)update the list,which correspond to line 9 in Fig.2(a), actions before the method finishes. and each method can do only one such action.Lock acquirements Whenever a step of thread t'delays the progress of thread t. are classified at level 1,so a thread is allowed to roll back and we require t'to consume one -token.At the same time,thread t re-acquire the locks when its environment updates the list. could increase O-tokens so that it can loop more rounds.Besides we redefine the definite progress condition to allow the metric E 3. Programming Language (about the length of the queue)to be increased when an environment Figure 3 gives the syntax of the language.A program W consists of thread jumps the queue at the cost of a -token. multiple parallel threads sharing an object II.We say the threads are 2.3.4 Allowing Rollbacks for Optimistic Locking the clients of the object.An object declaration II is a mapping from a method name f to a pair of argument and code (method body) The ideas we just explained already support simple deadlock-free The statements C are similar to those in the simple language used objects such as dfInc in Fig.1(b),but they cannot be applied to for separation logic.The command print(E)generates exterally objects with optimistic synchronization,such as optimistic lists [13] observable events,which allows us to observe behaviors of programs. and lazy lists [11]. We assume it is used only in the code of clients.We use (C)to Figure 2(a)shows part of the optimistic list implementation. represent an atomic block in which C is executed atomically Each node of the list is associated with a TAS lock,the same lock To simplify the presentation,we make several simplifications as in Fig.1(b).A thread first traverses the list without acquiring any here.We assume there is only one concurrent object in the system locks (line 3).The traversal find returns two adjacent node pointers Each method of the object takes only one argument,and the method p and c.The thread then locks the two nodes (line 4),and calls body ends with a return E command.Also we assume there is no validate to check if the two nodes are still valid list nodes (line 5) nested method invocation.For the abstract object specification I, If validation succeeds,then the thread performs its updates(adding each method body is an atomic operation(C)(ahead of the return or removing elements)to the list (line 9).Otherwise it releases the command),and executing the code is always safe and never blocks two node locks (line 7)and restarts the traversal. The model of program states S is defined at the bottom of Fig.3 For this object,when the validation fails,a thread will release To ensure that the client code does not touch the object state,in S the locks it has acquired and roll back.Thus the thread may acquire we separate the states accessed by clients (oe)and by the object the locks for an unbounded number of times.Since each lock ()Here S may also contain auxiliary data about the control stacks acquirement will delay other threads requesting the same lock and for method calls.Execution of programs is modeled as a labeled each delaying action should consume one -token,it seems that transition system(W,S)(W',S').Here e is an event (either the thread would need an infinite number of -tokens.which we observable or not)produced by the transition.We give the small-step prohibit in the preceding subsection to ensure deadlock-freedom, transition semantics in TR [22]. even though this list object is indeed deadlock-free. Below we often write X,s and h for the notations at the abstract We generalize the token-transfer ideas to allow rollbacks in order level to distinguish from the concrete-level ones (o.s and h). to verify this kind of optimistic algorithms,but still have to be careful to avoid the circular delay caused by the "bad"rollbacks in 4. Fig.2(b),as we explain in Sec.2.2.2. Program Logic LiLi Our solution is to stratify the delaying actions.Each action is LiLi verifies the linearizability of objects by proving the method now labeled with a level k.The normal actions which cannot delay implementations refine abstract atomic operations.The top level other threads are at the lowest level 0.The -tokens are stratified judgment is in the form of D,R,GP)I<T.(The oBJ rule for accordingly.A thread can roll back and do more actions at level k this judgment is given in Fig.7 and will be explained later.)To verify only when its environment does an action at a higher level k',at the an object II,we give a corresponding object specification I(see cost of a k'-level -token.Note that the -tokens at the highest level Fig.3),which maps method names to atomic commands.In addition are strictly decreasing,which means a thread cannot roll back its we need to provide an object invariant (P)and rely/guarantee actions of the highest level.In fact,the numbers of -tokens at all conditions (R and G)for the refinement reasoning in a concurrent 389
1 local b := false, p, c; 2 while (!b) { 3 (p, c) := find(e); 4 lock p; lock c; 5 b := validate(p, c); 6 if (!b) { 7 unlock c; unlock p; } 8 } 9 update(p, c, e); 10 unlock c; unlock p; (a) optimistic list 1 lock L1; 2 local r := L2; 3 while (r != 0) { 4 unlock L1; 5 lock L1; 6 r := L2; 7 } 8 lock L2; 9 unlock L2; 10 unlock L1; (b) rollback Figure 2. Examples with multiple locks. successfully acquires the lock, which allows it to eventually finish the dfInc method. Thus the system as a whole progresses. Nevertheless, as explained in Sec. 2.2.2, we have to make sure the queue jump (which is a special form of delay) is a “good” one. We follow the token-transfer ideas [16, 24] to support disciplined queue jumps. We explicitly specify in the rely/guarantee conditions which steps could delay the progress of other threads (jump their queues). To prohibit unlimited queue jumps without making progress, we assign a finite number m of -tokens to an object method, and require that a thread can do at most m delaying actions before the method finishes. Whenever a step of thread t 0 delays the progress of thread t, we require t 0 to consume one -token. At the same time, thread t could increase ♦-tokens so that it can loop more rounds. Besides, we redefine the definite progress condition to allow the metric E (about the length of the queue) to be increased when an environment thread jumps the queue at the cost of a -token. 2.3.4 Allowing Rollbacks for Optimistic Locking The ideas we just explained already support simple deadlock-free objects such as dfInc in Fig. 1(b), but they cannot be applied to objects with optimistic synchronization, such as optimistic lists [13] and lazy lists [11]. Figure 2(a) shows part of the optimistic list implementation. Each node of the list is associated with a TAS lock, the same lock as in Fig. 1(b). A thread first traverses the list without acquiring any locks (line 3). The traversal find returns two adjacent node pointers p and c. The thread then locks the two nodes (line 4), and calls validate to check if the two nodes are still valid list nodes (line 5). If validation succeeds, then the thread performs its updates (adding or removing elements) to the list (line 9). Otherwise it releases the two node locks (line 7) and restarts the traversal. For this object, when the validation fails, a thread will release the locks it has acquired and roll back. Thus the thread may acquire the locks for an unbounded number of times. Since each lock acquirement will delay other threads requesting the same lock and each delaying action should consume one -token, it seems that the thread would need an infinite number of -tokens, which we prohibit in the preceding subsection to ensure deadlock-freedom, even though this list object is indeed deadlock-free. We generalize the token-transfer ideas to allow rollbacks in order to verify this kind of optimistic algorithms, but still have to be careful to avoid the circular delay caused by the “bad” rollbacks in Fig. 2(b), as we explain in Sec. 2.2.2. Our solution is to stratify the delaying actions. Each action is now labeled with a level k. The normal actions which cannot delay other threads are at the lowest level 0. The -tokens are stratified accordingly. A thread can roll back and do more actions at level k only when its environment does an action at a higher level k 0 , at the cost of a k 0 -level -token. Note that the -tokens at the highest level are strictly decreasing, which means a thread cannot roll back its actions of the highest level. In fact, the numbers of -tokens at all (MName) f ∈ String (ThrdID) t ∈ Nat (Expr) E ::= x | n | E + E | . . . (BExp) B ::= true | false | E = E | !B | . . . (Instr) c ::= x := E | x := [E] | [E] := E | print(E) | . . . (Stmt) C ::= skip | c | x := f(E) | return E | hCi | C; C | if (B) C else C | while (B){C} (Prog) W ::= skip | let Π in C k. . .kC (ODecl) Π, Γ ::= {f1 ❀ (x1, C1), . . . , fn ❀ (xn, Cn)} (Store) s, s ∈ Var * Int (Heap) h, h ∈ Nat * Int (Mem) σ, Σ ::= (s, h) (PState) S ::= (σc, σo, . . .) Figure 3. The language syntax and state model. levels constitute a tuple (mk, . . . , m2, m1). It is strictly descending along the dictionary order. The stratified -tokens naturally prohibit the circular delay problem discussed in Sec. 2.2.2 with the object in Fig. 2(b) . The deadlock-free optimistic list in Fig. 2(a) can now be verified. We classify its delaying actions into two levels. Actions at level 2 (the highest level) update the list, which correspond to line 9 in Fig. 2(a), and each method can do only one such action. Lock acquirements are classified at level 1, so a thread is allowed to roll back and re-acquire the locks when its environment updates the list. 3. Programming Language Figure 3 gives the syntax of the language. A program W consists of multiple parallel threads sharing an object Π. We say the threads are the clients of the object. An object declaration Π is a mapping from a method name f to a pair of argument and code (method body). The statements C are similar to those in the simple language used for separation logic. The command print(E) generates externally observable events, which allows us to observe behaviors of programs. We assume it is used only in the code of clients. We use hCi to represent an atomic block in which C is executed atomically. To simplify the presentation, we make several simplifications here. We assume there is only one concurrent object in the system. Each method of the object takes only one argument, and the method body ends with a return E command. Also we assume there is no nested method invocation. For the abstract object specification Γ, each method body is an atomic operation hCi (ahead of the return command), and executing the code is always safe and never blocks. The model of program states S is defined at the bottom of Fig. 3. To ensure that the client code does not touch the object state, in S we separate the states accessed by clients (σc) and by the object (σo). Here S may also contain auxiliary data about the control stacks for method calls. Execution of programs is modeled as a labeled transition system (W, S) e 7−→ (W0 , S 0 ). Here e is an event (either observable or not) produced by the transition. We give the small-step transition semantics in TR [22]. Below we often write Σ, s and h for the notations at the abstract level to distinguish from the concrete-level ones (σ, s and h). 4. Program Logic LiLi LiLi verifies the linearizability of objects by proving the method implementations refine abstract atomic operations. The top level judgment is in the form of D, R, G ` {P}ΠΓ. (The OBJ rule for this judgment is given in Fig. 7 and will be explained later.) To verify an object Π, we give a corresponding object specification Γ (see Fig. 3), which maps method names to atomic commands. In addition, we need to provide an object invariant (P) and rely/guarantee conditions (R and G) for the refinement reasoning in a concurrent 389
(RelAssn)P,Q,J:=B|emp|E→E|E→E pP+QI PAQ I PVQ 1... G:=(a,) (a,)y(a',)兽(a世a',∑出) where (s,h)w(s',h')(s,hwh'),if s=s' (RelAct)R.G:=PxkQI [P]I D ILGJo I GAG I GVG 1... (s,h),(s,h)=B iff [B]sws true (DAct) D :=PMQ Vr.D DAD ((s,h),(s,h))F emp iff dom(h)=dom(h)= (FullAssn) P,9 P|arem(C).|◇(E).|◆(Ek,.,E1) I Lplo I p*q I pAg I pVq l... (s,h),(s,)卡E1→E2iffh={IE]w[E2lss} (s,h),(s,h)FE1÷E2ifh={[E1lsWs+IE2lsw】 Figure 4.Syntax of the assertion language. SEP*Q iff361,G2.G=G1出G2 A(61EP)A(62EQ) setting.Here P is a relational assertion that specifies the consistency (a)semantics of relational state assertions P and Q relation between the concrete data representation and the abstract (6,6')EPKkQ iff (6EP)A(6'EQ) value.Similarly,R and G lift regular rely and guarantee conditions to the binary setting,which now specify transitions of states at both (s,G)=[P]if(6'=6)A(6=P) the concrete level and the abstract level.The definite action D is a (b)semantics of relational rely/guarantee assertions R and G special form of state transitions used for progress reasoning.The definitions of P,R,G and D are shown in Sec.4.1. (G,(u,),C)=P iff 6P Note LiLi is a logic for concurrent objects II only.We do not ((u,w),C)E arem(C)iff C=C provide logic rules for clients.See Sec.5 for more discussions. To simplify the presentation in this paper,we describe LiLi based (6,(u,w),C)上◇(E)if3n.(Els.s=n)A(n≤w) on the plain Rely-Guarantee Logic [18].Also,to avoid "variables as (6,(u,w),C)=◆(Ek,E)if(IE6.$,IE1l6.s)≤u resources"[27],we assume program variables are either thread-local ((u,w),C)EIplo i f3w.(6,(u,w),C)卡p or read-only.The full version of LiLi (see TR [22])extends the more 6卡pl iff 3u,w,C.(,(u,w),C)Fp advanced Rely-Guarantee-based logic LRG [7]to support dynamic allocation and ownership transfer.It also drops the assumption about Cwc些CifC=skip CifC”=skip program variables. (6,(u,w),C)u(G',(u,r),C)兽(su6',(u+',e+'),CuC) 4.1 Assertions (c)semantics of full assertions p and g We define assertions in Fig.4.The relational state assertions P and O specify the relationship between the concrete state o and the Figure 5.Semantics of assertions abstract state X.Here we use s and h for the store and the heap at the abstract level (see Fig.3).For simplicity,we assume the program variables used in the concrete code are different from those in the (u,w)records the numbers of various tokens and It serves abstract code (e.g.,we use x and X at the concrete and abstract levels as a well-founded metric for our progress reasoning.Informally w respectively).We use the relational state 6 to represent the pair of specifies the upper bound of the number of loop rounds that the states (o,>)as defined in Fig.5. current thread can execute if it is neither blocked nor delayed by its Figure 5(a)defines semantics of state assertions.The boolean environment.The assertion (E)says the number w of O-tokens is expression B holds if it evaluates to true at the combined store no less than E.Therefore ()always holds,and (n+1)implies of s and s.emp describes empty heaps.The assertion EE2 (n)for any n.We postpone the explanation of u and the assertion specifies a singleton heap at the concrete level with the value of (E,...,E1)to Sec.4.3.Below we use as the shorthand for the expression E2 stored at the location E.Its counterpart for (1).We use po to ignore the descriptions in p about the number an abstract level heap is represented as E E2.Semantics of of -tokens.p converts p back to a relational state assertion. separating conjunction P*Q is similar as in separation logic,except Separating conjunction p g has the standard meaning as in that it is now lifted to relational states 6.The disjoint union of two separation logic,which says p and g hold over disjoint parts of relational states is defined at the top of the figure.Semantics of the ((u,w),C)respectively (the formal definition elided here).The assertion p will be defined latter (see Fig.5(c)). disjoint union is defined in Fig.5(c).The disjoint union of the Rely/guarantee assertions R and G specify the transitions over numbers of tokens is the sum of them.The disjoint union of C the relational states 6.Their semantics is defined in Fig.5(b).The and C2 is defined only if at least one of them is skip.Therefore we action P says that the initial relational states satisfy P and the know the following holds (for any P and C): resulting states satisfy Q.We can ignore the index k for now,which is used to stratify actions that may delay the progress of other threads (P A arem(C)A)*(A emp)(P A arem(C)A(2)) and will be explained in Sec.4.3.[P]specifies identity transitions with the initial states satisfying P.Semantics of Go is defined Definite actions.Fig.4 also defines definite actions D,which in Sec.4.3 too (see Fig.10).Below we use P Q as a shorthand can be treated as special forms of rely/guarantee conditions.Their for P o Q.We also use ld for [true],which represents arbitrary semantics is given in Fig.6(a).PO specifies the transitions identity transitions. where the final states satisfy Q if the initial states satisfy P.It is We further instrument the relational state assertions with the different from P x O in that PO does not restrict the transitions specifications of the abstract level code and various tokens.The if initially P does not hold.Consider the following example D. resulting full assertions p and g are defined in Fig.4,whose semantics is given in Fig.5(c).The assertion p is interpreted Dx兰n.(x→n)A(n>0)》(x→n+1) over ((u,w),C).C is the abstract-level code that remains to D.describes the state transitions which increment x if x is positive be refined.It is specified by the assertion arem(C).Since our initially.It is satisfied by any transitions where initially x is not logic verifies linearizability of objects,C is always in the form positive.The conjunction DiA D2 is useful for enumerating definite of atomic commands(C)(ahead of return commands).The pair actions.For instance,when the program uses two locks L1 and L2, 390
(RelAssn) P, Q, J ::= B | emp | E 7→ E | E Z⇒ E | TpU | P ∗ Q | P ∧ Q | P ∨ Q | . . . (RelAct) R, G ::= P nk Q | [P] | D | bGc0 | G ∧ G | G ∨ G | . . . (DAct) D ::= P ❀ Q | ∀x. D | D ∧ D (FullAssn) p, q ::= P | arem(C) | ♦(E) | (Ek, . . . , E1) | bpc♦ | p ∗ q | p ∧ q | p ∨ q | . . . Figure 4. Syntax of the assertion language. setting. Here P is a relational assertion that specifies the consistency relation between the concrete data representation and the abstract value. Similarly, R and G lift regular rely and guarantee conditions to the binary setting, which now specify transitions of states at both the concrete level and the abstract level. The definite action D is a special form of state transitions used for progress reasoning. The definitions of P, R, G and D are shown in Sec. 4.1. Note LiLi is a logic for concurrent objects Π only. We do not provide logic rules for clients. See Sec. 5 for more discussions. To simplify the presentation in this paper, we describe LiLi based on the plain Rely-Guarantee Logic [18]. Also, to avoid “variables as resources” [27], we assume program variables are either thread-local or read-only. The full version of LiLi (see TR [22]) extends the more advanced Rely-Guarantee-based logic LRG [7] to support dynamic allocation and ownership transfer. It also drops the assumption about program variables. 4.1 Assertions We define assertions in Fig. 4. The relational state assertions P and Q specify the relationship between the concrete state σ and the abstract state Σ. Here we use s and h for the store and the heap at the abstract level (see Fig. 3). For simplicity, we assume the program variables used in the concrete code are different from those in the abstract code (e.g., we use x and X at the concrete and abstract levels respectively). We use the relational state S to represent the pair of states (σ, Σ), as defined in Fig. 5. Figure 5(a) defines semantics of state assertions. The boolean expression B holds if it evaluates to true at the combined store of s and s. emp describes empty heaps. The assertion E1 7→ E2 specifies a singleton heap at the concrete level with the value of the expression E2 stored at the location E1. Its counterpart for an abstract level heap is represented as E1 Z⇒ E2. Semantics of separating conjunction P ∗Q is similar as in separation logic, except that it is now lifted to relational states S. The disjoint union of two relational states is defined at the top of the figure. Semantics of the assertion TpU will be defined latter (see Fig. 5(c)). Rely/guarantee assertions R and G specify the transitions over the relational states S. Their semantics is defined in Fig. 5(b). The action P nk Q says that the initial relational states satisfy P and the resulting states satisfy Q. We can ignore the index k for now, which is used to stratify actions that may delay the progress of other threads and will be explained in Sec. 4.3. [P] specifies identity transitions with the initial states satisfying P. Semantics of bGc0 is defined in Sec. 4.3 too (see Fig. 10). Below we use P n Q as a shorthand for P n0 Q. We also use Id for [true], which represents arbitrary identity transitions. We further instrument the relational state assertions with the specifications of the abstract level code and various tokens. The resulting full assertions p and q are defined in Fig. 4, whose semantics is given in Fig. 5(c). The assertion p is interpreted over (S,(u, w), C). C is the abstract-level code that remains to be refined. It is specified by the assertion arem(C). Since our logic verifies linearizability of objects, C is always in the form of atomic commands hC 0 i (ahead of return commands). The pair S ::= (σ, Σ) (σ, Σ) ] (σ 0 , Σ0 ) def = (σ ] σ 0 , Σ ] Σ0 ) where (s, h)](s 0 , h0 ) def = (s, h]h 0 ) , if s=s 0 ((s, h), (s, h)) |= B iff JBKs]s = true ((s, h), (s, h)) |= emp iff dom(h) = dom(h) = ∅ ((s, h), (s, h)) |= E1 7→ E2 iff h = {JE1Ks]s ❀ JE2Ks]s} ((s, h), (s, h)) |= E1 Z⇒ E2 iff h = {JE1Ks]s ❀ JE2Ks]s} S |= P ∗ Q iff ∃S1, S2. S = S1 ] S2 ∧(S1 |= P) ∧ (S2 |= Q) (a) semantics of relational state assertions P and Q (S, S0 ) |= P nk Q iff (S |= P) ∧ (S0 |= Q) (S, S0 ) |= [P] iff (S0 = S) ∧ (S |= P) (b) semantics of relational rely/guarantee assertions R and G (S, (u, w), C) |= P iff S |= P (S, (u, w), C) |= arem(C0 ) iff C = C0 (S, (u, w), C) |= ♦(E) iff ∃n. (JEKS.s = n) ∧ (n ≤ w) (S, (u, w), C) |= (Ek, . . . , E1) iff (JEkKS.s, . . . , JE1KS.s) ≤ u (S, (u, w), C) |= bpc♦ iff ∃w0 . (S, (u, w0 ), C) |= p S |= TpU iff ∃u, w, C. (S, (u, w), C) |= p C ] C0 def = C0 if C = skip C if C0 = skip (S, (u, w), C)](S0 , (u 0 , w0 ), C0 ) def = (S]S0 , (u+u 0 , w+w0 ), C]C0 ) (c) semantics of full assertions p and q Figure 5. Semantics of assertions. (u, w) records the numbers of various tokens and ♦. It serves as a well-founded metric for our progress reasoning. Informally w specifies the upper bound of the number of loop rounds that the current thread can execute if it is neither blocked nor delayed by its environment. The assertion ♦(E) says the number w of ♦-tokens is no less than E. Therefore ♦(0) always holds, and ♦(n + 1) implies ♦(n) for any n. We postpone the explanation of u and the assertion (Ek, . . . , E1) to Sec. 4.3. Below we use ♦ as the shorthand for ♦(1). We use bpc♦ to ignore the descriptions in p about the number of ♦-tokens. TpU converts p back to a relational state assertion. Separating conjunction p ∗ q has the standard meaning as in separation logic, which says p and q hold over disjoint parts of (S,(u, w), C) respectively (the formal definition elided here). The disjoint union is defined in Fig. 5(c). The disjoint union of the numbers of tokens is the sum of them. The disjoint union of C1 and C2 is defined only if at least one of them is skip. Therefore we know the following holds (for any P and C): (P ∧ arem(C) ∧ ♦) ∗ (♦ ∧ emp) ⇔ (P ∧ arem(C) ∧ ♦(2)) Definite actions. Fig. 4 also defines definite actions D, which can be treated as special forms of rely/guarantee conditions. Their semantics is given in Fig. 6(a). P ❀ Q specifies the transitions where the final states satisfy Q if the initial states satisfy P. It is different from P nQ in that P ❀ Q does not restrict the transitions if initially P does not hold. Consider the following example Dx. Dx def = ∀n. ((x 7→ n) ∧ (n > 0)) ❀ (x 7→ n + 1) Dx describes the state transitions which increment x if x is positive initially. It is satisfied by any transitions where initially x is not positive. The conjunction D1∧D2 is useful for enumerating definite actions. For instance, when the program uses two locks L1 and L2, 390
(6,6')EPQ iff (6EP)(6Q) The last three premises of the OBJ rule check the well-formedness (6.6')EVr.D iff Vn.(Sfnl.s'f~n))D of the specifications.The first one says the guarantee of one (6,6')ED1A D2 iff ((6,6')FD1)A((6,6')D2) thread must imply the rely of all others,a standard requirement in rely/guarantee reasoning.In Fig.8 we give a simplified definition of (a)semantics of D wffAct used in the second premise.Its complete definition is given in Fig.11,which will be explained later when we introduce stratified Enabled(P,Q)些p actions and -tokens.wffAct(R,D)says once a definite action D Enabled(Vr.D) 些3x.Enabled(D) of a thread t is enabled it cannot be disabled by an environment step Enabled(D1 A D2) Enabled(D1)V Enabled(D2) in R.Also such an environment step either fulfills a definite action D.,of some thread t'different from t,or preserves Enabled(D.) (D)些 DA(Enabled(D)x true) too.Together with the previous premise GR,this condition [D]e implies Vt'.G D VD.Therefore,once D:is enabled,the Enabled(D)Enabled(D) only way to disable it is to let the thread t finish the action.As an D'≤Diff(Enabled(D→Enabled(D)A(D→D') example,consider the following D: (b)useful syntactic sugars D些(亿→t)(L→0) Figure 6.Semantics of definite actions It says that whenever a thread t acquires the lock L,it will finally release the lock.Then,wffAct(R,D)requires that when t acquires L,every step in the system either keeps L unchanged or releases L the definite action D of the whole program is usually in the form of In particular,Re keeps L unchanged,that is,the environment cannot D1 A D2,where Di and D2 specify L1 and L2 respectively. update the lock when Lt. We define some useful syntactic sugars in Fig.6(b).The state The last premise (P-Enabled(D))says there cannot be assertion Enabled(D)takes the guard condition of D.We use (D) enabled but unfinished definite actions when the method terminates to represent the state transitions of D when it is enabled at the initial and the object invariant P is true. state.Intuitively (D)gives us the corresponding""actions.For The judgment D,R,Gp A arem(C)}Cfq A arem(skip)} instance,(PQ)is equivalent to P Q.For the example D establishes a simulation relation between C and C',which ensures defined above,(D)is equivalent to the following: the preservation of termination when the environment guarantees the definite action D.It also ensures the execution of C guarantees D 3m.(x→n)A(n>0)×(x→n+1) too.We explain the key rules for the judgment below.The complete In addition,we define the syntactic sugar D as a definite action set of rules are presented in TR [22]. describing the preservation of Enabled(D).For the example D above,[D]represents the following definite action: 4.2.2 The WHL Rule for Loops (白n.(x→n)A(n>0)+(3n.(x+n)A(n>0) The WHL rule,shown in Fig.7,is the most important rule of the logic.It establishes both of the following properties of the loop: It specifies the transitions which keep x positive if it is positive initially.The notation D'D will be explained later in Sec.4.2.2 (1)it cannot loop forever with D continuously enabled; Since D is a special rely/guarantee assertion,the semantics of (2)it cannot loop forever unless the current thread is waiting for D D'follows the standard meaning of R=R[7](or see some definite actions of its environment. the definition in Fig.10). The former guarantees that a definite action of the current thread Thread IDs as implicit assertion parameters. All the assertions will definitely happen once it is enabled.The latter is crucial to in our logic,including state assertions,rely/guarantee conditions establish the starvation-freedom. and definite actions,are implicitly parametrized over thread IDs Why definite actions are definite.The WHL verifies the loop Although our logic does modular reasoning about the object code without any knowledge about clients,it is useful for assertions to body with a precondition p',which can be derived from the loop refer to thread IDs.For instance,we may use L>t to represent invariant p if B holds.Moreover,we require each iteration to that the lock L is acquired by the thread t.We use P to represent the consume a O-token if Enabled(D)holds at the beginning,as shown instantiation of the thread ID parameter in P with t,which means by the second premise (ignore the assertion O for now).Since each P holds on thread t.Then P alone can also be understood as Vt.P thread has only a finite number of O-tokens,the loop must terminate and P=Q can be viewed as Vt.PQt.The same convention if Enabled(D)is continuously true. applies to rely/guarantee conditions and definite actions. However,the last premise of the oBJ rule says Enabled(D) cannot be true when the method terminates.Therefore,Enabled(D) 4.2 Verifying Starvation-Freedom with Definite Actions cannot be continuously true.Also recall the other two side conditions (wffAct(R,D)and GR)of the oBJ rule guarantee that,once Figure 7 presents the inference rules of LiLi.We explain the logic in Enabled(D)holds,the only way to make it false is to let the current two steps.In this subsection we explain the use of definite actions thread finish the action. to reason about starvation-freedom.Then we explain the delay Putting all these together,we know D will be finished once it is mechanism for deadlock-freedom in Sec.4.3. enabled,even with the interference R. 4.2.1 The OBJ Rule Starvation-freedom.To establish starvation-freedom.we need to The oBJ rule requires that each method in II refine its atomic find a condition Q saying the current thread is not blocked by others specification in I.Starting from the initial concrete and abstract Then the second premise requires each iteration to consume a O- object states related by P,and with the equivalent method arguments token if Q holds at the beginning.Since the number of tokens is c and y at the concrete and the abstract levels,the method body C finite,the loop must terminate if Q always holds. must fulfill the abstract atomic operation C.We can temporarily If is false,the current thread is blocked by others,and it does ignore the assertion(E&,...,E1)for deadlock-freedom not need to consume -tokens.Then the premise(R,G:D'Q) 391
(S, S0 ) |= P ❀ Q iff (S |= P) =⇒ (S0 |= Q) (S, S0 ) |= ∀x. D iff ∀n. (S{x ❀ n}, S0{x ❀ n}) |= D (S, S0 ) |= D1 ∧ D2 iff ((S, S0 ) |= D1) ∧ ((S, S0 ) |= D2) (a) semantics of D Enabled(P ❀ Q) def = P Enabled(∀x. D) def = ∃x. Enabled(D) Enabled(D1 ∧ D2) def = Enabled(D1) ∨ Enabled(D2) hDi def = D ∧ (Enabled(D) n true) [D] def = Enabled(D) ❀ Enabled(D) D0 6 D iff (Enabled(D0 ) ⇒ Enabled(D)) ∧ (D ⇒ D0 ) (b) useful syntactic sugars Figure 6. Semantics of definite actions. the definite action D of the whole program is usually in the form of D1 ∧ D2, where D1 and D2 specify L1 and L2 respectively. We define some useful syntactic sugars in Fig. 6(b). The state assertion Enabled(D) takes the guard condition of D. We use hDi to represent the state transitions of D when it is enabled at the initial state. Intuitively hDi gives us the corresponding “n” actions. For instance, hP ❀ Qi is equivalent to P n Q. For the example Dx defined above, hDxi is equivalent to the following: ∃n. ((x 7→ n) ∧ (n > 0)) n (x 7→ n + 1) In addition, we define the syntactic sugar [D] as a definite action describing the preservation of Enabled(D). For the example Dx above, [Dx] represents the following definite action: (∃n. (x 7→ n) ∧ (n > 0)) ❀ (∃n. (x 7→ n) ∧ (n > 0)) It specifies the transitions which keep x positive if it is positive initially. The notation D 0 6 D will be explained later in Sec. 4.2.2. Since D is a special rely/guarantee assertion, the semantics of D ⇒ D0 follows the standard meaning of R ⇒ R 0 [7] (or see the definition in Fig. 10). Thread IDs as implicit assertion parameters. All the assertions in our logic, including state assertions, rely/guarantee conditions and definite actions, are implicitly parametrized over thread IDs. Although our logic does modular reasoning about the object code without any knowledge about clients, it is useful for assertions to refer to thread IDs. For instance, we may use L 7→ t to represent that the lock L is acquired by the thread t. We use Pt to represent the instantiation of the thread ID parameter in P with t, which means P holds on thread t. Then P alone can also be understood as ∀t.Pt, and P ⇒ Q can be viewed as ∀t.Pt ⇒ Qt. The same convention applies to rely/guarantee conditions and definite actions. 4.2 Verifying Starvation-Freedom with Definite Actions Figure 7 presents the inference rules of LiLi. We explain the logic in two steps. In this subsection we explain the use of definite actions to reason about starvation-freedom. Then we explain the delay mechanism for deadlock-freedom in Sec. 4.3. 4.2.1 The OBJ Rule The OBJ rule requires that each method in Π refine its atomic specification in Γ. Starting from the initial concrete and abstract object states related by P, and with the equivalent method arguments x and y at the concrete and the abstract levels, the method body C must fulfill the abstract atomic operation C 0 . We can temporarily ignore the assertion (Ek, . . . , E1) for deadlock-freedom. The last three premises of the OBJ rule check the well-formedness of the specifications. The first one says the guarantee of one thread must imply the rely of all others, a standard requirement in rely/guarantee reasoning. In Fig. 8 we give a simplified definition of wffAct used in the second premise. Its complete definition is given in Fig. 11, which will be explained later when we introduce stratified actions and -tokens. wffAct(R, D) says once a definite action Dt of a thread t is enabled it cannot be disabled by an environment step in Rt. Also such an environment step either fulfills a definite action Dt 0 of some thread t 0 different from t, or preserves Enabled(Dt 0 ) too. Together with the previous premise Gt 0 ⇒ Rt, this condition implies ∀t 0 . Gt 0 ⇒ [Dt 0 ] ∨ Dt 0 . Therefore, once Dt is enabled, the only way to disable it is to let the thread t finish the action. As an example, consider the following Dt: Dt def = (L 7→ t) ❀ (L 7→ 0) It says that whenever a thread t acquires the lock L, it will finally release the lock. Then, wffAct(R, D) requires that when t acquires L, every step in the system either keeps L unchanged or releases L. In particular, Rt keeps L unchanged, that is, the environment cannot update the lock when L 7→ t. The last premise (P ⇒ ¬Enabled(D)) says there cannot be enabled but unfinished definite actions when the method terminates and the object invariant P is true. The judgment D, R, G ` {p ∧ arem(C 0 )}C{q ∧ arem(skip)} establishes a simulation relation between C and C 0 , which ensures the preservation of termination when the environment guarantees the definite action D. It also ensures the execution of C guarantees D too. We explain the key rules for the judgment below. The complete set of rules are presented in TR [22]. 4.2.2 The WHL Rule for Loops The WHL rule, shown in Fig. 7, is the most important rule of the logic. It establishes both of the following properties of the loop: (1) it cannot loop forever with D continuously enabled; (2) it cannot loop forever unless the current thread is waiting for some definite actions of its environment. The former guarantees that a definite action of the current thread will definitely happen once it is enabled. The latter is crucial to establish the starvation-freedom. Why definite actions are definite. The WHL verifies the loop body with a precondition p 0 , which can be derived from the loop invariant p if B holds. Moreover, we require each iteration to consume a ♦-token if Enabled(D) holds at the beginning, as shown by the second premise (ignore the assertion Q for now). Since each thread has only a finite number of ♦-tokens, the loop must terminate if Enabled(D) is continuously true. However, the last premise of the OBJ rule says Enabled(D) cannot be true when the method terminates. Therefore, Enabled(D) cannot be continuously true. Also recall the other two side conditions (wffAct(R, D) and Gt ⇒ Rt 0 ) of the OBJ rule guarantee that, once Enabled(D) holds, the only way to make it false is to let the current thread finish the action. Putting all these together, we know D will be finished once it is enabled, even with the interference R. Starvation-freedom. To establish starvation-freedom, we need to find a condition Q saying the current thread is not blocked by others. Then the second premise requires each iteration to consume a ♦- token if Q holds at the beginning. Since the number of tokens is finite, the loop must terminate if Q always holds. If Q is false, the current thread is blocked by others, and it does not need to consume ♦-tokens. Then the premise (R, G: D 0 f −→Q) 391
for all f∈dom(Π):Π(f)=(x,C)(f)=(g,C) dom(Π①=dom(T) D,R,G(PA (=y)A arem(C)A(Ek,...E1))C(P A arem(skip)} t,t.t≠t→Gt台R wffAct(R,D) P→-Enabled(D) (OBJ) D,R,GF{P}ΠT 卫AB→p pA BA (Enabled(D)VQ)p(A emp)D.R.GHp')CIp p→J Sta(J,RVGD≤D wffAct((R,D)J÷(R,G:DQ) D,R,GHIP)Cig -(WHL) (HIDE-◇) D,R,GF (p}while(B)CHpA-B] D,R,GH(Lplo}C(lglo} FpC[q]q台kq (P×kLq)→G D,Id,G(p)(C){q}Sta({p.9},R) (ATOM) (ATOM-R) D,Id,GH[p)(C)g} D,R,GH(p)(C)a} Figure 7.Inference rules. requires the thread be waiting for its environment to perform a finite wffAct(R,D)ifft.R→[DeA(t≠t.D]VDe) number of definite actions. Sta(p,R)iff ,,u,w,C. Definition 1 (Definite Progress).(R,G:DQ)iff the (6,(uw),C)=p)A(6,6)F)→(6',(,w),C)Fp following hold for all t: p台9ift,o,∑,u,w,C,∑e ((o,),(u,w),C)=p)A(E⊥F)→3C, (1)either =Qt ((C,EWEF)(C','WF))A((a,>),(u,w),C)q or there exists t'such that t't and Enabled(D); (2)for any t't and 6',if (,',0)RA(D).then Figure 8.Auxiliary defs.used in logic rules (simplified version). f(G)<f(G): (3)for any 6',if(6,6,0)ERVGt,then f()<f(). sequential separation logic.The corresponding rules are standard Here f is a function that maps the relational states to some metrics and elided here.We use pg for the zero or multiple-step over which there is a well-founded order < executions from the abstract code specified by p to the code specified Ignoring the index 0 above,the definition says either O holds by g,which is defined in Fig.8.Then,the ATOM rule allows us to over or the definite action D of some environment thread t' execute zero-or-more steps of the abstract code with the execution is enabled.Also we require the metric f()to decrease when of C.as long as the overall transition (including the abstract steps a definite action is performed.Besides,the metric should never and the concrete steps)satisfies the relational guarantee G.We can increase at any step of the execution. lift C's total correctness to the concurrent setting as long as the To ensure that the metric f decreases regardless of the time when environment consists of identity transitions only.To allow a weaker the environment's definite actions take place,the definite progress R,we can apply the ATOM-R rule later,which requires that the pre- should always hold.This is enforced by finding a stronger assertion and post-conditions be stable with respect to R. J such that p=J and Sta(,RVG)hold,that is,J is an invariant 4.2.4 Example:Ticket Locks that holds at every program point of the loop.If(R,G:DQ) We prove the starvation-freedom of the ticket lock implementation happens to satisfy the two premises,we can use(R,G:DQ) in Fig.9 using our logic rules.We have informally discussed in directly as,but in practice it could be easier to prove the stability Sec.2 the verification of the counter using a ticket lock(sfInc in requirement by finding a stronger/.The definition of stability Fig.1(c)).To simplify the presentation,here we erase the increment Sta(p,R)is given in Fig.8. in the critical section and focus on the progress property of the code Notice in (R,G:D'Q)we can use D'instead of D to in Fig.9.With an empty critical section,the code functions just as simplify the proof,as long as D'D and wffAct(R,D')are skip,so Fig.9 proves it is linearizable with respect to skip.The satisfied.The premise D'D.defined in Fig.6.says D'specifies proofs of sfInc (including its starvation-freedom and linearizability a subset of definite actions in D.For instance,if D consists of with respect to the atomic INC in Fig.1(a))are given in TR [22]. multiple definite actions and is in the form of DA...ADD' To help specify the queue of the threads requesting the lock,we may contain only a subset of these D(1 k n).The way to introduce an auxiliary array ticket.As shown in Fig.9,each array exclude in D'irrelevant definite actions can simplify the proof of cell ticketi records the ID of the unique thread which gets the the condition(2)of definite progress(see Definition 1). ticket number i.Here we use cid for the current thread ID. Given the definite progress condition,we know o will be We then define some predicates to describe the lock status. eventually true because each definite action will definitely happen. lock(tl,n,n2)contains the auxiliary ticket array in addition to Then the loop starts to consume O-tokens and needs to finally owner and next,where ownerni and nextn2,and tl is terminate,following our argument at the beginning. the list of the threads recorded in ticket[ni],...,ticket[n2-1]. We also use locked(tl,n,n2)for the case when tl is not empty. 4.2.3 More Inference Rules That is,the lock is acquired by the first thread in tl,while the The HIDE-O rule allows us to discard the tokens (by using) other threads in tl are waiting for the lock in order.Besides,we use when the termination of code C is already established,which is locklrre(tl,n1,n2)short for lock(tl,n,n2)A (t).That is,the useful for modular verification of nested loops. thread t is"irrelevant"to the lock:it does not request the lock.The formal definitions are given in TR [22]. ATOM rules for refinement reasoning.The ATOM rule allows us The bottom of Fig.9 defines the precondition P and the guaran- to logically execute the abstract code simultaneously with every tee condition G of the code.Gt specifies the possible atomic actions concrete step (let's first ignore the index k in the premises of the of a thread t.Lock:adds the thread t at the end of rl of the threads rule).We use[p]C[g]to represent the total correctness of C in requesting the lock and increments next.It corresponds to line 2 392
for all f ∈ dom(Π) : Π(f) = (x, C) Γ(f) = (y, C0 ) dom(Π) = dom(Γ) D, R, G ` {P ∧ (x = y) ∧ arem(C0 ) ∧ (Ek, . . . , E1)}C{P ∧ arem(skip)} ∀t,t 0 . t 6= t 0 =⇒ Gt ⇒ Rt 0 wffAct(R, D) P ⇒ ¬Enabled(D) D, R, G ` {P}ΠΓ (OBJ) p ∧ B ⇒ p 0 p ∧ B ∧ (Enabled(D) ∨ Q) ⇒ p 0 ∗ (♦ ∧ emp) D, R, G ` {p 0}C{p} p ⇒ J Sta(J, R ∨ G) D0 6 D wffAct(R, D0 ) J ⇒ (R, G: D0 f −→Q) D, R, G ` {p}while (B){C}{p ∧ ¬B} (WHL) D, R, G ` {p}C{q} D, R, G ` {bpc♦}C{bqc♦} (HIDE-♦) ` [p]C[q 0 ] q 0 Vk q (TpU nk TqU) ⇒ G D, Id, G ` {p}hCi{q} (ATOM) D, Id, G ` {p}hCi{q} Sta({p, q}, R) D, R, G ` {p}hCi{q} (ATOM-R) Figure 7. Inference rules. requires the thread be waiting for its environment to perform a finite number of definite actions. Definition 1 (Definite Progress). S |= (R, G : D f −→ Q) iff the following hold for all t: (1) either S |= Qt, or there exists t 0 such that t 0 6= t and S |= Enabled(Dt 0 ); (2) for any t 0 6= t and S 0 , if (S, S 0 , 0) |= Rt ∧ hDt 0i, then ft(S 0 ) < ft(S); (3) for any S 0 , if (S, S 0 , 0) |= Rt ∨ Gt, then ft(S 0 ) ≤ ft(S). Here f is a function that maps the relational states S to some metrics over which there is a well-founded order <. Ignoring the index 0 above, the definition says either Q holds over S or the definite action Dt 0 of some environment thread t 0 is enabled. Also we require the metric f(S) to decrease when a definite action is performed. Besides, the metric should never increase at any step of the execution. To ensure that the metric f decreases regardless of the time when the environment’s definite actions take place, the definite progress should always hold. This is enforced by finding a stronger assertion J such that p ⇒ J and Sta(J, R∨ G) hold, that is, J is an invariant that holds at every program point of the loop. If (R, G : D f −→ Q) happens to satisfy the two premises, we can use (R, G : D f −→ Q) directly as J, but in practice it could be easier to prove the stability requirement by finding a stronger J. The definition of stability Sta(p, R) is given in Fig. 8. Notice in (R, G : D 0 f −→ Q) we can use D 0 instead of D to simplify the proof, as long as D 0 6 D and wffAct(R, D 0 ) are satisfied. The premise D 0 6 D, defined in Fig. 6, says D 0 specifies a subset of definite actions in D. For instance, if D consists of multiple definite actions and is in the form of D1 ∧ . . . ∧ Dn, D 0 may contain only a subset of these Dk (1 ≤ k ≤ n). The way to exclude in D 0 irrelevant definite actions can simplify the proof of the condition (2) of definite progress (see Definition 1). Given the definite progress condition, we know Q will be eventually true because each definite action will definitely happen. Then the loop starts to consume ♦-tokens and needs to finally terminate, following our argument at the beginning. 4.2.3 More Inference Rules The HIDE-♦ rule allows us to discard the tokens (by using b c♦) when the termination of code C is already established, which is useful for modular verification of nested loops. ATOM rules for refinement reasoning. The ATOM rule allows us to logically execute the abstract code simultaneously with every concrete step (let’s first ignore the index k in the premises of the rule). We use ` [p]C[q] to represent the total correctness of C in wffAct(R, D) iff ∀t. Rt ⇒ [Dt] ∧ (∀t 0 6= t. [Dt 0 ] ∨ Dt 0 ) Sta(p, R) iff ∀S, S0 , u, w, C. ((S, (u, w), C) |= p) ∧ ((S, S0 ) |= R) =⇒ (S0 , (u, w), C) |= p p V q iff ∀t, σ, Σ, u, w, C, ΣF . (((σ, Σ), (u, w), C) |= p) ∧ (Σ⊥ΣF ) =⇒ ∃C0 , Σ0 . ((C, Σ]ΣF ) −_∗ t (C0 , Σ0]ΣF )) ∧ ((σ, Σ0 ), (u, w), C0 ) |=q Figure 8. Auxiliary defs. used in logic rules (simplified version). sequential separation logic. The corresponding rules are standard and elided here. We use p V q for the zero or multiple-step executions from the abstract code specified by p to the code specified by q, which is defined in Fig. 8. Then, the ATOM rule allows us to execute zero-or-more steps of the abstract code with the execution of C, as long as the overall transition (including the abstract steps and the concrete steps) satisfies the relational guarantee G. We can lift C’s total correctness to the concurrent setting as long as the environment consists of identity transitions only. To allow a weaker R, we can apply the ATOM-R rule later, which requires that the preand post-conditions be stable with respect to R. 4.2.4 Example: Ticket Locks We prove the starvation-freedom of the ticket lock implementation in Fig. 9 using our logic rules. We have informally discussed in Sec. 2 the verification of the counter using a ticket lock (sfInc in Fig. 1(c)). To simplify the presentation, here we erase the increment in the critical section and focus on the progress property of the code in Fig. 9. With an empty critical section, the code functions just as skip, so Fig. 9 proves it is linearizable with respect to skip. The proofs of sfInc (including its starvation-freedom and linearizability with respect to the atomic INC in Fig. 1(a)) are given in TR [22]. To help specify the queue of the threads requesting the lock, we introduce an auxiliary array ticket. As shown in Fig. 9, each array cell ticket[i] records the ID of the unique thread which gets the ticket number i. Here we use cid for the current thread ID. We then define some predicates to describe the lock status. lock(tl, n1, n2) contains the auxiliary ticket array in addition to owner and next, where owner 7→ n1 and next 7→ n2, and tl is the list of the threads recorded in ticket[n1], . . . , ticket[n2 − 1]. We also use locked(tl, n1, n2) for the case when tl is not empty. That is, the lock is acquired by the first thread in tl, while the other threads in tl are waiting for the lock in order. Besides, we use lockIrrt(tl, n1, n2) short for lock(tl, n1, n2) ∧ (t 6∈ tl). That is, the thread t is “irrelevant” to the lock: it does not request the lock. The formal definitions are given in TR [22]. The bottom of Fig. 9 defines the precondition P and the guarantee condition G of the code. Gt specifies the possible atomic actions of a thread t. Lockt adds the thread t at the end of tl of the threads requesting the lock and increments next. It corresponds to line 2 392
11oca11,0,r; 了k if(G,G)=P×kQ 2 C((6,),PxkQ) maxL otherwise 3 o :[owner];while (i !o){o :[owner]; c((6.6'),[P]) 智 0 if(6,6)卡[] 4 [owner]:i +1; maxL otherwise c((6,6'),D) 鸣 0 if(6.6')ED P3t,n1:n2.locklrrt(tl,n1,n2)Gt Lockt V Unlockt maxL otherwise Lockt,n1,n2.locklrrt(tl,n1,n2)locked(tl::t,n1,n2+1) C((,'),RAR')max(C((,G'),R),C((,G'),R')) Unlock:,n1,n2.locked(t::tl,n1,n2)x locklrrt(,n1+1,n2) C((6,'),RVR)min(C((6,G),R),C((6,G'),R)) DVil,n1:n2.locked(t:n1:n2)locklrrt(tl,n+1,n2) c((,6'),LRJo) 0 if c((6.6).R)=0 k些3m1,n2,tlh1,2.tlocked,(n1,i,n2A(o≤n1) maxL otherwise Qt3n2,tl2.locked(t::112,i,n2)A (oi) (6,6)LRJo iff C((,'),R)=0 f()=k iff 6 3n1.(ownern1)*true A(i-n1 =k) (,)R iff C((),R)=k andkk.(m=n)》A(nk<nk) to line 4 of Fig.9.The rely condition R includes all the G made (nm,,n)≈k(nm,,n1)if(i≥k.(=ni)》 by the environment threads t' u<u'iff 3k.u <k u' u≤u'iff u<u'Vu=u Next we define the definite action D.As we explained in Sec.2,D.requires that whenever the thread t holds a lock with (u,w)<k (u',w')iff (u <k u')v(k=0Au=u'Aw=w') ownern,it should eventually release the lock by incrementing (u,w)≈k(u',w)ifu≈kA(k=0→w=w) owner to n+1.We can prove the side conditions about well- formedness of specifications in the oBJ rule hold. Figure 10.Levels of state transitions and tokens. The key to verifying the loop at line 3 is to find a metric function f and prove definite progressJ(R,G':DQ)for a To interpret the level numbers in the assertion semantics,we de- stableJ.As shown in Fig.9,we define/to say that the thread fine C((,)R)in Fig.10 which assigns a level to the transition t is requesting the lock.Here tlockedt.(n,i,n2)is similar (')given the specification R.That is,if C((,)R)=k. to locked(th::t::tl2,n1,n2).It also says that the thread t takes we say R views (as a level-k transition.We let k maxL if the ticket number i.specifies the case when th is empty (thus the transition does not satisfy R.Given the level function,we can owneri).We also strengthen the guarantee condition G of the now define the semantics ofRo,which picks out the transitions loop to ld,the identity transitions. that R views as level-0 ones.For the following example R, The metric function f maps each state to the difference between i and owner at that state,which describes the number R些(PxoQ)V(P'x1Q) of threads ahead of t in the waiting queue.We use the usual<order Ro is equivalent to P xo Q.Besides,R=Ro means that R on natural numbers as the associated well-founded order.Then,we views all state transitions as level-0 ones,thus any state transitions can verifyJ→(R,G':D∠Q). of R should not delay the progress of other threads. Finally,we prove that the loop terminates with one -token when We use (',k)ER as a shorthand for C((,)R)=k O holds or D is enabled.Then we can conclude linearizability and (k<maxL).Then the implication RR'is redefined under this starvation-freedom of the ticket lock implementation in Fig.9. new interpretation,as shown in Fig.10. -tokens in assertions.To ensure the progress of the whole 4.3 Adding Delay for Deadlock-Free Objects system,we require the steps of delaying actions to pay -tokens. As we explained in Sec.2,deadlock-free objects allow the progress Since we allow multi-levels of transitions to delay other threads, of a thread to be delayed by its environment,as long as the whole the -tokens are stratified accordingly.Thus we introduce the new system makes progress.Correspondingly,to verify deadlock-free assertion (Ex,...,E1)in Fig.4,whose semantics is defined in objects,we extend our logic with a delay mechanism.First we find Fig.5.It says the number of each level-j -tokens is no less than out the delaying actions and stratify them for objects with rollbacks Ei.Here u is a sequence (nk,...,n)recording the numbers of where a delaying action may trigger more steps of other delaying tokens at different levels,as defined in Fig.10.We overload as the actions.Then,we introduce -tokens (we usehere to distinguish dictionary order for the sequence of natural numbers.The ordering them from -tokens for loops)to bound the number of delaying over u and other related definitions are also given in Fig.10. actions in each method,so we avoid infinite delays without whole- 4.3.1 Inference Rules Revisited system progress. To use the logic to verify deadlock-free objects,we need to first Multi-level rely/guarantee assertions.As shown in Fig.4,we find in each object method the actions that may delay the progress index the rely/guarantee assertion P Q with a natural number k of others.Since some of these actions may be further delayed by and call it a level-k action.We require 0<k<maxL,where maxL others,we assign levels to them to ensure each action can only be is a predefined upper bound of all levels.Intuitively,P could delayed by ones at higher levels.We specify the actions and their make other threads do more actions at a level k'<k.Thus P oQ levels in the rely/guarantee conditions.We also need to decide an cannot make other threads do any more actions,i.e.,it cannot delay upper bound of these execution steps at each level and specify them other threads.P1Q could make other threads do more actions as the number of -tokens in the precondition of each method. at level 0 but no more at level 1,thus we avoid the problem of Below we revisit the inference rules in Fig.7 and explain their use delay-caused circular dependency discussed in Sec.2.2.2. of multi-level actions and -tokens.In the OBJ rule,we specify in the 393
1 local i, o, r; 2 ; 3 o := [owner]; while (i != o) { o := [owner]; } 4 [owner] := i + 1; Pt def = ∃tl, n1, n2. lockIrrt(tl, n1, n2) Gt def = Lockt ∨ Unlockt Lockt def = ∃tl, n1, n2. lockIrrt(tl, n1, n2) n locked(tl::t, n1, n2 + 1) Unlockt def = ∃tl, n1, n2. locked(t::tl, n1, n2) n lockIrrt(tl, n1+1, n2) Dt def = ∀tl, n1, n2. locked(t::tl, n1, n2) ❀ lockIrrt(tl, n1 + 1, n2) Jt def = ∃n1, n2, tl1, tl2. tlockedtl1,t,tl2 (n1, i, n2) ∧ (o ≤ n1) Qt def = ∃n2, tl2. locked(t::tl2, i, n2) ∧ (o ≤ i) f(S) = k iff S |= ∃n1. (owner 7→ n1) ∗ true ∧ (i − n1 = k) Figure 9. Proofs for the ticket lock (with auxiliary code in gray). of Fig. 9. Unlockt releases the lock by incrementing owner and dequeuing the thread t which currently holds the lock. It corresponds to line 4 of Fig. 9. The rely condition Rt includes all the Gt 0 made by the environment threads t 0 . Next we define the definite action D. As we explained in Sec. 2, Dt requires that whenever the thread t holds a lock with owner 7→ n1, it should eventually release the lock by incrementing owner to n1 + 1. We can prove the side conditions about wellformedness of specifications in the OBJ rule hold. The key to verifying the loop at line 3 is to find a metric function f and prove definite progress J ⇒ (R, G0 : D f −→Q) for a stable J. As shown in Fig. 9, we define Jt to say that the thread t is requesting the lock. Here tlockedtl1,t,tl2 (n1, i, n2) is similar to locked(tl1 :: t :: tl2, n1, n2). It also says that the thread t takes the ticket number i. Qt specifies the case when tl1 is empty (thus owner 7→ i). We also strengthen the guarantee condition G 0 of the loop to Id, the identity transitions. The metric function f maps each state S to the difference between i and owner at that state, which describes the number of threads ahead of t in the waiting queue. We use the usual k. (n 0 i = ni)) ∧ (n 0 k < nk) (n 0m, . . . , n0 1 ) ≈k (nm, . . . , n1) iff (∀i ≥ k. (n 0 i = ni)) u < u0 iff ∃k. u <k u 0 u ≤ u 0 iff u < u0 ∨ u = u 0 (u, w) <k (u 0 , w0 ) iff (u <k u 0 ) ∨ (k = 0 ∧ u = u 0 ∧ w = w0 ) (u, w) ≈k (u 0 , w0 ) iff u ≈k u 0 ∧ (k = 0 =⇒ w = w0 ) Figure 10. Levels of state transitions and tokens. To interpret the level numbers in the assertion semantics, we define L((S, S 0 ), R) in Fig. 10 which assigns a level to the transition (S, S 0 ), given the specification R. That is, if L((S, S 0 ), R) = k, we say R views (S, S 0 ) as a level-k transition. We let k = maxL if the transition does not satisfy R. Given the level function, we can now define the semantics of bRc0, which picks out the transitions that R views as level-0 ones. For the following example R, R def = (P n0 Q) ∨ (P 0 n1 Q0 ) bRc0 is equivalent to P n0 Q. Besides, R ⇒ bRc0 means that R views all state transitions as level-0 ones, thus any state transitions of R should not delay the progress of other threads. We use (S, S 0 , k) |= R as a shorthand for L((S, S 0 ), R) = k (k < maxL). Then the implication R ⇒ R 0 is redefined under this new interpretation, as shown in Fig. 10. -tokens in assertions. To ensure the progress of the whole system, we require the steps of delaying actions to pay -tokens. Since we allow multi-levels of transitions to delay other threads, the -tokens are stratified accordingly. Thus we introduce the new assertion (Ek, . . . , E1) in Fig. 4, whose semantics is defined in Fig. 5. It says the number of each level-j -tokens is no less than Ej . Here u is a sequence (nk, . . . , n1) recording the numbers of - tokens at different levels, as defined in Fig. 10. We overload < as the dictionary order for the sequence of natural numbers. The ordering over u and other related definitions are also given in Fig. 10. 4.3.1 Inference Rules Revisited To use the logic to verify deadlock-free objects, we need to first find in each object method the actions that may delay the progress of others. Since some of these actions may be further delayed by others, we assign levels to them to ensure each action can only be delayed by ones at higher levels. We specify the actions and their levels in the rely/guarantee conditions. We also need to decide an upper bound of these execution steps at each level and specify them as the number of -tokens in the precondition of each method. Below we revisit the inference rules in Fig. 7 and explain their use of multi-level actions and -tokens. In the OBJ rule, we specify in the 393
wffAct(R,D)ift.lR」o→D]A(t≠t.[D]VD】 locked:兰(亿→t)envLocked,些3t'.locked,A(t'≠t) p言k9ift,a,∑,u,地,C,p. (((o,),(u,w),C)p)A(ELEr)=3u',w',C,E/. unlocked (L0)notOwnt unlocked V envLockedt ((C.EuEr)(C','))A (((a'),(u',w'),C')Fq) GtLockt V Unlockt A(,世)1)action threads trying to acquire the lock.The Unlockt action is at level 0, which cannot delay other threads.Also the precondition is given a from its environment thread t',thread t could gain more O-tokens to do more loop iterations.It could also gain more level-k'('1 (i.e.,it is right branch of p.Thus p is already stable. a delaying action).In this case,although the current thread may be The current thread pays its -token when its cas at line 3 blocked for a longer time,the whole system must progress since a succeeds (i.e.,it acquires the lock),as shown in the left branch -token is paid by the environment thread for the delaying action. of the assertion after line 3.If the cas fails,the thread still has to Also the requirement wffAct(R.D)(used at the OBJ rule and acquire the lock in the future and to try one more iteration. the WHL rule)should be revised to allow queue jumps.The new definition is shown in Fig.11.Here we allow a queue jump to disable 4.3.3 Another Example:Nested Locks with Rollback the definite action D of the thread at the head of the queue,so it is not necessary to require Enabled(D)to be preserved when the To demonstrate the use of action levels,we verify the rollback code in Fig.2(b)which we informally discussed in Sec.2.Here we environment step is associated with level k>1. assume all the methods of the object either acquire L1 before L2(as 4.3.2 Example:Test-and-Set Locks in the method of Fig.2(b)),or acquire only one lock. In Fig.12,we verify the test-and-set lock implementation explained Stratified delaying actions.As in the TAS lock example in in Sec.2.Like the ticket lock proofs in Sec.4.2.4,we simplify the Sec.4.3.2,lock acquirements are delaying actions.Here we have 394
wffAct(R, D) iff ∀t. bRtc0 ⇒ [Dt] ∧ (∀t 0 6= t. [Dt 0 ] ∨ Dt 0 ) p Vk q iff ∀t, σ, Σ, u, w, C, ΣF . (((σ, Σ), (u, w), C) |= p) ∧ (Σ⊥ΣF ) =⇒ ∃u 0 , w0 , C0 , Σ0 . ((C, Σ]ΣF ) −_∗ t (C0 , Σ0]ΣF )) ∧ (((σ, Σ0 ), (u 0 , w0 ), C0 ) |= q) ∧ (u 0 , w0 ) <k (u, w) (<k defined in Fig. 10) Sta(p, R) iff ∀S, S0 , u, w, C, k. ((S, (u, w), C) |= p) ∧ ((S, S0 , k) |= R) =⇒ ∃u 0 , w0 . ((S0 , (u 0 , w0 ), C) |= p) ∧ ((u 0 , w0 ) ≈k (u, w)) where ≈k is defined in Fig. 10. Figure 11. Key auxiliary definitions for inference rules (final version that supersedes definitions in Fig. 8). precondition the number of -tokens needed for each object method. The side condition wffAct(R, D) is also redefined in Fig. 11, which is explained below. Decreasing -tokens at the ATOM rule. The thread loses - tokens when it performs an action that may delay other threads. This is required by the ATOM rule. Depending on whether the atomic command may delay others or not, we assign a level k in the premise q 0 Vk q, which is redefined in Fig. 11. Similar to p V q in Fig. 8, it allows us to execute the abstract code. Now it also requires the number of -tokens at level k to be decreased if k ≥ 1. Note k cannot be arbitrarily chosen. The assignment of the level k to the atomic operation must be consistent with the level specification in G, as required by the third premise. Being delayed: increasing tokens at stability checking. When the progress of the thread t is delayed by a level-k (k ≥ 1) action from its environment thread t 0 , thread t could gain more ♦-tokens to do more loop iterations. It could also gain more level-k 0 (k 0 < k) -tokens to execute more level-k 0 actions. Here increasing tokens would not affect the soundness of our logic because the environment thread t 0 must pay a higher-level token for its higher-level delaying action. As we explained in Sec. 2.3.4, the -tokens at all levels in the whole system actually form a tuple which strictly descends along the dictionary order, ensuring the whole-system progress. We re-define the stability Sta(p, R) in Fig. 11 to reflect the possible increasing of tokens for the thread t. We could reset w and the number at level k 0 < k in u after the environment step R if this step is associated with a level k (k ≥ 1). Allowing queue jumps at definite progress and wffAct. As we explained in Sec. 2.3.3, for deadlock-free objects, the environment steps could cause queue jumps to delay the progress of the thread t. Like starvation-free objects, the thread t using deadlock-free objects may wait for a queue of definite actions made by its environment. A queue jump would make the thread t wait for a longer queue of the environment’s definite actions. As shown in Definition 1, the definite progress condition (R, G: D f −→Q) allows the thread to reset its metric f(S) for a queue jump when the environment step is associated with level k ≥ 1 (i.e., it is a delaying action). In this case, although the current thread may be blocked for a longer time, the whole system must progress since a -token is paid by the environment thread for the delaying action. Also the requirement wffAct(R, D) (used at the OBJ rule and the WHL rule) should be revised to allow queue jumps. The new definition is shown in Fig. 11. Here we allow a queue jump to disable the definite action D of the thread at the head of the queue, so it is not necessary to require Enabled(D) to be preserved when the environment step is associated with level k ≥ 1. 4.3.2 Example: Test-and-Set Locks In Fig. 12, we verify the test-and-set lock implementation explained in Sec. 2. Like the ticket lock proofs in Sec. 4.2.4, we simplify the lockedt def = (L 7→ t) envLockedt def = ∃t 0 . lockedt 0 ∧ (t 0 6= t) unlocked def = (L 7→ 0) notOwnt def = unlocked ∨ envLockedt Gt def = Lockt ∨ Unlockt Lockt def = unlocked n1 lockedt Unlockt def = lockedt n0 unlocked Dt def = lockedt ❀ unlocked Jt def = notOwnt ∨ lockedt Qt def = unlocked ∨ lockedt ft(S) = 1 if S |= envLockedt 0 if S |= Qt notOwncid ∧ 1 local b := false; ((¬b) ∧ notOwncid ∧ ∧ ♦) ∨ (b ∧ lockedcid) 2 while (!b) { (unlocked ∧ ) ∨ (envLockedcid ∧ ∧ ♦) 3 b := cas(L, 0, cid); (b ∧ lockedcid) ∨ ((¬b) ∧ notOwncid ∧ ∧ ♦) 4 } lockedcid 5 [L] := 0; notOwncid Figure 12. Proofs for the TAS lock. code and prove it is linearizable with respect to skip. Here we omit the assertion arem(skip) at each line in the proof, and focus on proving deadlock-freedom of the code. As defined at the top of Fig. 12, the action Lockt (corresponding to the successful cas at line 3) is at level 1, which may delay other threads trying to acquire the lock. The Unlockt action is at level 0, which cannot delay other threads. Also the precondition is given a -token, which is required to pay for the Lockt action. The definite action D simply says that the thread t would eventually release the lock when it acquires the lock. It is easy to check that the side conditions about R, G and D in the OBJ rule, e.g., wffAct(R, D), are satisfied. R, G: D f −→Q specifies the queue of definite actions which now contains at most one environment thread. That is, the metric f(S) is 1 if the lock is not available, and is 0 otherwise. When an environment thread t 0 cuts in line by acquiring the lock when the lock is free, the current thread t has to wait for Dt 0 before t itself progresses. Thus in R, G: D f −→Q the current thread t can reset its metric f(S) when its environment acquires the lock. The detailed proof at the bottom of Fig. 12 shows the changes of tokens. We give the current thread one ♦-token (using the HIDE-♦ rule) to do its loop at lines 2-4. It consumes this ♦-token at the beginning of the loop body when Q holds, as shown in the left branch of the assertion p before line 3. When Q does not hold, as shown in p’s right branch, the loop does not consume the ♦-token. Next we stabilize p. For the left branch, if an environment thread t 0 acquires the lock, which is a delaying action Lockt 0 , we let the current thread regain a ♦-token. The resulting state just satisfies the right branch of p. Thus p is already stable. The current thread pays its -token when its cas at line 3 succeeds (i.e., it acquires the lock), as shown in the left branch of the assertion after line 3. If the cas fails, the thread still has to acquire the lock in the future and ♦ to try one more iteration. 4.3.3 Another Example: Nested Locks with Rollback To demonstrate the use of action levels, we verify the rollback code in Fig. 2(b) which we informally discussed in Sec. 2. Here we assume all the methods of the object either acquire L1 before L2 (as in the method of Fig. 2(b)), or acquire only one lock. Stratified delaying actions. As in the TAS lock example in Sec. 4.3.2, lock acquirements are delaying actions. Here we have 394