bca0●o0aoa0gg ++ Concurrency IN ACTION Practical Multithreading Anthony Williams MANNING
MANNING Anthony Williams Practical Multithreading IN ACTION
brief contents 1■Hello,.world of concurrency in C+!1 2·Managing threads 15 3Sharing data between threads 33 4Synchronizing concurrent operations 67 5The C++memory model and operations on atomic types 103 6Designing lock-based concurrent data structures 148 7Designing lock-free concurrent data structures 180 8Designing concurrent code 224 9Advanced thread management 273 10Testing and debugging multithreaded applications 300 vii
vii brief contents 1 ■ Hello, world of concurrency in C++! 1 2 ■ Managing threads 15 3 ■ Sharing data between threads 33 4 ■ Synchronizing concurrent operations 67 5 ■ The C++ memory model and operations on atomic types 103 6 ■ Designing lock-based concurrent data structures 148 7 ■ Designing lock-free concurrent data structures 180 8 ■ Designing concurrent code 224 9 ■ Advanced thread management 273 10 ■ Testing and debugging multithreaded applications 300
contents preface xv acknowledgments xvii about this book xix about the cover illustration xxii Hello,world of concurrency in C++!I 1.1 What is concurrency?2 Concurrency in computer systems 2 Approaches to concurrency 4 1.2 Why use concurrency?6 Using concurrency for separation of concerns 6 Using concurrency for performance 7.When not to use concurrency 8 1.3 Concurrency and multithreading in C++9 History of multithreading in C++10.Concurrency support in the new standard 10"Efficiency in the C++ Thread Library 11 Platform-specific facilities 12 1.4 Getting started 13 Hello,Concurrent World 13 1.5 Summary 14 ix
ix contents preface xv acknowledgments xvii about this book xix about the cover illustration xxii 1 Hello, world of concurrency in C++! 1 1.1 What is concurrency? 2 Concurrency in computer systems 2 Approaches to concurrency 4 1.2 Why use concurrency? 6 Using concurrency for separation of concerns 6 Using concurrency for performance 7 ■ When not to use concurrency 8 1.3 Concurrency and multithreading in C++ 9 History of multithreading in C++ 10 ■ Concurrency support in the new standard 10 ■ Efficiency in the C++ Thread Library 11 ■ Platform-specific facilities 12 1.4 Getting started 13 Hello, Concurrent World 13 1.5 Summary 14
CONTENTS 2 Managing threads 15 2.1 Basic thread management 16 Launching a thread 16"Waiting for a thread to complete 18 Waiting in exceptional circumstances 19"Running threads in the background 21 2.2 Passing arguments to a thread function 23 2.3 Transferring ownership of a thread 25 2.4 Choosing the number of threads at runtime 28 2.5 Identifying threads 31 2.6 Summary 32 3 Sharing data between threads 33 3.1 Problems with sharing data between threads 34 Race conditions 35"Avoiding problematic race conditions 36 3.2 Protecting shared data with mutexes 37 Using mutexes in C++38"Structuring code for protecting shared data 39 Spotting race conditions inherent in interfaces 40"Deadlock:the problem and a solution 47 Further guidelines for avoiding deadlock 49 Flexible locking with std::unique lock 54Transferring mutex ownership between scopes 55"Locking at an appropriate granularity 57 3.3 Alternative facilities for protecting shared data 59 Protecting shared data during initialization 59 Protecting rarely updated data structures 63"Recursive locking 64 3.4 Summary 65 4 Synchronizing concurrent operations 67 4.1 Waiting for an event or other condition 68 Waiting for a condition with condition variables 69 Building a thread-safe queue with condition variables 71 4.2 Waiting for one-off events with futures 76 Returning values from background tasks 77 Associating a task with a future 79.Making (std::)promises 81.Saving an exception for the future 83"Waiting from multiple threads 85 4.3 Waiting with a time limit 87 Clocks87·Durations88·Time points89 Functions that accept timeouts 91
x CONTENTS 2 Managing threads 15 2.1 Basic thread management 16 Launching a thread 16 ■ Waiting for a thread to complete 18 Waiting in exceptional circumstances 19 ■ Running threads in the background 21 2.2 Passing arguments to a thread function 23 2.3 Transferring ownership of a thread 25 2.4 Choosing the number of threads at runtime 28 2.5 Identifying threads 31 2.6 Summary 32 3 Sharing data between threads 33 3.1 Problems with sharing data between threads 34 Race conditions 35 ■ Avoiding problematic race conditions 36 3.2 Protecting shared data with mutexes 37 Using mutexes in C++ 38 ■ Structuring code for protecting shared data 39 ■ Spotting race conditions inherent in interfaces 40 ■ Deadlock: the problem and a solution 47 Further guidelines for avoiding deadlock 49 ■ Flexible locking with std::unique_lock 54 ■ Transferring mutex ownership between scopes 55 ■ Locking at an appropriate granularity 57 3.3 Alternative facilities for protecting shared data 59 Protecting shared data during initialization 59 ■ Protecting rarely updated data structures 63 ■ Recursive locking 64 3.4 Summary 65 4 Synchronizing concurrent operations 67 4.1 Waiting for an event or other condition 68 Waiting for a condition with condition variables 69 Building a thread-safe queue with condition variables 71 4.2 Waiting for one-off events with futures 76 Returning values from background tasks 77 ■ Associating a task with a future 79 ■ Making (std::)promises 81 ■ Saving an exception for the future 83 ■ Waiting from multiple threads 85 4.3 Waiting with a time limit 87 Clocks 87 ■ Durations 88 ■ Time points 89 Functions that accept timeouts 91
CONTENTS 4.4 Using synchronization of operations to simplify code 93 Functional programming with futures 93 Synchronizing operations with message passing 97 4.5 Summary 102 5 The C++memory model and operations on atomic types 103 5.1 Memory model basics 104 Objects and memory locations 104"Objects,memory locations, and concurrency 105.Modification orders 106 5.2 Atomic operations and types in C++107 The standard atomic types 107 Operations on std::atomic_flag 110"Operations on std::atomic112 Operations on std::atomic:pointer arithmetic 114 Operations on standard atomic integral types 116 The std::atomic<>primary class template 116 Free functions for atomic operations 117 5.3 Synchronizing operations and enforcing ordering 119 The synchronizes-with relationship 121The happens-before relationship 122Memory ordering for atomic operations 123 Release sequences and synchronizes-with 141"Fences 143 Ordering nonatomic operations with atomics 145 5.4 Summary 147 Designing lock-based concurrent data structures 148 6.1 What does it mean to design for concurrency?149 Guidelines for designing data structures for concurrency 149 6.2 Lock-based concurrent data structures 151 A thread-safe stack using locks 151"A thread-safe queue using locks and condition variables 154"A thread-safe queue using fine-grained locks and condition variables 158 6.3 Designing more complex lock-based data structures 169 Writing a thread-safe lookup table using locks 169"Writing a thread-safe list using locks 175 6.4 Summary 179 7 Designing lock-free concurrent data structures 180 7.1 Definitions and consequences 181 Types of nonblocking data structures 181"Lock-free data structures 182"Wait-free data structures 182 The pros and cons of lock-free data structures 183
CONTENTS xi 4.4 Using synchronization of operations to simplify code 93 Functional programming with futures 93 ■ Synchronizing operations with message passing 97 4.5 Summary 102 5 The C++ memory model and operations on atomic types 103 5.1 Memory model basics 104 Objects and memory locations 104 ■ Objects, memory locations, and concurrency 105 ■ Modification orders 106 5.2 Atomic operations and types in C++ 107 The standard atomic types 107 ■ Operations on std::atomic_flag 110 ■ Operations on std::atomic 112 Operations on std::atomic: pointer arithmetic 114 Operations on standard atomic integral types 116 The std::atomic<> primary class template 116 ■ Free functions for atomic operations 117 5.3 Synchronizing operations and enforcing ordering 119 The synchronizes-with relationship 121 ■ The happens-before relationship 122 ■ Memory ordering for atomic operations 123 Release sequences and synchronizes-with 141 ■ Fences 143 Ordering nonatomic operations with atomics 145 5.4 Summary 147 6 Designing lock-based concurrent data structures 148 6.1 What does it mean to design for concurrency? 149 Guidelines for designing data structures for concurrency 149 6.2 Lock-based concurrent data structures 151 A thread-safe stack using locks 151 ■ A thread-safe queue using locks and condition variables 154 ■ A thread-safe queue using fine-grained locks and condition variables 158 6.3 Designing more complex lock-based data structures 169 Writing a thread-safe lookup table using locks 169 ■ Writing a thread-safe list using locks 175 6.4 Summary 179 7 Designing lock-free concurrent data structures 180 7.1 Definitions and consequences 181 Types of nonblocking data structures 181 ■ Lock-free data structures 182 ■ Wait-free data structures 182 The pros and cons of lock-free data structures 183
xii CONTENTS 7.2 Examples of lock-free data structures 184 Writing a thread-safe stack without locks 184"Stopping those pesky leaks:managing memory in lock-free data structures 188 Detecting nodes that can't be reclaimed using hazard pointers 193 Detecting nodes in use with reference counting 200"Applying the memory model to the lock-free stack 205 Writing a thread-safe queue without locks 209 7.3 Guidelines for writing lock-free data structures 221 Guideline:use std::memoryorder seq_cst for prototyping 221 Guideline:use a lock-free memory reclamation scheme 221 Guideline:watch out for the ABA problem 222 Guideline:identify busy-wait loops and help the other thread 222 7.4 Summary 223 Designing concurrent code 224 8.1 Techniques for dividing work between threads 225 Dividing data between threads before processing begins 226 Dividing data recursively 227"Dividing work by task type 231 8.2 Factors affecting the performance of concurrent code 233 How many processors?234Data contention and cache ping-pong235·False sharing237·How close is your data?238·Oversubscription and excessive task switching 239 8.3 Designing data structures for multithreaded performance 239 Dividing array elements for complex operations 240 Data access patterns in other data structures 242 8.4 Additional considerations when designing for concurrency 243 Exception safety in parallel algorithms 243 Scalability and Amdahl's law 250Hiding latency with multiple threads 252 Improving responsiveness with concurrency 253 8.5 Designing concurrent code in practice 255 A parallel implementation of std::for_each 255"A parallel implementation of std::find 257"A parallel implementation of std::partial sum 263 8.6 Summary 272
xii CONTENTS 7.2 Examples of lock-free data structures 184 Writing a thread-safe stack without locks 184 ■ Stopping those pesky leaks: managing memory in lock-free data structures 188 Detecting nodes that can’t be reclaimed using hazard pointers 193 Detecting nodes in use with reference counting 200 ■ Applying the memory model to the lock-free stack 205 ■ Writing a thread-safe queue without locks 209 7.3 Guidelines for writing lock-free data structures 221 Guideline: use std::memory_order_seq_cst for prototyping 221 Guideline: use a lock-free memory reclamation scheme 221 Guideline: watch out for the ABA problem 222 Guideline: identify busy-wait loops and help the other thread 222 7.4 Summary 223 8 Designing concurrent code 224 8.1 Techniques for dividing work between threads 225 Dividing data between threads before processing begins 226 Dividing data recursively 227 ■ Dividing work by task type 231 8.2 Factors affecting the performance of concurrent code 233 How many processors? 234 ■ Data contention and cache ping-pong 235 ■ False sharing 237 ■ How close is your data? 238 ■ Oversubscription and excessive task switching 239 8.3 Designing data structures for multithreaded performance 239 Dividing array elements for complex operations 240 Data access patterns in other data structures 242 8.4 Additional considerations when designing for concurrency 243 Exception safety in parallel algorithms 243 ■ Scalability and Amdahl’s law 250 ■ Hiding latency with multiple threads 252 Improving responsiveness with concurrency 253 8.5 Designing concurrent code in practice 255 A parallel implementation of std::for_each 255 ■ A parallel implementation of std::find 257 ■ A parallel implementation of std::partial_sum 263 8.6 Summary 272
CONTENTS xi进 Advanced thread management 273 9.1 Thread pools 274 The simplest possible thread pool 274"Waiting for tasks submitted to a thread pool 276"Tasks that wait for other tasks 280 Avoiding contention on the work queue 283 Work stealing 284 9.2 Interrupting threads 289 Launching and interrupting another thread 289.Detecting that a thread has been interrupted 291"Interrupting a condition variable wait 291 Interrupting a wait on std::condition_variable any 294"Interrupting other blocking calls 296 Handling interruptions 297 Interrupting background tasks on application exit 298 9.3 Summary 299 10 Testing and debugging multithreaded applications 300 10.1 Types of concurrency-related bugs 301 Unwanted blocking 301 Race conditions 302 10.2 Techniques for locating concurrency-related bugs 303 Reviewing code to locate potential bugs 303 Locating concurrency-related bugs by testing 305 Designing for testability 307"Multithreaded testing techniques 308 Structuring multithreaded test code 311 Testing the performance of multithreaded code 314 10.3 Summary 314 appendix A Brief reference for some C++11 language features 315 appendix B Brief comparison of concurrency libraries 340 appendix C A message-passing framework and complete ATM example 342 appendix D C++Thread Library reference 360 resources 487 index 489
CONTENTS xiii 9 Advanced thread management 273 9.1 Thread pools 274 The simplest possible thread pool 274 ■ Waiting for tasks submitted to a thread pool 276 ■ Tasks that wait for other tasks 280 ■ Avoiding contention on the work queue 283 Work stealing 284 9.2 Interrupting threads 289 Launching and interrupting another thread 289 ■ Detecting that a thread has been interrupted 291 ■ Interrupting a condition variable wait 291 ■ Interrupting a wait on std::condition_variable_any 294 ■ Interrupting other blocking calls 296 ■ Handling interruptions 297 Interrupting background tasks on application exit 298 9.3 Summary 299 10 Testing and debugging multithreaded applications 300 10.1 Types of concurrency-related bugs 301 Unwanted blocking 301 ■ Race conditions 302 10.2 Techniques for locating concurrency-related bugs 303 Reviewing code to locate potential bugs 303 Locating concurrency-related bugs by testing 305 Designing for testability 307 ■ Multithreaded testing techniques 308 ■ Structuring multithreaded test code 311 Testing the performance of multithreaded code 314 10.3 Summary 314 appendix A Brief reference for some C++11 language features 315 appendix B Brief comparison of concurrency libraries 340 appendix C A message-passing framework and complete ATM example 342 appendix D C++ Thread Library reference 360 resources 487 index 489
preface I encountered the concept of multithreaded code while working at my first job after I left college.We were writing a data processing application that had to populate a data- base with incoming data records.There was a lot of data,but each record was inde- pendent and required a reasonable amount of processing before it could be inserted into the database.To take full advantage of the power of our 10-CPU UltraSPARC,we ran the code in multiple threads,each thread processing its own set of incoming records.We wrote the code in C++,using POSIX threads,and made a fair number of mistakes-multithreading was new to all of us-but we got there in the end.It was also while working on this project that I first became aware of the C++Standards Commit- tee and the freshly published C++Standard. I have had a keen interest in multithreading and concurrency ever since.Where others saw it as difficult,complex,and a source of problems,I saw it as a powerful tool that could enable your code to take advantage of the available hardware to run faster. Later on I would learn how it could be used to improve the responsiveness and perfor- mance of applications even on single-core hardware,by using multiple threads to hide the latency of time-consuming operations such as I/O.I also learned how it worked at the OS level and how Intel CPUs handled task switching. Meanwhile,my interest in C++brought me in contact with the ACCU and then the C++Standards panel at BSI,as well as Boost.I followed the initial development of the Boost Thread Library with interest,and when it was abandoned by the original developer,I jumped at the chance to get involved.I have been the primary developer and maintainer of the Boost Thread Library ever since. XV
xv preface I encountered the concept of multithreaded code while working at my first job after I left college. We were writing a data processing application that had to populate a database with incoming data records. There was a lot of data, but each record was independent and required a reasonable amount of processing before it could be inserted into the database. To take full advantage of the power of our 10-CPU UltraSPARC, we ran the code in multiple threads, each thread processing its own set of incoming records. We wrote the code in C++, using POSIX threads, and made a fair number of mistakes—multithreading was new to all of us—but we got there in the end. It was also while working on this project that I first became aware of the C++ Standards Committee and the freshly published C++ Standard. I have had a keen interest in multithreading and concurrency ever since. Where others saw it as difficult, complex, and a source of problems, I saw it as a powerful tool that could enable your code to take advantage of the available hardware to run faster. Later on I would learn how it could be used to improve the responsiveness and performance of applications even on single-core hardware, by using multiple threads to hide the latency of time-consuming operations such as I/O. I also learned how it worked at the OS level and how Intel CPUs handled task switching. Meanwhile, my interest in C++ brought me in contact with the ACCU and then the C++ Standards panel at BSI, as well as Boost. I followed the initial development of the Boost Thread Library with interest, and when it was abandoned by the original developer, I jumped at the chance to get involved. I have been the primary developer and maintainer of the Boost Thread Library ever since
xvi PREFACE As the work of the C++Standards Committee shifted from fixing defects in the exist- ing standard to writing proposals for the next standard(named C++Ox in the hope that it would be finished by 2009,and now officially C++11,because it was finally pub- lished in 2011),I got more involved with BSI and started drafting proposals of my own. Once it became clear that multithreading was on the agenda,I jumped in with both feet and authored or coauthored many of the multithreading and concurrency- related proposals that shaped this part of the new standard.I feel privileged to have had the opportunity to combine two of my major computer-related interests-C++ and multithreading-in this way. This book draws on all my experience with both C++and multithreading and aims to teach other C++developers how to use the C++11 Thread Library safely and effi- ciently.I also hope to impart some of my enthusiasm for the subject along the way
xvi PREFACE As the work of the C++ Standards Committee shifted from fixing defects in the existing standard to writing proposals for the next standard (named C++0x in the hope that it would be finished by 2009, and now officially C++11, because it was finally published in 2011), I got more involved with BSI and started drafting proposals of my own. Once it became clear that multithreading was on the agenda, I jumped in with both feet and authored or coauthored many of the multithreading and concurrencyrelated proposals that shaped this part of the new standard. I feel privileged to have had the opportunity to combine two of my major computer-related interests—C++ and multithreading—in this way. This book draws on all my experience with both C++ and multithreading and aims to teach other C++ developers how to use the C++11 Thread Library safely and efficiently. I also hope to impart some of my enthusiasm for the subject along the way
acknowledgments I will start by saying a big "Thank you"to my wife,Kim,for all the love and support she has given me while writing this book.It has occupied a significant part of my spare time for the last four years,and without her patience,support,and understanding,I couldn't have managed it. Second,I would like to thank the team at Manning who have made this book possi- ble:Marjan Bace,publisher;Michael Stephens,associate publisher;Cynthia Kane,my development editor;Karen Tegtmeyer,review editor;Linda Recktenwald,my copy- editor;Katie Tennant,my proofreader;and Mary Piergies,the production manager. Without their efforts you would not be reading this book right now. I would also like to thank the other members of the C++Standards Committee who wrote committee papers on the multithreading facilities:Andrei Alexandrescu, Pete Becker,Bob Blainer,Hans Boehm,Beman Dawes,Lawrence Crowl,Peter Dimov, Jeff Garland,Kevlin Henney,Howard Hinnant,Ben Hutchings,Jan Kristofferson,Doug Lea,Paul McKenney,Nick McLaren,Clark Nelson,Bill Pugh,Raul Silvera,Herb Sutter, Detlef Vollmann,and Michael Wong,plus all those who commented on the papers,dis- cussed them at the committee meetings,and otherwise helped shaped the multithread- ing and concurrency support in C++11. Finally,I would like to thank the following people,whose suggestions have greatly improved this book:Dr.Jamie Allsop,Peter Dimov,Howard Hinnant,Rick Molloy, Jonathan Wakely,and Dr.Russel Winder,with special thanks to Russel for his detailed reviews and to Jonathan who,as technical proofreader,painstakingly checked all the content for outright errors in the final manuscript during production.(Any remaining xvii
xvii acknowledgments I will start by saying a big “Thank you” to my wife, Kim, for all the love and support she has given me while writing this book. It has occupied a significant part of my spare time for the last four years, and without her patience, support, and understanding, I couldn’t have managed it. Second, I would like to thank the team at Manning who have made this book possible: Marjan Bace, publisher; Michael Stephens, associate publisher; Cynthia Kane, my development editor; Karen Tegtmeyer, review editor; Linda Recktenwald, my copyeditor; Katie Tennant, my proofreader; and Mary Piergies, the production manager. Without their efforts you would not be reading this book right now. I would also like to thank the other members of the C++ Standards Committee who wrote committee papers on the multithreading facilities: Andrei Alexandrescu, Pete Becker, Bob Blainer, Hans Boehm, Beman Dawes, Lawrence Crowl, Peter Dimov, Jeff Garland, Kevlin Henney, Howard Hinnant, Ben Hutchings, Jan Kristofferson, Doug Lea, Paul McKenney, Nick McLaren, Clark Nelson, Bill Pugh, Raul Silvera, Herb Sutter, Detlef Vollmann, and Michael Wong, plus all those who commented on the papers, discussed them at the committee meetings, and otherwise helped shaped the multithreading and concurrency support in C++11. Finally, I would like to thank the following people, whose suggestions have greatly improved this book: Dr. Jamie Allsop, Peter Dimov, Howard Hinnant, Rick Molloy, Jonathan Wakely, and Dr. Russel Winder, with special thanks to Russel for his detailed reviews and to Jonathan who, as technical proofreader, painstakingly checked all the content for outright errors in the final manuscript during production. (Any remaining