31 Semaphores

As we know now, one needs both locks and condition variables to solve a broad range of relevant and interesting concurrency problems. One of the first people to realize this years ago was Edsger Dijkstra (though it is hard to know the exact history [GR92]), known among other things for his famous "shortest paths" algorithm in graph theory [D59], an early polemic on structured programming entitled "Goto Statements Considered Harmful" [D68a] (what a great title!), and, in the case we will study here, the introduction of a synchronization primitive called the semaphore [D68b, D72]. Indeed, Dijkstra and colleagues invented the semaphore as a single primitive for all things related to synchronization; as you will see, one can use semaphores as both locks and condition variables.

THE CRUX: HOW TO USE SEMAPHORES
How can we use semaphores instead of locks and condition variables? What is the definition of a semaphore? What is a binary semaphore? Is it straightforward to build a semaphore out of locks and condition variables? To build locks and condition variables out of semaphores?

31.1 Semaphores: A Definition

A semaphore is an object with an integer value that we can manipulate with two routines; in the POSIX standard, these routines are sem_wait() and sem_post()¹. Because the initial value of the semaphore determines its behavior, before calling any other routine to interact with the semaphore, we must first initialize it to some value, as the code in Figure 31.1 does.

¹ Historically, sem_wait() was called P() by Dijkstra and sem_post() called V(). These shortened forms come from Dutch words; interestingly, which Dutch words they supposedly derive from has changed over time. Originally, P() came from "passering" (to pass) and V() from "vrijgave" (release); later, Dijkstra wrote P() was from "prolaag", a contraction of "probeer" (Dutch for "try") and "verlaag" ("decrease"), and V() from "verhoog", which means "increase". Sometimes, people call them down and up. Use the Dutch versions to impress your friends, or confuse them, or both. See https://news.ycombinator.com/item?id=8761539 for details.

    #include <semaphore.h>

    sem_t s;
    sem_init(&s, 0, 1);

Figure 31.1: Initializing A Semaphore

In the figure, we declare a semaphore s and initialize it to the value 1 by passing 1 in as the third argument. The second argument to sem_init() will be set to 0 in all of the examples we'll see; this indicates that the semaphore is shared between threads in the same process. See the man page for details on other usages of semaphores (namely, how they can be used to synchronize access across different processes), which require a different value for that second argument.
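To make the cross-process case concrete, here is a minimal sketch (not from the chapter, and Linux-specific) of how a semaphore might be shared between a parent and a child process: the semaphore lives in a shared memory mapping, and the second argument to sem_init() is non-zero. The mmap() flags and the simple post/wait handshake are illustrative assumptions; error handling is omitted.

    #include <semaphore.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        // Place the semaphore in memory visible to both parent and child.
        sem_t *s = mmap(NULL, sizeof(sem_t), PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        sem_init(s, 1, 0);              // second arg non-zero: shared across processes

        if (fork() == 0) {              // child
            printf("child: done\n");
            sem_post(s);                // signal the parent
            return 0;
        }
        sem_wait(s);                    // parent sleeps until the child posts
        printf("parent: child finished\n");
        wait(NULL);
        return 0;
    }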
After a semaphore is initialized, we can call one of two functions to interact with it, sem_wait() or sem_post(). The behavior of these two functions is shown in Figure 31.2. For now, we are not concerned with the implementation of these routines, which clearly requires some care; with multiple threads calling into sem_wait() and sem_post(), there is the obvious need for managing these critical sections. We will now focus on how to use these primitives; later we may discuss how they are built.

We should discuss a few salient aspects of the interfaces here. First, we can see that sem_wait() will either return right away (because the value of the semaphore was one or higher when we called sem_wait()), or it will cause the caller to suspend execution waiting for a subsequent post. Of course, multiple calling threads may call into sem_wait(), and thus all be queued waiting to be woken.

Second, we can see that sem_post() does not wait for some particular condition to hold like sem_wait() does. Rather, it simply increments the value of the semaphore and then, if there is a thread waiting to be woken, wakes one of them up.

Third, the value of the semaphore, when negative, is equal to the number of waiting threads [D68b]. Though the value generally isn't seen by users of the semaphores, this invariant is worth knowing and perhaps can help you remember how a semaphore functions.

    int sem_wait(sem_t *s) {
        decrement the value of semaphore s by one
        wait if value of semaphore s is negative
    }

    int sem_post(sem_t *s) {
        increment the value of semaphore s by one
        if there are one or more threads waiting, wake one
    }

Figure 31.2: Semaphore: Definitions Of Wait And Post

Don't worry (yet) about the seeming race conditions possible within the semaphore; assume that the actions they make are performed atomically. We will soon use locks and condition variables to do just this.

31.2 Binary Semaphores (Locks)

We are now ready to use a semaphore. Our first use will be one with which we are already familiar: using a semaphore as a lock. See Figure 31.3 for a code snippet; therein, you'll see that we simply surround the critical section of interest with a sem_wait()/sem_post() pair. Critical to making this work, though, is the initial value of the semaphore m (initialized to X in the figure). What should X be? (Try thinking about it before going on.)

    sem_t m;
    sem_init(&m, 0, X); // init to X; what should X be?

    sem_wait(&m);
    // critical section here
    sem_post(&m);

Figure 31.3: A Binary Semaphore (That Is, A Lock)

Looking back at the definition of the sem_wait() and sem_post() routines above, we can see that the initial value should be 1. To make this clear, let's imagine a scenario with two threads. The first thread (Thread 0) calls sem_wait(); it will first decrement the value of the semaphore, changing it to 0. Then, it will wait only if the value is negative. Because the value is 0, sem_wait() will simply return and the calling thread will continue; Thread 0 is now free to enter the critical section. If no other thread tries to acquire the lock while Thread 0 is inside the critical section, when it calls sem_post(), it will simply restore the value of the semaphore to 1 (and not wake a waiting thread, because there are none). Figure 31.4 shows a trace of this scenario.

    Value of Semaphore    Thread 0               Thread 1
    1
    1                     call sem_wait()
    0                     sem_wait() returns
    0                     (crit sect)
    0                     call sem_post()
    1                     sem_post() returns

Figure 31.4: Thread Trace: Single Thread Using A Semaphore

A more interesting case arises when Thread 0 "holds the lock" (i.e., it has called sem_wait() but not yet called sem_post()), and another thread (Thread 1) tries to enter the critical section by calling sem_wait(). In this case, Thread 1 will decrement the value of the semaphore to -1 and thus wait, putting itself to sleep until Thread 0 releases the lock. The beginning of a trace of this scenario:

    Val    Thread 0                State    Thread 1              State
    1                              Run                            Ready
    1      call sem_wait()         Run                            Ready
    0      sem_wait() returns      Run                            Ready
    0      (crit sect begin)       Run                            Ready
    0      Interrupt; Switch→T1    Ready                          Run
    0                              Ready    call sem_wait()       Run
    -1                             Ready    decr sem              Run
    -1                             Ready    (sem < 0) → sleep     Sleep
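As a concrete illustration of the binary-semaphore-as-lock pattern, the following small program (a sketch written for this transcript, not taken from the chapter; the worker() function, thread count, and loop bound are arbitrary choices) uses a semaphore initialized to 1 to protect a shared counter updated by two threads.

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static sem_t m;                  // the binary semaphore (our "lock")
    static int counter = 0;

    static void *worker(void *arg) {
        for (int i = 0; i < 1000000; i++) {
            sem_wait(&m);            // enter critical section
            counter++;
            sem_post(&m);            // leave critical section
        }
        return NULL;
    }

    int main(void) {
        pthread_t p1, p2;
        sem_init(&m, 0, 1);          // X = 1: the lock starts out free
        pthread_create(&p1, NULL, worker, NULL);
        pthread_create(&p2, NULL, worker, NULL);
        pthread_join(p1, NULL);
        pthread_join(p2, NULL);
        printf("counter = %d (expected 2000000)\n", counter);
        return 0;
    }

Without the sem_wait()/sem_post() pair, the two threads would race on counter and the final value would usually fall short of 2000000.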
31.5 Reader-Writer Locks

    typedef struct _rwlock_t {
        sem_t lock;       // binary semaphore (basic lock)
        sem_t writelock;  // allows ONE writer or MANY readers
        int   readers;    // number of readers in the critical section
    } rwlock_t;

    void rwlock_init(rwlock_t *rw) {
        rw->readers = 0;
        sem_init(&rw->lock, 0, 1);
        sem_init(&rw->writelock, 0, 1);
    }

    void rwlock_acquire_readlock(rwlock_t *rw) {
        sem_wait(&rw->lock);
        rw->readers++;
        if (rw->readers == 1) // first reader gets writelock
            sem_wait(&rw->writelock);
        sem_post(&rw->lock);
    }

    void rwlock_release_readlock(rwlock_t *rw) {
        sem_wait(&rw->lock);
        rw->readers--;
        if (rw->readers == 0) // last reader lets it go
            sem_post(&rw->writelock);
        sem_post(&rw->lock);
    }

    void rwlock_acquire_writelock(rwlock_t *rw) {
        sem_wait(&rw->writelock);
    }

    void rwlock_release_writelock(rwlock_t *rw) {
        sem_post(&rw->writelock);
    }

Figure 31.13: A Simple Reader-Writer Lock

Finally, it should be noted that reader-writer locks should be used with some caution. They often add more overhead (especially with more sophisticated implementations), and thus do not end up speeding up performance as compared to just using simple and fast locking primitives [CB08]. Either way, they showcase once again how we can use semaphores in an interesting and useful way.

TIP: SIMPLE AND DUMB CAN BE BETTER (HILL'S LAW)
You should never underestimate the notion that the simple and dumb approach can be the best one. With locking, sometimes a simple spin lock works best, because it is easy to implement and fast. Although something like reader/writer locks sounds cool, they are complex, and complex can mean slow. Thus, always try the simple and dumb approach first. This idea, of appealing to simplicity, is found in many places. One early source is Mark Hill's dissertation [H87], which studied how to design caches for CPUs. Hill found that simple direct-mapped caches worked better than fancy set-associative designs (one reason is that in caching, simpler designs enable faster lookups). As Hill succinctly summarized his work: "Big and dumb is better." And thus we call this similar advice Hill's Law.
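To show how the reader-writer lock of Figure 31.13 might be exercised, here is a brief usage sketch (the reader()/writer() functions, the shared_value variable, and the thread counts are illustrative assumptions, not code from the chapter): several reader threads may hold the lock concurrently, while a writer holds it exclusively.

    #include <pthread.h>
    #include <stdio.h>
    // assumes the rwlock_t type and routines from Figure 31.13 are in scope

    static rwlock_t rw;
    static int shared_value = 0;

    static void *reader(void *arg) {
        rwlock_acquire_readlock(&rw);     // many readers may be here at once
        printf("read: %d\n", shared_value);
        rwlock_release_readlock(&rw);
        return NULL;
    }

    static void *writer(void *arg) {
        rwlock_acquire_writelock(&rw);    // exclusive access
        shared_value++;
        rwlock_release_writelock(&rw);
        return NULL;
    }

    int main(void) {
        pthread_t r[3], w;
        rwlock_init(&rw);
        for (int i = 0; i < 3; i++)
            pthread_create(&r[i], NULL, reader, NULL);
        pthread_create(&w, NULL, writer, NULL);
        for (int i = 0; i < 3; i++)
            pthread_join(r[i], NULL);
        pthread_join(w, NULL);
        return 0;
    }

Note that with this simple scheme a steady stream of readers can keep a writer waiting indefinitely; the homework at the end of the chapter revisits this starvation issue.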
31.6 The Dining Philosophers

One of the most famous concurrency problems posed, and solved, by Dijkstra is known as the dining philosopher's problem [D71]. The problem is famous because it is fun and somewhat intellectually interesting; however, its practical utility is low. However, its fame forces its inclusion here; indeed, you might be asked about it on some interview, and you'd really hate your OS professor if you miss that question and don't get the job. Conversely, if you get the job, please feel free to send your OS professor a nice note, or some stock options.

The basic setup for the problem is this (as shown in Figure 31.14): assume there are five "philosophers" sitting around a table. Between each pair of philosophers is a single fork (and thus, five total). The philosophers each have times where they think, and don't need any forks, and times where they eat. In order to eat, a philosopher needs two forks, both the one on their left and the one on their right. The contention for these forks, and the synchronization problems that ensue, are what makes this a problem we study in concurrent programming.

[Figure 31.14: The Dining Philosophers — philosophers P0 through P4 seated around a table, with forks f0 through f4 placed between them]

Here is the basic loop of each philosopher, assuming each has a unique thread identifier p from 0 to 4 (inclusive):

    while (1) {
        think();
        get_forks(p);
        eat();
        put_forks(p);
    }

The key challenge, then, is to write the routines get_forks() and put_forks() such that there is no deadlock, no philosopher starves and never gets to eat, and concurrency is high (i.e., as many philosophers can eat at the same time as possible).

Following Downey's solutions [D08], we'll use a few helper functions to get us towards a solution. They are:

    int left(int p)  { return p; }
    int right(int p) { return (p + 1) % 5; }

When philosopher p wishes to refer to the fork on their left, they simply call left(p). Similarly, the fork on the right of a philosopher p is referred to by calling right(p); the modulo operator therein handles the one case where the last philosopher (p=4) tries to grab the fork on their right, which is fork 0.

We'll also need some semaphores to solve this problem. Let us assume we have five, one for each fork: sem_t forks[5].

Broken Solution

We attempt our first solution to the problem. Assume we initialize each semaphore (in the forks array) to a value of 1. Assume also that each philosopher knows its own number (p). We can thus write the get_forks() and put_forks() routines (Figure 31.15).

    void get_forks(int p) {
        sem_wait(&forks[left(p)]);
        sem_wait(&forks[right(p)]);
    }

    void put_forks(int p) {
        sem_post(&forks[left(p)]);
        sem_post(&forks[right(p)]);
    }

Figure 31.15: The get_forks() And put_forks() Routines

The intuition behind this (broken) solution is as follows. To acquire the forks, we simply grab a "lock" on each one: first the one on the left, and then the one on the right. When we are done eating, we release them. Simple, no? Unfortunately, in this case, simple means broken. Can you see the problem that arises? Think about it.

The problem is deadlock. If each philosopher happens to grab the fork on their left before any philosopher can grab the fork on their right, each will be stuck holding one fork and waiting for another, forever. Specifically, philosopher 0 grabs fork 0, philosopher 1 grabs fork 1, philosopher 2 grabs fork 2, philosopher 3 grabs fork 3, and philosopher 4 grabs fork 4; all the forks are acquired, and all the philosophers are stuck waiting for a fork that another philosopher possesses. We'll study deadlock in more detail soon; for now, it is safe to say that this is not a working solution.

A Solution: Breaking The Dependency

The simplest way to attack this problem is to change how forks are acquired by at least one of the philosophers; indeed, this is how Dijkstra himself solved the problem. Specifically, let's assume that philosopher 4 (the highest numbered one) gets the forks in a different order than the others (Figure 31.16); the put_forks() code remains the same.

    void get_forks(int p) {
        if (p == 4) {
            sem_wait(&forks[right(p)]);
            sem_wait(&forks[left(p)]);
        } else {
            sem_wait(&forks[left(p)]);
            sem_wait(&forks[right(p)]);
        }
    }

Figure 31.16: Breaking The Dependency In get_forks()

Because the last philosopher tries to grab right before left, there is no situation where each philosopher grabs one fork and is stuck waiting for another; the cycle of waiting is broken. Think through the ramifications of this solution, and convince yourself that it works.
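To tie the pieces above together, here is a sketch of a complete program using this approach. The philosopher() thread function, the id array, the printf() standing in for eat(), and the bounded number of rounds (so the program terminates) are additions for illustration and are not from the chapter.

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    sem_t forks[5];                   // one semaphore per fork, each initialized to 1

    int left(int p)  { return p; }
    int right(int p) { return (p + 1) % 5; }

    void get_forks(int p) {
        if (p == 4) {                 // philosopher 4 acquires in the opposite order
            sem_wait(&forks[right(p)]);
            sem_wait(&forks[left(p)]);
        } else {
            sem_wait(&forks[left(p)]);
            sem_wait(&forks[right(p)]);
        }
    }

    void put_forks(int p) {
        sem_post(&forks[left(p)]);
        sem_post(&forks[right(p)]);
    }

    void *philosopher(void *arg) {
        int p = *(int *)arg;
        for (int i = 0; i < 3; i++) { // a few rounds instead of while(1)
            // think();
            get_forks(p);
            printf("philosopher %d eating\n", p);   // eat()
            put_forks(p);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t[5];
        int id[5];
        for (int i = 0; i < 5; i++)
            sem_init(&forks[i], 0, 1);
        for (int i = 0; i < 5; i++) {
            id[i] = i;
            pthread_create(&t[i], NULL, philosopher, &id[i]);
        }
        for (int i = 0; i < 5; i++)
            pthread_join(t[i], NULL);
        return 0;
    }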
There are other "famous" problems like this one, e.g., the cigarette smoker's problem or the sleeping barber problem. Most of them are just excuses to think about concurrency; some of them have fascinating names. Look them up if you are interested in learning more, or just getting more practice thinking in a concurrent manner [D08].

31.7 Thread Throttling

One other simple use case for semaphores arises on occasion, and thus we present it here. The specific problem is this: how can a programmer prevent "too many" threads from doing something at once and bogging the system down? Answer: decide upon a threshold for "too many", and then use a semaphore to limit the number of threads concurrently executing the piece of code in question. We call this approach throttling [T99], and consider it a form of admission control.

Let's consider a more specific example. Imagine that you create hundreds of threads to work on some problem in parallel. However, in a certain part of the code, each thread acquires a large amount of memory to perform part of the computation; let's call this part of the code the memory-intensive region. If all of the threads enter the memory-intensive region at the same time, the sum of all the memory allocation requests will exceed the amount of physical memory on the machine. As a result, the machine will start thrashing (i.e., swapping pages to and from the disk), and the entire computation will slow to a crawl.

A simple semaphore can solve this problem. By initializing the value of the semaphore to the maximum number of threads you wish to enter the memory-intensive region at once, and then putting a sem_wait() and sem_post() around the region, a semaphore can naturally throttle the number of threads that are ever concurrently in the dangerous region of the code.
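A minimal sketch of this idea follows; the threshold of 4, the names throttle and memory_intensive_work(), and the worker count are illustrative assumptions, not values from the chapter.

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdlib.h>

    #define MAX_IN_REGION 4          // the chosen threshold for "too many"

    static sem_t throttle;           // initialized to MAX_IN_REGION in main()

    static void memory_intensive_work(void) {
        void *buf = malloc(256 * 1024 * 1024);  // stand-in for a large allocation
        // ... use buf (check for NULL in real code) ...
        free(buf);
    }

    static void *worker(void *arg) {
        sem_wait(&throttle);         // at most MAX_IN_REGION threads pass this point at once
        memory_intensive_work();
        sem_post(&throttle);         // let another thread in
        return NULL;
    }

    int main(void) {
        enum { N = 100 };
        pthread_t t[N];
        sem_init(&throttle, 0, MAX_IN_REGION);
        for (int i = 0; i < N; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < N; i++)
            pthread_join(t[i], NULL);
        return 0;
    }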
31.8 How To Implement Semaphores

Finally, let's use our low-level synchronization primitives, locks and condition variables, to build our own version of semaphores called... (drum roll here)... Zemaphores. This task is fairly straightforward, as you can see in Figure 31.17.

    typedef struct __Zem_t {
        int value;
        pthread_cond_t cond;
        pthread_mutex_t lock;
    } Zem_t;

    // only one thread can call this
    void Zem_init(Zem_t *s, int value) {
        s->value = value;
        Cond_init(&s->cond);
        Mutex_init(&s->lock);
    }

    void Zem_wait(Zem_t *s) {
        Mutex_lock(&s->lock);
        while (s->value <= 0)
            Cond_wait(&s->cond, &s->lock);
        s->value--;
        Mutex_unlock(&s->lock);
    }

    void Zem_post(Zem_t *s) {
        Mutex_lock(&s->lock);
        s->value++;
        Cond_signal(&s->cond);
        Mutex_unlock(&s->lock);
    }

Figure 31.17: Implementing Zemaphores With Locks And CVs

In the code above, we use just one lock and one condition variable, plus a state variable to track the value of the semaphore. Study the code for yourself until you really understand it. Do it!

One subtle difference between our Zemaphore and pure semaphores as defined by Dijkstra is that we don't maintain the invariant that the value of the semaphore, when negative, reflects the number of waiting threads; indeed, the value will never be lower than zero. This behavior is easier to implement and matches the current Linux implementation.

Curiously, building condition variables out of semaphores is a much trickier proposition. Some highly experienced concurrent programmers tried to do this in the Windows environment, and many different bugs ensued [B04]. Try it yourself, and see if you can figure out why building condition variables out of semaphores is a more challenging problem than it might appear.
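As a quick exercise of the Zemaphore (this snippet is not from the chapter), the following uses a Zem_t initialized to 0 so that the main thread sleeps in Zem_wait() until a child thread calls Zem_post() — the same parent-waits-for-child pattern the fork/join homework below asks for. It assumes the Figure 31.17 code and its Mutex_/Cond_ wrapper helpers are in scope; the child() function and prints are illustrative.

    #include <pthread.h>
    #include <stdio.h>
    // assumes Zem_t, Zem_init(), Zem_wait(), Zem_post() from Figure 31.17 are in scope

    static Zem_t done;

    static void *child(void *arg) {
        printf("child\n");
        Zem_post(&done);          // signal: the child has finished
        return NULL;
    }

    int main(void) {
        pthread_t c;
        Zem_init(&done, 0);       // 0: the parent must wait until the child posts
        printf("parent: begin\n");
        pthread_create(&c, NULL, child, NULL);
        Zem_wait(&done);          // sleeps here if the child has not yet posted
        printf("parent: end\n");
        pthread_join(c, NULL);
        return 0;
    }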
31.9 Summary

Semaphores are a powerful and flexible primitive for writing concurrent programs. Some programmers use them exclusively, shunning locks and condition variables, due to their simplicity and utility.

In this chapter, we have presented just a few classic problems and solutions. If you are interested in finding out more, there are many other materials you can reference. One great (and free) reference is Allen Downey's book on concurrency and programming with semaphores [D08]. This book has lots of puzzles you can work on to improve your understanding of both semaphores in specific and concurrency in general. Becoming a real concurrency expert takes years of effort; going beyond what you learn in this class is undoubtedly the key to mastering such a topic.

TIP: BE CAREFUL WITH GENERALIZATION
The abstract technique of generalization can thus be quite useful in systems design, where one good idea can be made slightly broader and thus solve a larger class of problems. However, be careful when generalizing; as Lampson warns us, "Don't generalize; generalizations are generally wrong" [L83]. One could view semaphores as a generalization of locks and condition variables; however, is such a generalization needed? And, given the difficulty of realizing a condition variable on top of a semaphore, perhaps this generalization is not as general as you might think.

References

[B04] "Implementing Condition Variables with Semaphores" by Andrew Birrell. December 2004. An interesting read on how difficult implementing CVs on top of semaphores really is, and the mistakes the author and co-workers made along the way. Particularly relevant because the group had done a ton of concurrent programming; Birrell, for example, is known for (among other things) writing various thread-programming guides.

[CB08] "Real-world Concurrency" by Bryan Cantrill, Jeff Bonwick. ACM Queue, Volume 6, No. 5, September 2008. A nice article by some kernel hackers from a company formerly known as Sun on the real problems faced in concurrent code.

[CHP71] "Concurrent Control with Readers and Writers" by P.J. Courtois, F. Heymans, D.L. Parnas. Communications of the ACM, 14:10, October 1971. The introduction of the reader-writer problem, and a simple solution. Later work introduced more complex solutions, skipped here because, well, they are pretty complex.

[D59] "A Note on Two Problems in Connexion with Graphs" by E. W. Dijkstra. Numerische Mathematik 1, 269–271, 1959. Available: http://www-m3.ma.tum.de/twiki/pub/MN0506/WebHome/dijkstra.pdf. Can you believe people worked on algorithms in 1959? We can't. Even before computers were any fun to use, these people had a sense that they would transform the world...

[D68a] "Go-to Statement Considered Harmful" by E.W. Dijkstra. CACM, volume 11(3), March 1968. http://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF. Sometimes thought of as the beginning of the field of software engineering.

[D68b] "The Structure of the THE Multiprogramming System" by E.W. Dijkstra. CACM, volume 11(5), 1968. One of the earliest papers to point out that systems work in computer science is an engaging intellectual endeavor. Also argues strongly for modularity in the form of layered systems.

[D72] "Information Streams Sharing a Finite Buffer" by E.W. Dijkstra. Information Processing Letters 1, 1972. http://www.cs.utexas.edu/users/EWD/ewd03xx/EWD329.PDF. Did Dijkstra invent everything? No, but maybe close. He certainly was the first to clearly write down what the problems were in concurrent code. However, practitioners in OS design knew of many of the problems described by Dijkstra, so perhaps giving him too much credit would be a misrepresentation.

[D08] "The Little Book of Semaphores" by A.B. Downey. Available at: http://greenteapress.com/semaphores/. A nice (and free!) book about semaphores. Lots of fun problems to solve, if you like that sort of thing.

[D71] "Hierarchical Ordering of Sequential Processes" by E.W. Dijkstra. Available online: http://www.cs.utexas.edu/users/EWD/ewd03xx/EWD310.PDF. Presents numerous concurrency problems, including Dining Philosophers. The wikipedia page about this problem is also useful.

[GR92] "Transaction Processing: Concepts and Techniques" by Jim Gray, Andreas Reuter. Morgan Kaufmann, September 1992. The exact quote that we find particularly humorous is found on page 485, at the top of Section 8.8: "The first multiprocessors, circa 1960, had test and set instructions... presumably the OS implementors worked out the appropriate algorithms, although Dijkstra is generally credited with inventing semaphores many years later." Oh, snap!

[H87] "Aspects of Cache Memory and Instruction Buffer Performance" by Mark D. Hill. Ph.D. Dissertation, U.C. Berkeley, 1987. Hill's dissertation work, for those obsessed with caching in early systems. A great example of a quantitative dissertation.

[L83] "Hints for Computer Systems Design" by Butler Lampson. ACM Operating Systems Review, 15:5, October 1983. Lampson, a famous systems researcher, loved using hints in the design of computer systems. A hint is something that is often correct but can be wrong; in this use, a signal() is telling a waiting thread that it changed the condition that the waiter was waiting on, but not to trust that the condition will be in the desired state when the waiting thread wakes up. In this paper about hints for designing systems, one of Lampson's general hints is that you should use hints. It is not as confusing as it sounds.

[T99] "Re: NT kernel guy playing with Linux" by Linus Torvalds. June 27, 1999. Available: https://yarchive.net/comp/linux/semaphores.html. A response from Linus himself about the utility of semaphores, including the throttling case we mention in the text. As always, Linus is slightly insulting but quite informative.
Homework (Code)

In this homework, we'll use semaphores to solve some well-known concurrency problems. Many of these are taken from Downey's excellent "Little Book of Semaphores" (available at http://greenteapress.com/semaphores/), which does a good job of pulling together a number of classic problems as well as introducing a few new variants; interested readers should check out the Little Book for more fun. Each of the following questions provides a code skeleton; your job is to fill in the code to make it work given semaphores. On Linux, you will be using native semaphores; on a Mac (where there is no semaphore support), you'll have to first build an implementation (using locks and condition variables, as described in the chapter). Good luck!

Questions

1. The first problem is just to implement and test a solution to the fork/join problem, as described in the text. Even though this solution is described in the text, the act of typing it in on your own is worthwhile; even Bach would rewrite Vivaldi, allowing one soon-to-be master to learn from an existing one. See fork-join.c for details. Add the call sleep(1) to the child to ensure it is working.

2. Let's now generalize this a bit by investigating the rendezvous problem. The problem is as follows: you have two threads, each of which is about to enter the rendezvous point in the code. Neither should exit this part of the code before the other enters it. Consider using two semaphores for this task (a sketch of this pattern appears after this list), and see rendezvous.c for details.

3. Now go one step further by implementing a general solution to barrier synchronization. Assume there are two points in a sequential piece of code, called P1 and P2. Putting a barrier between P1 and P2 guarantees that all threads will execute P1 before any one thread executes P2. Your task: write the code to implement a barrier() function that can be used in this manner. It is safe to assume you know N (the total number of threads in the running program) and that all N threads will try to enter the barrier. Again, you should likely use two semaphores to achieve the solution, and some other integers to count things. See barrier.c for details.

4. Now let's solve the reader-writer problem, also as described in the text. In this first take, don't worry about starvation. See the code in reader-writer.c for details. Add sleep() calls to your code to demonstrate it works as you expect. Can you show the existence of the starvation problem?

5. Let's look at the reader-writer problem again, but this time, worry about starvation. How can you ensure that all readers and writers eventually make progress? See reader-writer-nostarve.c for details.

6. Use semaphores to build a no-starve mutex, in which any thread that tries to acquire the mutex will eventually obtain it. See the code in mutex-nostarve.c for more information.

7. Liked these problems? See Downey's free text for more just like them. And don't forget, have fun! But, you always do when you write code, no?
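For question 2, one common two-semaphore pattern looks like the following sketch; it is one possible answer written for this transcript, not the skeleton from rendezvous.c, and the function names and prints around the rendezvous are illustrative assumptions. Each thread posts its own "I have arrived" semaphore and then waits on the other's; posting before waiting is what keeps the two threads from deadlocking.

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static sem_t a_arrived, b_arrived;   // both initialized to 0

    static void *thread_a(void *arg) {
        printf("A: before rendezvous\n");
        sem_post(&a_arrived);            // tell B that A has arrived
        sem_wait(&b_arrived);            // wait until B has also arrived
        printf("A: after rendezvous\n");
        return NULL;
    }

    static void *thread_b(void *arg) {
        printf("B: before rendezvous\n");
        sem_post(&b_arrived);            // tell A that B has arrived
        sem_wait(&a_arrived);            // wait until A has also arrived
        printf("B: after rendezvous\n");
        return NULL;
    }

    int main(void) {
        pthread_t ta, tb;
        sem_init(&a_arrived, 0, 0);
        sem_init(&b_arrived, 0, 0);
        pthread_create(&ta, NULL, thread_a, NULL);
        pthread_create(&tb, NULL, thread_b, NULL);
        pthread_join(ta, NULL);
        pthread_join(tb, NULL);
        return 0;
    }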