Semaphores
One of the first general solutions to the critical section problem was proposed by Edsger Dijkstra in 1965: the semaphore.
Motivation — Producer–Consumer Problem
A classic problem in concurrency is the producer–consumer problem, also called the bounded-buffer problem. In this problem, there are two types of threads: producers and consumers. Producers produce items and put them into a shared (fixed-size) buffer, whereas consumers take items out of the buffer and consume them.
- Producers care about how many empty spaces there are in the buffer. If there aren't any empty spaces, they must wait until a consumer consumes an item and creates an empty spot.
- Consumers care about how many items are in the buffer. If there are no items, they must wait until a producer produces an item.
In both cases, there is an “I want an X; if there isn't one, I'll wait” pattern (where, in this case, X is either an empty space or an item). We want the system to track how many Xs are available, and if there aren't any, put the thread to sleep until there are.
A Semaphore Is a Natural-Number Counter
A semaphore is a variable that can be incremented and decremented, but it will never go below zero because a thread that tries to decrement it when it's zero will be put to sleep until it's nonzero.
It thus has two operations:
dec(also calledP,wait,down): If the semaphore is positive, decrement it by one and continue. If it's zero, put the thread to sleep until it's positive, and then decrement it.inc(also calledV,signal,uporpost): Increment the semaphore by one. If the semaphore was zero and there are threads waiting on it, wake one of them up.
Hay! What's with the names
PandV? Those aren't exactly intuitive.
They're from the Dutch words for proberen (test) and verhogen (increment). They were chosen because Dijkstra was Dutch, and back in 1965, all the cool academics wanted to use single-letter names for things.
I noticed that those are the names that OS/161 uses for its semaphore operations.
If this were CS 70, Bjarne Stroustrup would appear here and say “It's part of our heritage!”
I think Prof. Melissa wishes OS/161 had used more descriptive names, notwithstanding the heritage.
Actually, the names
sem_waitandsem_postare problematic too—sem_waitmight make you think that it will always wait, but really it only waits when the semaphore is zero. In fact, even if we usedsem_incandsem_dec, that wouldn't tell the whole story either, as those names don't mention waiting or waking up threads.
Example
As we'd expect, semaphores are a perfect fit for the producer–consumer problem. We can use two semaphores to solve it, as shown in this C program that runs on any modern Unix-like system:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>
#define BUFFER_SIZE 32
#define NUM_PRODUCERS 2
#define NUM_CONSUMERS 2
struct message {
char *content;
};
struct circular_buffer {
struct message buffer[BUFFER_SIZE];
int head;
int tail;
sem_t filled_slot;
sem_t empty_slot;
sem_t mutex;
};
struct thread_args {
struct circular_buffer *cb;
int id;
};
/* Function prototypes */
void message_init(struct message *msg, const char *content);
void message_destroy(struct message *msg);
void circular_buffer_init(struct circular_buffer *cb);
void circular_buffer_destroy(struct circular_buffer *cb);
void *producer(void *arg);
void *consumer(void *arg);
/* Main.
* - Creates the buffer
* - Starts the producer and consumer threads, and then,
* - Wits for them to finish.
*/
int
main(void)
{
struct circular_buffer cb;
pthread_t prod_threads[NUM_PRODUCERS];
pthread_t cons_threads[NUM_CONSUMERS];
struct thread_args prod_args[NUM_PRODUCERS];
struct thread_args cons_args[NUM_CONSUMERS];
int i;
circular_buffer_init(&cb);
for (i = 0; i < NUM_PRODUCERS; i++) {
prod_args[i].cb = &cb;
prod_args[i].id = i;
pthread_create(&prod_threads[i], NULL, producer, &prod_args[i]);
}
for (i = 0; i < NUM_CONSUMERS; i++) {
cons_args[i].cb = &cb;
cons_args[i].id = i;
pthread_create(&cons_threads[i], NULL, consumer, &cons_args[i]);
}
for (i = 0; i < NUM_PRODUCERS; i++)
pthread_join(prod_threads[i], NULL);
for (i = 0; i < NUM_CONSUMERS; i++)
pthread_join(cons_threads[i], NULL);
circular_buffer_destroy(&cb);
return 0;
}
/* The buffer stores `struct message` objects, which are just strings. */
void
message_init(struct message *msg, const char *content)
{
msg->content = strdup(content);
}
void
message_destroy(struct message *msg)
{
free(msg->content);
msg->content = NULL;
}
/* Initialization and destruction for the circular buffer. */
void
circular_buffer_init(struct circular_buffer *cb)
{
cb->head = 0;
cb->tail = 0;
// Usage: sem_init(sem_address, 0, initial_value);
// (the second argument is for advanced Unix features we're not using)
sem_init(&cb->filled_slot, 0, 0);
sem_init(&cb->empty_slot, 0, BUFFER_SIZE);
sem_init(&cb->mutex, 0, 1);
}
void
circular_buffer_destroy(struct circular_buffer *cb)
{
sem_destroy(&cb->filled_slot);
sem_destroy(&cb->empty_slot);
sem_destroy(&cb->mutex);
}
/* Producer and consumer functions. */
void *
producer(void *arg)
{
struct thread_args *args = (struct thread_args *)arg;
struct circular_buffer *cb = args->cb;
int id = args->id;
char message_content[50];
for (int i = 0; i < 10; i++) {
snprintf(message_content, sizeof(message_content),
"Message %d from Producer %d", i, id);
sem_wait(&cb->empty_slot);
sem_wait(&cb->mutex);
message_init(&cb->buffer[cb->tail], message_content);
cb->tail = (cb->tail + 1) % BUFFER_SIZE;
sem_post(&cb->mutex);
sem_post(&cb->filled_slot);
usleep(rand() % 1000000); /* Sleep up to 1 second */
}
/* Send termination message */
sem_wait(&cb->empty_slot);
sem_wait(&cb->mutex);
message_init(&cb->buffer[cb->tail], "Bye");
cb->tail = (cb->tail + 1) % BUFFER_SIZE;
sem_post(&cb->mutex);
sem_post(&cb->filled_slot);
return NULL;
}
void *
consumer(void *arg)
{
struct thread_args *args = (struct thread_args *)arg;
struct circular_buffer *cb = args->cb;
int id = args->id;
struct message msg;
while (1) {
sem_wait(&cb->filled_slot);
sem_wait(&cb->mutex);
msg = cb->buffer[cb->head];
cb->head = (cb->head + 1) % BUFFER_SIZE;
sem_post(&cb->mutex);
sem_post(&cb->empty_slot);
printf("Consumer %d: %s\n", id, msg.content);
if (strcmp(msg.content, "Bye") == 0) {
message_destroy(&msg);
break;
}
message_destroy(&msg);
usleep(rand() % 1000000); /* Sleep up to 1/10 second */
}
return NULL;
}
This code produces output like
Consumer 0: Message 0 from Producer 1
Consumer 1: Message 0 from Producer 0
Consumer 1: Message 1 from Producer 1
Consumer 0: Message 1 from Producer 0
Consumer 1: Message 2 from Producer 1
Consumer 1: Message 3 from Producer 1
Consumer 0: Message 4 from Producer 1
Consumer 1: Message 2 from Producer 0
Consumer 0: Message 5 from Producer 1
Consumer 1: Message 6 from Producer 1
Consumer 0: Message 3 from Producer 0
Consumer 1: Message 4 from Producer 0
Consumer 0: Message 7 from Producer 1
Consumer 0: Message 5 from Producer 0
Consumer 1: Message 8 from Producer 1
Consumer 0: Message 9 from Producer 1
Consumer 1: Message 6 from Producer 0
Consumer 0: Bye
Consumer 1: Message 7 from Producer 0
Consumer 1: Message 8 from Producer 0
Consumer 1: Message 9 from Producer 0
Consumer 1: Bye
You can try this code yourself!
What's mutex For?
The mutex semaphore is used to protect the data stored in struct circular_buffer. In particular, it protects the head and tail indexes. If two threads tried to modify the head and tail indexes at the same time, the code would likely become confused about where the next item should be placed or taken from.
Hay! I just realized where the name
mutexcomes from! It's short for “mutual exclusion,” which is exactly what it's providing here.
Exactly! The
mutexsemaphore is providing mutual exclusion to the critical sections of code that modify theheadandtailindexes. That's a perfect example of the critical section problem, and themutexsemaphore solves it nicely.
Beyond the Bounded-Buffer Problem
Semaphores are a general synchronization primitive that can be used to solve a wide variety of synchronization problems. They can be used to solve the critical section problem, the producer–consumer problem, and many other problems.
But code that uses semaphores can be hard to understand and debug. For example, would the code for the producer still have worked if we had written the following?
sem_wait(&cb->mutex);
sem_wait(&cb->filled_slot);
msg = cb->buffer[cb->head];
cb->head = (cb->head + 1) % BUFFER_SIZE;
sem_post(&cb->empty_slot);
sem_post(&cb->mutex);
In general, the interactions between semaphores can be intricate, and working out what is and isn't okay can be quite tricky.
If you look back at the code for the producer, notice that it decrements empty_slot but increments filled_slot. In the context of this particular problem, this approach may make sense, but the fact that we have to manage careful choreography between two semaphores and two functions even for this “simple” problem is a bit of a red flag.
Takeaways
Semaphores are a powerful synchronization primitive that can be used to solve a wide variety of synchronization problems. They can be used to solve the critical section problem, the producer–consumer problem, and many other problems. For people who love solving puzzles, writing a “clever” semaphore-based solution to an intricate synchronization problem can be fun.
In fact, there is a book, The Little Book of Semaphores, that presents a series of synchronization problems and challenges you to solve them using semaphores.
While there is an intellectual challenge in writing “clever” semaphore-based solutions, they can be hard to understand and debug. So you won't be surprised that while semaphores were one of the first general solutions to the critical section problem, they're not the most popular synchronization primitive in use today.
(When logged in, completion status appears here.)