Synchronization
Suppose we decided that we wanted to track the total number of system calls that OS/161 had serviced. It seems like it would be straightforward—just add a global variable,
size_t total_syscalls = 0;
and inside the system call handler, increment it
++total_syscalls;
Simple, right? Well, maybe not. It may feel like this update is one operation, but it's actually three:
- Read the value of
total_syscalls - Increment it
- Write it back
If we were to disassemble the MIPS code for ++total_syscalls, we would see something like
lw $2, total_syscalls
nop
addiu $2, $2, 1
sw $2, total_syscalls
This assembly code really makes it clear that our one line of code is actually three instructions.
Scenarios
You probably saw similar examples in 105. In general, this problem is known as the critical section problem. A critical section is a section of code that accesses shared data. The critical section problem is how to protect shared data from being accessed by multiple entities at the same time.
To see what goes wrong in this case, let's imagine that we have two entities trying to access the total_syscalls variable. We'll call them A and B. We'll also assume that the ++total_syscalls operation is broken down into three steps: read, increment, and write. We'll also assume that the total_syscalls variable is initially 0.
Interleaving
One possibility is interleaving somehow…
| A | B | total_syscalls |
|---|---|---|
Reads 0 into $2 |
0 |
|
Reads 0 into $2 |
0 |
|
Increments $2 to 1 |
0 |
|
| Writes $2 back | 1 |
|
Increments $2 to 1 |
1 |
|
| Writes $2 back | 1 |
This case could happen if a timer interrupt while A was trying to increment total_syscalls. The timer interrupt could cause the CPU to switch to another process, which could read the value of total_syscalls, increment it, and write it back. When A resumes, it would write its updated value (0 + 1 = 1), so the value of total_syscalls would be 1 instead of 2.
Simultaneous Execution (Multiple Cores)
Another approach is literal simultaneous execution…
| A | B | total_syscalls |
|---|---|---|
Reads 0 into $2 |
Reads 0 into $2 |
0 |
Increments $2 to 1 |
Increments $2 to 1 |
0 |
| Writes $2 back | Writes $2 back | 1 |
This example could happen if there were multiple cores, and A and B were running on different cores. Both cores could read the value of total_syscalls, increment it, and write it back. The value of total_syscalls would be 1 instead of 2.
Simple Strategies: Avoid Switching Threads
Inside an OS kernel, to avoid the interleaving scenario, we could disable interrupts while we're updating total_syscalls, which would prevent the timer interrupt from occurring while we're updating total_syscalls.
In OS/161, the code would look something like
int spl = splhigh();
++total_syscalls;
splx(spl);
Simple Strategies: Locking Out Other Cores
To prevent other cores from accessing the total_syscalls variable at the same time, we could use a spinlock. A spinlock is a simple lock that spins in a loop until it can acquire the lock. In OS/161, the code would look something like the following, assuming that we have a spinlock called total_syscalls_lock:
spinlock_acquire(&total_syscalls_lock);
++total_syscalls;
spinlock_release(&total_syscalls_lock);
Whoa! Why do we need to disable interrupts or use a spinlock? Can't we just use a mutex?
A mutex is a more complex lock that can put a thread to sleep if it can't acquire the lock. We can (and should) use mutexes inside the kernel, but some of the time (including in the implementation of mutexes themselves!), we need to use simpler locks that don't involve sleeping.
But didn't we just spend a page complaining about busy waiting? Isn't a spinlock just busy waiting and wasting CPU cycles?
It might seem that way, but in the kernel, we're often dealing with very short critical sections. In these cases, the overhead of putting a thread to sleep and waking it up again is more than the overhead of spinning in a loop. If we're doing something that might take a long time, we probably should use a mutex instead of a spinlock if we can.
So which should we use, the spinlock or disable interrupts? I'm guessing the latter, as the former could cause problems if we're running on multiple cores.
Well, it's not that simple, alas.
Multiple Cores Need Both Methods at Once
Let's see if you can work out what will could go wrong if we have multiple cores and we only use a (simple) spinlock.
The Problem with Spinlocks and Multiple Cores
The problem with (simple) spinlocks in a multicore environment is that they don't prevent timer interrupts, and thus don't prevent a new process from being scheduled on the same core. If we switch to a new process while we're holding a spinlock, everyone will have to wait for the original process to wake back up and start running before it'll be released. In the best case, we'll waste a bunch of CPU cycles spinning in a loop. In the worst case, we'll get ourselves stuck somehow and the original process will never get to wake up and release the lock.
It might seem like the solution for multicore kernel looks more like
int spl = splhigh(); // Avoid this core switching to another process
spinlock_acquire(&total_syscalls_lock); // Lock out other cores
++total_syscalls;
spinlock_release(&total_syscalls_lock);
splx(spl); // Let process switching resume on this core
but in OS/161, the spinlock code actually also disables interrupts, so we don't need to do both. The spinlock_acquire function disables interrupts, and the spinlock_release function enables them again. This is a common pattern in OS kernels.
Ugh. I can see why people don't like writing kernel code. It's so complicated!
Well, as we said, in OS/161 you can just use the spinlock functions and they'll take care of disabling and enabling interrupts for you. But it's important to realize that there's a bit more going on behind the scenes, and that acquining any spinlock in OS/161 also disables interrupts.
Okay… I guess…
And much of the time we'll use mutexes.
That feels a bit more familiar. I sort-of remember
pthread_mutex_lockandpthread_mutex_unlockfrom CS 105.
Yes, that's right. Mutexes are a more general-purpose lock that can put a thread to sleep if it can't acquire the lock.
We just need to write them.
!!! And we'll learn how to do that in the next lesson?
That's right!
…
(When logged in, completion status appears here.)