Multi-Level Feedback Queue Scheduling
Let's review what we know so far:
- Shortest-Job First (SJF) scheduling selects the process with the smallest execution time to execute next. It's good at minimizing response time and turnaround time, but it can lead to starvation, and it expects us to predict the future.
- Basic Round-Robin (RR) scheduling gives each process a fixed time slice to run before moving on to the next process. It's fair and simple, but can lead to long response times for short jobs, introduces context-switch overhead, and can cause compute-bound jobs to unduly delay I/O-bound jobs.
- Putting jobs that have become ready after waiting for I/O at the front of the ready queue can help I/O-bound jobs get more CPU time, but doing so can lead to starvation for compute-bound jobs. If a process contrives to do brief I/O right before it would have given up its quantum, it could get more than its fair share of CPU time.
So, let's see if we can do better…
I think we need a way to say some processes are MORE important than others!
I want whatever we come up with to be efficient.
I think we need some better way to figure out which jobs are going to be quick and which are going to be slow.
But you can't predict the future!
No, but I can make a guess based on what I've seen so far.
Exactly!
Introducing the Multi-Level Feedback Queue Scheduler
In this design, our scheduler has multiple queues, each with a different priority level. The highest-priority queue is for the shortest jobs; the lowest-priority queue is for the longest jobs. The scheduler always runs the highest-priority job that is ready to run.
Okay, but how do we know which queue to put a job in?
Time Allotments
We limit the amount of time a job can spend in a queue before it is moved to the (end of the) next lower-priority queue. That time block is the queue's time allotment. When a job arrives in a queue for the first time, it's given that queue's time allotment. If the job doesn't finish in that time, it's moved to the next lower-priority queue.
So jobs start out in the highest-priority queue. If they don't finish in the time allotment for that queue, they progressively move to lower-priority queues. Thus short jobs get to run quickly, and long jobs eventually get their turn.
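The demotion bookkeeping can be sketched in a few lines of Python. This is a minimal illustration, not the simulator used on this page; the queue count and allotment values are made-up examples.

```python
from collections import deque
from dataclasses import dataclass

NUM_QUEUES = 3
ALLOTMENT = [6, 12, 24]   # hypothetical per-queue allotments, in time units

@dataclass
class Job:
    name: str
    level: int = 0                        # 0 = highest-priority queue
    allotment_left: int = ALLOTMENT[0]    # new jobs get the top queue's allotment

def charge_tick(job, queues):
    """Charge one running time unit to the job; demote it when its allotment is spent."""
    job.allotment_left -= 1
    if job.allotment_left <= 0 and job.level < NUM_QUEUES - 1:
        job.level += 1                            # drop one priority level...
        job.allotment_left = ALLOTMENT[job.level]
        queues[job.level].append(job)             # ...to the END of that queue
```

Note that demotion resets the clock: the job receives the next queue's full allotment when it arrives there.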
So you're just assuming that a new job is going to be short?
That's right. But that assumption isn't very costly. If we're wrong, the job will soon move to a lower-priority queue. And it means all jobs have a good response time.
Is this the same as the quantum in round-robin?
No. We still have a quantum and move our process to the end of the queue it's in when its quantum is up. The allotment is usually multiple quantums long.
And we can have different quantums for different queues?
Good idea!
Per-Queue Quantum
We saw when we were experimenting with round-robin that a short quantum increases context-switch overhead, but a long quantum can lead to long response times for short jobs. In a multi-level feedback queue, we can have a different quantum for each queue. The highest-priority queue can have a short quantum, and the lower-priority queues can have longer quantums.
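Dispatching in this scheme is just a top-down scan of the queues. Here is a small sketch, again with made-up quantum values:

```python
from collections import deque

# Hypothetical settings: a short quantum up top, longer quantums below.
QUANTUM = [2, 4, 8]   # time slice per queue, in time units

def pick_next(queues):
    """Take the front job of the highest-priority non-empty queue."""
    for level, q in enumerate(queues):
        if q:
            return level, q.popleft()   # run this job for up to QUANTUM[level]
    return None, None                   # nothing is ready to run
```

The chosen job then runs for `QUANTUM[level]` time units, capped by whatever remains of its allotment in that queue.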
So long jobs don't have as much context-switch overhead. I like it.
It's all getting a bit complicated. I think I need to see an example.
I think with all the things this scheduler can do, we need MORE processes and MORE execution time to see how it works.
Our Work
For this demonstration, we've increased our processes to seven, and we've increased the amount of work each process needs to do, for a total of 150 units of work across all the processes. Of particular note:
- Drew regularly does I/O, with each I/O request taking 3 time units. In fact, I/O dominates Drew's execution, so we would call Drew I/O-bound.
- Eve is also doing I/O, but does two units of compute and then one of I/O. Eve is actually trying to exploit any wake-at-front strategy (assuming a quantum of 3).
- Finn and Gray arrive late and are relatively short jobs compared to the others, so we care about their response time and turnaround time.
| Process | Arrival Time | Required Time |
|---|---|---|
You can experiment with different scheduling algorithms and parameters to see how they affect the schedule. If you change the seed, the precise details of the scenario will change, but the broad strokes will remain the same.
Schedule
In the schedule below, time spent waiting for I/O to complete is shown in dark teal, and time spent ready-to-run but not running is shown in light blue. When running the MLFQ scheduler, the color of a running process indicates which queue it is in: green for the highest-priority queue, olive green for the next, and dirty gold for the lowest-priority queue. You may need to scroll sideways to see the whole chart.
Chronology
Although the chart above shows the processes running, it isn't always easy to see exactly why a given process was scheduled next. This scrollable table shows another view of what was going on.
| Time | Running | Ready | Blocked |
|---|---|---|---|
Statistics
Here are all the statistics for the processes in this run:
| Process | Response Time | Turnaround Time | Wait Time | Context Switches |
|---|---|---|---|---|
Your Thoughts
This is pretty cool. So many settings to play with!
You're just sliding all the sliders around without trying to understand what's going on, aren't you?
I'm learning by doing! I'm doing something!
Let's make sure we can answer some “why” questions, too.
I think I get it. Eve was partway through a burst of compute when their allotment ran out, so they only got to run for one time unit and then got demoted to the next queue.
And when Eve started back up again on queue 2, they were still partway through their burst, so they only got to run for one time unit again.
So if you're partway through a burst when your allotment runs out, you don't get to run for a full quantum before you're demoted?
That's right.
I guess that's fair. Can't have I/O bound processes hogging the CPU. But actually, I'm thinking, given how well we treated Finn and Gray, if new processes keep arriving, we'll prefer them, and so our long-running processes could STARVE! That's no good.
Boosting Everyone to Avoid Starvation
The MLFQ algorithm has one more trick up its sleeve to avoid starvation. It can boost all processes in all queues to the highest-priority queue at regular intervals. This way, even if a process has been demoted to a lower-priority queue, it will eventually get a chance to run in the highest-priority queue. If it keeps doing compute, it'll soon drop back down, but it won't starve.
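The boost itself is a simple operation: walk the lower queues and move everything back to the top with a fresh allotment. A minimal sketch, with an illustrative allotment value:

```python
from collections import deque

TOP_ALLOTMENT = 4   # hypothetical allotment for the topmost queue

class Job:
    def __init__(self, name, level):
        self.name, self.level, self.allotment_left = name, level, 0

def boost_all(queues):
    """Move every job back to the topmost queue with a fresh allotment."""
    for level in range(1, len(queues)):
        while queues[level]:
            job = queues[level].popleft()
            job.level = 0
            job.allotment_left = TOP_ALLOTMENT
            queues[0].append(job)
```

A scheduler would call `boost_all` once per boost interval; between boosts, demotion proceeds as before.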
Click the button below to add a control that allows you to adjust the boost interval for the MLFQ scheduling algorithm. (You'll need to scroll the simulation to the far right to see the effect of the boost.)
- Scroll back up to the chart where you'll see the enabled control.
Hey! I notice that if we set the boost too early, like at 90, we're not getting much benefit from having multiple queues. We're not giving the long-running processes a chance to drop down to the lower queues.
Agreed. The boost needs to happen after the long-running processes have had a chance to drop down and linger in the lower queues for a bit.
With all these rules, can you give us a reminder of them all, all in one place?
Sure.
Summary of MLFQ Rules
If we say Priority(A) means which queue A is in, with 0 being the highest-priority queue, then we can summarize the rules of the MLFQ scheduler as follows:
- Rule 1: If Priority(A) > Priority(B), A runs (B doesn't).
- Rule 2: If Priority(A) = Priority(B), A and B run in round-robin fashion using the time slice (quantum length) of the given queue.
- Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
- Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced (i.e., it moves down one queue).
- If the remaining allotment is less than the quantum, it only runs for the remaining allotment, not a full quantum.
- Rule 5: After some time period, boost-time, move all the jobs in the system to the topmost queue.
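The five rules can be put together into a toy scheduler. This is a sketch of the idea, not the simulator on this page; the class names, quantum, allotment, and boost values below are all illustrative choices of ours.

```python
from collections import deque

class Job:
    """Minimal job record: a name and remaining work (in time units)."""
    def __init__(self, name, work):
        self.name, self.work_left = name, work
        self.level, self.allot = 0, 0

class MLFQ:
    """Toy MLFQ implementing Rules 1-5; all parameters are illustrative."""
    def __init__(self, quanta=(2, 4, 8), allotments=(4, 8, 16), boost_every=50):
        self.quanta, self.allotments = quanta, allotments
        self.boost_every, self.next_boost, self.time = boost_every, boost_every, 0
        self.queues = [deque() for _ in quanta]

    def admit(self, job):                         # Rule 3: new jobs start on top
        job.level, job.allot = 0, self.allotments[0]
        self.queues[0].append(job)

    def run_slice(self):
        """Run the front job of the highest non-empty queue for one time slice."""
        if self.time >= self.next_boost:          # Rule 5: periodic boost
            for level in range(1, len(self.queues)):
                while self.queues[level]:
                    self.admit(self.queues[level].popleft())
            self.next_boost += self.boost_every
        for level, q in enumerate(self.queues):   # Rule 1: higher queue wins
            if q:                                 # Rule 2: round-robin within it
                job = q.popleft()
                # One quantum, capped by remaining allotment and remaining work.
                sl = min(self.quanta[level], job.allot, job.work_left)
                job.work_left -= sl
                job.allot -= sl
                self.time += sl
                if job.work_left == 0:
                    return job                    # finished: never requeued
                if job.allot == 0 and job.level + 1 < len(self.queues):
                    job.level += 1                # Rule 4: demote on spent allotment
                    job.allot = self.allotments[job.level]
                self.queues[job.level].append(job)
                return job
        self.time += 1                            # nothing ready: idle tick
        return None
```

With a short job A (3 units) and a long job B (10 units) admitted together, A finishes within a few top-queue slices while B burns through its allotment and drifts down a level, matching the behavior we saw in the chart.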
Complexity
So, what's the complexity of this scheduler?
Let's ask you instead: What do you think the complexity is?
It uses a priority queue, so it's \(\mathrm{O}(\log n)\), right?
MLFQ does not use a priority queue. It uses multiple queues corresponding to a small set of priorities.
I think it's \(\mathrm{O}(1)\). We're just looking at the front of a few queues, and moving processes between queues is also \(\mathrm{O}(1)\).
Hey! What about boosting? That's \(\mathrm{O}(j)\), where \(j\) is the number of processes.
But we only boost every so often, so we can amortize the cost of boosting over many steps, making it effectively \(\mathrm{O}(1)\).
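As a quick worked version of that amortization argument (using our scenario's seven processes as an example):

```latex
\text{amortized boost cost per time unit}
  = \frac{\mathrm{O}(j)}{T}
  = \mathrm{O}\!\left(\frac{j}{T}\right),
\qquad \text{e.g., } j = 7,\ T = 100
  \;\Rightarrow\; \text{far less than one unit of boost work per step.}
```

As long as the boost interval \(T\) is large relative to \(j\), the per-step cost stays effectively constant.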
This is pretty cool. We've got a scheduler that's fair, gives short jobs a chance to run quickly, and doesn't let long jobs starve. And it's efficient, too! A modern marvel!
Actually, MLFQ was invented in the 1960s by Fernando Corbató for an operating system called CTSS. His work on CTSS and later on Multics earned him the Turing Award.
Meh. I knew it. More old hat.
Variations continue to be used today, including in various versions of Unix (e.g., Solaris and BSD Unix) and in Windows. They add many more features, but the basic idea is the same.
MORE features and settings? Awesome!
Finding the Best Settings
I noticed that the MLFQ scheduler also has the wake-at-front option. You haven't asked about that yet.
True.
I think maybe we have too many options now. It's hard to know how they all interact.
Also true.
On the one hand, having various settings we can adjust allows a good deal of control. However, to actually know whether particular settings are good or not requires work.
We need to have
- An operating system that actually tracks the performance of its scheduler. Statistics like the ones we've calculated here are essential.
- Representative workloads to test the scheduler against. We need to know how our scheduler will perform in the real world.
- A good understanding of what we want from our scheduler. Do we want to minimize response time? Turnaround time? Wait time? Do we want to be fair? Do we want to avoid starvation? Sometimes these goals are in tension with each other.
I hate to bring this up, but what about multiple cores?
Meh. We're feeling that 1960s vibe, uniprocessors forever and all that.
Actually, I do think we need to touch on multicore scheduling. We'll do that next.