CS 105

Magnetic Disks

  • Goat speaking

    OMG, we're back in the past again learning about spinning disks. Everyone uses SSDs now!

  • PinkRobot speaking

    Actually, that's not true. While SSDs are faster and more reliable, spinning disks are still used in many applications because they are cheaper and have higher capacity. They are also used in some cloud storage systems because they are more cost-effective for storing large amounts of data.

  • BlueRobot speaking

    But in some ways, yes, disks do feel a bit archaic because they're not “solid state”—they actually have moving parts, which is kinda steam-punk in its way.

History and Basics

The RAM chips in a computer are volatile, meaning that they lose their contents when the power is turned off. That volatility is fine for temporary storage, but for long-term storage we need a non-volatile medium where data is retained even when the power is off.

For secondary storage, we desire these properties:

  • Persistence: Data must be retained when the power is off.
  • Reusability: Data must be able to be written, read, and rewritten many times.
  • Random Access: Data must be able to be read and written in any order.
  • Durability: Data must be able to be stored for long periods of time.
  • Reliability: Data must be able to be read and written without errors.
  • Capacity: Secondary storage is typically much larger than primary storage.
  • Economy: Secondary storage is typically cheaper than primary storage.

Fundamentally, we need a way to store each bit of data in a way that can be read and written. Humans have traditionally used materials like marking clay tablets; writing on papyrus, parchment, and later paper with ink; or carving or etching stone or metal. But in the world of electronics, a medium and method based on electromagnetism seems like an obvious choice. Magnetic materials can store a bit using the polarity of the magnetic field.

A Deeper Dive into Magnetic Storage (Optional)

Well before the advent of electronic computers, people had been using magnetic media for audio recording. The idea was first proposed in 1878, with working audio recorders based on steel wire appearing in 1898.

In the 1930s, AEG engineers developed the Magnetophon, the first practical tape recorder using plastic-based magnetic tape (itself based on Fritz Pfleumer's 1928 invention of a paper-based magnetic tape). During World War II, Germany used magnetic-tape recorders extensively, but everyone else continued to depend on wire recorders. After the war, magnetic-tape technology was widely adopted, although wire-recording devices were still used for some purposes (e.g., cockpit voice recorders) for some time.

The first magnetic storage devices in computers used wire or tape similar to that used for audio recordings. In 1950, the U.S. National Bureau of Standards' SEAC (Standards Eastern Automatic Computer) used magnetic-wire storage that could hold about 200,000 bits of data by running a thin wire between spools in a cartridge. UNIVAC I in 1951 used a metal tape, but IBM introduced the 1/2" reel-to-reel plastic tape that became the standard format for many years. Nine-track 1/2" tape was still in use in the early 1990s by the USPS for sending address corrections to companies.

But wire and tape have a significant limitation: they are sequential-access devices. To read a particular piece of data, you have to read through all the data that comes before it on the tape (and rewind the tape to get back to anything that had appeared earlier).

This limitation led to the development of magnetic-drum storage in the late 1940s. Picture a metal cylinder coated with magnetic material constantly spinning at high speed. Multiple read/write heads positioned along the length of the drum could access different “tracks” of data as the drum rotated beneath them. While any given track was still accessed sequentially, the continuous rotation meant you only had to wait for the data to spin around to your read/write head—typically taking a fraction of a second. Many early computers like the IBM 650 used drum memory as their main working storage (not as secondary storage!).

The next logical step was to flatten the drum into a disk. In 1956, IBM introduced the RAMAC (Random Access Method of Accounting and Control) with the IBM 350 disk-storage unit. Picture a stack of 50 metal platters, each 24 inches in diameter, spinning at 1200 RPM. A mechanical arm could move in and out between the platters, positioning read/write heads over any track on any platter. This mechanism provided true random access—the ability to directly seek to any location on any platter without having to scan through other data first.

  • Duck speaking

    What about floppy disks? They're what we use as the save icon in most programs, right?

  • L-Floppy speaking

    Totes!

  • PinkRobot speaking

    Let's talk about the difference.

Floppy Disks vs. Hard Disks

Early disk drives like the IBM 350 were huge and expensive, with high power and cooling requirements. They were meant to be used with large mainframe computers, whose installations could provide that kind of space, power, and cooling. But as computers became smaller and more affordable, there was a need for smaller, cheaper storage devices. Rather than use a rigid platter made of metal or glass, engineers turned to a disk-shaped piece of flexible plastic like that used for magnetic tape, enclosed in a protective sleeve. The result was the floppy disk, a removable magnetic-storage medium that could be inserted into a drive, read from, and written to.

  • Dog speaking

    Wow, I never realized the “hard” in “hard disk” referred to the platters being rigid. I thought it was like “hard” in hardware or something.

  • Hedgehog speaking

    I think I need a picture to understand all this. Can you show me a diagram of what a disk looks like?

  • PinkRobot speaking

    We'll go one better—we'll have a full simulation! For simplicity, we'll begin with a single platter and a single read/write head. We'll add more platters and heads later.

  • L-Floppy speaking

    A single “platter” is all I have!

  • PinkRobot speaking

    We'll begin with a disk with 20 tracks and 18 sectors per track, as that's small enough to fit on the screen but big enough to be interesting. This size is actually similar to a 360 KB floppy disk, which had 40 tracks and 9 sectors per track.

Disk Simulation

[Interactive disk simulator: displays the desired and current position (LBA, track, platter, sector), the number of sectors on the disk, and a table of completed operations showing seek, rotation, transfer, and total times.]

Explore!

First, try the following:

  • Click Start to set the disk spinning.
  • Click on a sector on the disk to schedule a read operation:
    • The sector you choose will turn brown, and the track it's on will be highlighted in green.
    • The head will move to the selected track and read the sector.
    • In the table below, you'll see the time taken for the seek, rotation, and transfer.

You can click on more sectors to schedule more reads, but they won’t be highlighted until it's their turn. (You can also set up reads before you start the disk spinning.)

You can adjust the view by clicking on the image and dragging vertically. Dragging horizontally rotates the disk.

Once you've clicked on a few sectors, you can press the Start button to begin reading them. The disk will spin, the head will move, and the sectors will be read in order. You can press the Turbo button to speed up the simulation, but it's best initially to watch it at normal speed.

Clicking the Reset button will stop the disk and reset the simulation, including clearing the access history. Try different positions for the sectors you want to read. What results in the fastest speed? What's slowest?

You can also adjust the number of tracks, sectors per track, and platters to see how that affects disk-access times.
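
The times the simulator reports follow a simple model: seek time grows with the distance the head moves, rotational delay depends on how far the disk must spin before the sector arrives, and transfer time is the time for one sector to pass under the head. Here's a minimal sketch of that model in Python; the constants (RPM, per-track seek time) are illustrative values, not the simulator's actual internals.

```python
# A simplified model of disk-access time: linear seek, constant spin.
# The default RPM and per-track seek time are illustrative only.

def access_time_ms(track_distance, sector_wait, sectors_per_track,
                   rpm=7200, seek_ms_per_track=0.5):
    """Return (seek, rotation, transfer, total) in milliseconds."""
    ms_per_rev = 60_000 / rpm                  # time for one revolution
    seek = track_distance * seek_ms_per_track  # head movement
    rotation = sector_wait / sectors_per_track * ms_per_rev
    transfer = ms_per_rev / sectors_per_track  # one sector passes by
    return seek, rotation, transfer, seek + rotation + transfer

# Example: move 1 track, then wait 9 of 18 sectors (half a revolution).
seek, rot, xfer, total = access_time_ms(1, 9, 18)
```

Even this toy model shows the pattern you'll see in the simulator: rotation usually dominates, and transfer is the same for every sector on the track.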

  • Horse speaking

    Hay! When I increase the number of platters beyond one, it stops calling them “tracks” and starts calling them “cylinders”. What's up with that?

  • PinkRobot speaking

    If you look at the disk from the side, you can see that the read/write heads all move together. So when you have multiple platters, the heads can access the same track on each platter at the same time. This vertical stack of tracks is called a “cylinder”. So when you have multiple platters, the heads move to a “cylinder” rather than a “track”.
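
One nice consequence of this geometry is that capacity is just a product: every (cylinder, head, sector) triple names exactly one sector. A quick sketch, using the simulator's default single-platter disk and the classic double-sided 360 KB floppy as examples:

```python
# Total capacity from CHS geometry: cylinders x heads x sectors x
# bytes per sector. 512-byte sectors are the traditional size.

def disk_capacity_bytes(cylinders, heads, sectors_per_track,
                        bytes_per_sector=512):
    return cylinders * heads * sectors_per_track * bytes_per_sector

# The simulator's default single-platter disk: 20 tracks, 18 sectors.
small = disk_capacity_bytes(cylinders=20, heads=1, sectors_per_track=18)
# A 360 KB floppy: 40 tracks, 2 sides, 9 sectors of 512 bytes each.
floppy = disk_capacity_bytes(cylinders=40, heads=2, sectors_per_track=9)
```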

What did you observe about disk-access times? What factors seemed to affect the speed of reading data from the disk? Did you notice anything interesting?

  • Dog speaking

    Did you see how the head moves? It actually speeds up and slows down as it moves across the disk. I guess that's inertia for you!

  • Goat speaking

    OMG! It takes ages for the disk to spin around to the right spot.

  • Cat speaking

    I'd never really thought about any of that until I played with the simulator. Where the data is on the disk really matters! Even on the same track, it can be right next to the head or almost a whole rotation away.

Best and Worst Case

Reset the simulator and make the first read Track 0, Sector 5 (the one immediately to the right of the head).

Let's say that the next read will come from Track 1, the next track over:

  • What is the best sector to read on Track 1 to minimize access time?
  • What is the worst sector to read on Track 1 to maximize access time?

Explain how you figured it out, and also what ramifications your observations have—why does the location of data on the disk matter?

Choosing Sector 14 as the next read gives you the worst case:

Num Target Seek (ms) Rotation (ms) Transfer (ms) Total (ms)
1 0, 0, 5 0.00 0.31 0.62 0.93
2 1, 0, 14 5.52 10.53 0.62 16.67
Total 5.52 10.84 1.24 17.60
Average 2.76 5.42 0.62 8.80
Std Dev 2.76 5.11 0.00 7.87

But targeting Sector 15 gives you the best case:

Num Target Seek (ms) Rotation (ms) Transfer (ms) Total (ms)
1 0, 0, 5 0.00 0.31 0.62 0.93
2 1, 0, 15 5.52 0.03 0.62 6.17
Total 5.52 0.34 1.24 7.10
Average 2.76 0.17 0.62 3.55
Std Dev 2.76 0.14 0.00 2.62

  • Dog speaking

    Wow, so a bad choice takes more than twice as long as a good choice! That's a big difference!

  • Cat speaking

    It kind of makes you think how complicated it is to think about what it means for data to be “close” or “far” on a disk.

  • Rabbit speaking

    In fact, Track 3, Sector 5 can be reached in 11.11 ms, so we can make it to a further-away track faster than we can reach a far-away sector on the same track. That's a bit mind-bending!
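
Rabbit's observation comes down to rotational delay: how far the disk must spin before the target sector reaches the head, given where the head ends up (angularly) once the seek finishes. A simplified sketch of that calculation—the head position and RPM below are made-up values for illustration, not the simulator's internals:

```python
# Rotational delay for a target sector, with the head's angular
# position measured in sector-widths (the disk keeps spinning while
# the head seeks). Constants below are illustrative only.

def rotational_delay_ms(head_pos, target_sector, sectors_per_track, rpm):
    """Wait, in ms, until target_sector rotates under the head."""
    ms_per_rev = 60_000 / rpm
    wait_sectors = (target_sector - head_pos) % sectors_per_track
    return wait_sectors / sectors_per_track * ms_per_rev

# Suppose the seek finishes with the head just past the start of
# sector 14. Sector 15 is almost under the head (tiny wait), while
# sector 14 just slipped by and needs nearly a full revolution.
best = rotational_delay_ms(14.9, 15, 18, 7200)
worst = rotational_delay_ms(14.9, 14, 18, 7200)
```

This is why a "nearby" sector can be the slowest possible choice: closeness on a spinning disk is about timing, not distance.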

Reset the disk, and set it to have four platters.

Why would it be a really bad idea to read

Cylinder 0, Head 0, Sector 0
Cylinder 0, Head 1, Sector 0
Cylinder 0, Head 2, Sector 0
Cylinder 0, Head 3, Sector 0?

What would be a better ordering if we had data spread across the platters?

Disk Geometry Revisited

Nothing is 100% reliable. The magnetic surface of the disk might have a few subtle flaws. Which sectors of the disk are more likely to see errors? Why?

Also, how does increasing the number of platters affect the likelihood of overall drive failure?

  • Duck speaking

    I'd guess that the inner tracks are more likely to have errors because data there is more tightly packed into a smaller space.

  • Cat speaking

    And having more platters means we're more likely to see more errors just because there are more platters that could fail.

We could figure out the probability of the drive working for \( n \) platters as

\[ P(\text{Working}) = (1 - P(\text{Whole-Unit Failure})) \times (1 - P(\text{Platter Failure}))^n \]

so the probability of at least one platter failing increases as the number of platters increases.
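
To see the effect, we can plug some numbers into the formula. The failure probabilities below are made up purely for illustration:

```python
# Probability that a drive with n platters works, per the formula
# above: the whole unit must not fail, and no platter may fail.
# These failure probabilities are illustrative, not real-world data.

def p_working(n_platters, p_unit_failure=0.01, p_platter_failure=0.005):
    return (1 - p_unit_failure) * (1 - p_platter_failure) ** n_platters

one = p_working(1)
four = p_working(4)
# More platters -> more chances for some platter to fail -> lower P(working).
```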

  • Duck speaking

    I have an idea! Why do we need to always have the same number of sectors per track? Why not have more sectors on the outer tracks where there's more space?

  • Horse speaking

    Hay! But that means some sectors will occupy more angular space than others. So if we keep a constant angular velocity, the outer sectors will pass by faster than the inner sectors. That's a problem because we want to be able to read data off the disk at a constant rate.

  • Duck speaking

    You're right. What if we just change the speed of the disk so that the outer sectors pass by at the same rate as the inner sectors?

  • Hedgehog speaking

    This is all getting more and more complicated!

Click the button below to enable two additional controls for the disk simulator.

  • The first control lets you select the number of zones on the disk. Each zone has a different number of sectors per track, with more sectors per track in the outer zones.
  • The second control lets you adjust the speed of the disk to compensate for the different angular speeds of the sectors in different zones.

Experiment

If you reset the simulator now, it will default to 36 sectors on the outermost zone and 18 sectors on the innermost zone, with four zones and variable speed enabled. Try reading from different zones and watch how the disk speed changes. Notice that the transfer time is exactly the same as it was before, even though we've got more sectors on the outer tracks.

What do you think? What's good about this approach? Any concerns?

  • Dog speaking

    I think it's pretty cool. We got a 50% increase in capacity just by using the outer tracks more efficiently!

  • Hedgehog speaking

    But it's also more complicated. We have to think about how to lay out the data on the disk, and how to adjust the disk speed to compensate for the different angular speeds of the sectors. It's making my head spin!

  • Goat speaking

    Meh. I don't think having a drive that is constantly changing its spin speed is a good idea. It's just one more thing that could go wrong. I don't think anyone seriously does this.

  • Rabbit speaking

    Actually, various rotational media do adopt a variable-speed approach, from the 800 KB floppy drives of the original Macintosh to various optical media. But it's true that it's not common in hard drives.

Although hard drives generally stay spinning at a constant rate, they do adopt the approach of having more sectors on the outer tracks. That scheme is called zone bit recording (ZBR), and it is used on current hard drives to increase capacity. The original approach we began with, where the disk spins at a fixed rate and all tracks have the same number of sectors, is called constant angular velocity (CAV) recording; the variable-speed alternative is called constant linear velocity (CLV).

Because ZBR doesn't change the disk's rotation speed, it instead requires the drive's electronics to handle data coming off the disk at different rates. ZBR also means that the transfer rate is faster on the outer tracks than the inner tracks.
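
We can check Dog's earlier "50% increase" claim by comparing capacities. The zone layout below mirrors the simulator's defaults (four zones, from 18 sectors per track innermost up to 36 outermost); the exact zone boundaries are my own guess for illustration:

```python
# Capacity: constant sectors per track vs. zone bit recording (ZBR).
# Zone boundaries below are an illustrative guess, not the
# simulator's actual layout.

TRACKS = 20
CAV_SECTORS = 18  # sectors per track when every track is the same

# Four zones of 5 tracks each: (tracks, sectors per track),
# outermost zone first.
zones = [(5, 36), (5, 30), (5, 24), (5, 18)]

uniform_total = TRACKS * CAV_SECTORS
zbr_total = sum(tracks * sectors for tracks, sectors in zones)
gain = zbr_total / uniform_total - 1  # fractional capacity increase
```

With this layout the gain works out to exactly 50%, matching what Dog observed in the simulator.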

Thus,

  • Tracks in outer zones have more sectors and a faster transfer rate.
  • Tracks in inner zones have fewer sectors and a slower transfer rate.

  • Hedgehog speaking

    Gah! So now where we put the data has gotten even more complicated! There are good places and bad places, and it's not even just about seek times and rotational delays.

  • Goat speaking

    This is why you should use an SSD! It's so much simpler!

  • PinkRobot speaking

    SSDs have their own issues, as we'll see in the next lesson.

  • Hedgehog speaking

    Cylinders, tracks, heads, and sectors, and complicated disk layouts… Surely we could simplify this somehow?

  • PinkRobot speaking

    Abstraction to the rescue!

Logical Block Addressing

One way to (somewhat) tame the complexity of disk geometry is to use logical block addressing (LBA). Instead of referring to data on the disk by its physical location (cylinder, head, sector), we refer to it by a logical block number. An abstraction layer translates the logical block number to the corresponding location on the disk. Better yet, the abstraction layer is almost always built into the disk controller, so the operating system doesn't have to worry about it.
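
For a disk with a fixed geometry (no zones), the classic translation numbers sectors track by track and head by head. Here's a sketch; real ZBR drives use more complicated, proprietary mappings, and note that we count sectors from 0 like the simulator (BIOS-style CHS traditionally counts them from 1):

```python
# Classic LBA <-> CHS translation for a fixed geometry: logical
# blocks fill a whole track, then the next head's track in the same
# cylinder, then the next cylinder. Sectors are 0-based here.

def lba_to_chs(lba, heads, sectors_per_track):
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector

def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    return (cylinder * heads + head) * sectors_per_track + sector

# Simulator default: 1 head (one platter side), 18 sectors per track.
pos = lba_to_chs(20, heads=1, sectors_per_track=18)
```

The point of the abstraction is that the operating system only ever sees the logical block number; the drive's controller does this translation (or something far more elaborate) internally.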


Experiment

Now, as well as clicking on individual sectors to schedule them for reading, you have a text box where you can enter a single sector, a range of sectors, or a list of sectors to read. The simulator will read them in order, and you can see the total time taken to read them all. Play around to get a feel for what LBA is like.

  • L-Chippy speaking

    The simulator won't prevent you from trying to read the entire disk. But your browser might not be too happy, and it'll take a while! You can always click reset or reload the page if you go too far…

What did you observe about using LBA? How does it simplify things? What are the downsides?

To help you investigate, reset to the default, turn off variable spin speed, and try these ranges:

  1. 5-9 to read sectors 5 through 9
  2. 520-524 to read sectors 520 through 524
  3. 525-529 to read sectors 525 through 529

  • Cat speaking

    I noticed that the first read was faster than the other two. That's because the sectors were on the outer edge of the disk, and so had a faster transfer rate, right?

  • Duck speaking

    Also, the second read was slower than the other two because it spanned two tracks, and so there was a seek between the two tracks.

  • Horse speaking

    Whoa! There was a rotational delay, too. In fact, I think there was more delay than there needed to be. I don't think the mapping from LBA to physical location was optimal.

  • Goat speaking

    Meh. Probably keeps the microcontroller in the disk controller a bit simpler, and the manufacturer can keep the details of the disk layout secret, so who's gonna know enough to complain? Worse is better and all that.

  • PinkRobot speaking

    That's actually a good point.

Trade-offs

The abstraction of LBA simplifies the operating system's job, but it also hides the physical layout, which means the operating system can't place data for the best possible performance. That trade-off is generally worthwhile.

  • Duck speaking

    I still think we need to fix some things. When we have a bunch of accesses to do, we could optimize the order of the accesses to minimize seek times and rotational delays. I don't know if we can do that with LBA, but we ought to try.

  • PinkRobot speaking

    We'll talk about that in the next lesson page.
