CS 134

Beyond Individual Drives

So far, we've looked at individual block devices—both spinning disks and SSDs. But in many systems—especially servers and enterprise environments—we often want to combine multiple drives or access drives over the network.

  • Duck speaking

    Wait, you mean we can make multiple drives look like one drive? Or make a drive that's actually somewhere else on the network look local?

  • Hedgehog speaking

    That sounds complicated… wouldn't that affect performance?

RAID: Redundant Array of Independent Disks

  • Rabbit speaking

    Fun fact, originally the acronym stood for “Redundant Array of Inexpensive Disks”!

  • L-Floppy speaking

    I guess the people selling expensive solutions to enterprises didn't like that and so it became “Independent” instead.

RAID allows multiple physical drives to appear as a single logical drive. Different RAID levels provide different combinations of

  • Improved reliability through redundancy
  • Better performance through parallelism
  • Increased capacity through combining drives

Here are some common RAID levels:

  • RAID 0 (Striping): Spreads data across multiple drives for better performance
    • Writes are distributed across drives ("striped")
    • No redundancy—if any drive fails, all data is lost
    • Commonly used for temporary data or when speed is critical
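    • RAID 0 isn't really redundant at all! It sometimes gets lumped in with JBOD (Just a Bunch Of Disks), although JBOD usually means drives that are exposed individually or concatenated end to end rather than striped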
  • RAID 1 (Mirroring): Exact copies of data on multiple drives
    • Data is written to all drives simultaneously
    • Data can be read from any drive (improving read performance)
    • Provides redundancy but doesn't increase capacity
  • RAID 5: Striping with distributed parity
    • Data is striped across drives like RAID 0
    • Designed to allow recovery from a single drive failure by calculating and storing parity information
    • Parity information distributed across all drives
    • Good balance of performance, capacity, and reliability
  • Horse speaking

    Hay! So RAID 0 is faster but risky, RAID 1 is safer but wastes space, and RAID 5 is kind of in the middle?

  • PinkRobot speaking

    That's right.

  • Rabbit speaking

    There's also RAID 6. RAID 6 can handle two drive failures by storing two parity blocks (using two independent kinds of parity).
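
To make striping and parity a little more concrete, here's a minimal Python sketch. The function names (`raid0_locate`, `xor_parity`, `reconstruct`) and the tiny 4-byte blocks are made up for illustration, and the sketch assumes a stripe unit of exactly one block. It also keeps parity as a separate value rather than rotating it across the drives the way RAID 5 does.

```python
from functools import reduce


def raid0_locate(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block number to (drive index, block index on that drive).

    Assumes a stripe unit of exactly one block, purely for simplicity.
    """
    return logical_block % num_drives, logical_block // num_drives


def xor_parity(blocks: list[bytes]) -> bytes:
    """Compute the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the one missing data block from the survivors plus parity."""
    return xor_parity(surviving_blocks + [parity])


# Striping: consecutive logical blocks land on different drives.
layout = [raid0_locate(n, num_drives=3) for n in range(6)]

# Parity: three data blocks in one stripe, plus their parity block.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(stripe)

# Pretend the drive holding stripe[1] failed; rebuild its block.
recovered = reconstruct([stripe[0], stripe[2]], parity)
print(recovered == stripe[1])
```

The recovery works because XOR is its own inverse: XORing the parity with every surviving block cancels them out and leaves only the missing block. It also shows why this simple parity handles only one failure per stripe. With two blocks missing, there is one equation and two unknowns.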

Implementing RAID

There are two common ways to implement RAID:

  • Hardware RAID: Uses a dedicated RAID controller
    • The array appears to the operating system as a single drive and is treated just like a normal drive
    • Controller handles all RAID operations
    • Typically faster than software RAID (the controller offloads work such as parity calculation from the main CPU) and often more reliable
    • Can have battery- or capacitor-backed data protection
  • Software RAID: Uses the operating system to manage RAID
    • OS handles RAID operations, so it must
      • Handle drive failures and rebuilding
      • Manage stripe sizes and alignment
      • Balance I/O across drives
    • Can be slower than hardware RAID
    • More flexible and can work with any drives
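
As a rough idea of the bookkeeping software RAID takes on, here's a toy RAID 1 mirror in Python. The `Mirror` class and its methods are hypothetical, each "drive" is just an in-memory dictionary, and a real implementation would also have to deal with partial writes, concurrent I/O, and rebuilding while the array stays in use.

```python
class Mirror:
    """Toy RAID 1: every working drive holds a full copy of every block."""

    def __init__(self, num_drives: int) -> None:
        # Each "drive" is a dict from block number to data; None marks a failed drive.
        self.drives = [{} for _ in range(num_drives)]
        self.next_read = 0  # round-robin cursor used to spread reads across drives

    def write(self, block: int, data: bytes) -> None:
        # A write must reach every working drive to keep the copies identical.
        for drive in self.drives:
            if drive is not None:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Try drives in round-robin order, skipping any that have failed.
        for _ in range(len(self.drives)):
            index = self.next_read
            self.next_read = (self.next_read + 1) % len(self.drives)
            drive = self.drives[index]
            if drive is not None:
                return drive[block]
        raise IOError("all drives in the mirror have failed")

    def fail(self, index: int) -> None:
        # Simulate losing a drive.
        self.drives[index] = None

    def rebuild(self, index: int) -> None:
        # Replace a failed drive by copying everything from a surviving drive.
        source = next(d for d in self.drives if d is not None)
        self.drives[index] = dict(source)


mirror = Mirror(num_drives=2)
mirror.write(0, b"hello")
mirror.fail(0)
data_after_failure = mirror.read(0)  # served by the surviving drive
mirror.rebuild(0)
print(data_after_failure)
```

Even in this toy version, the three jobs from the list show up: `fail` and `rebuild` handle drive failures, and the round-robin cursor in `read` is a very simple way of balancing I/O across drives.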

Write Hole Problem

One particularly tricky issue with RAID 5 (and similar parity-based RAID levels) is the “write hole” problem. Updating a block means writing both the new data and the new parity, and those writes go to different drives. If the system crashes after one write has completed but before the other, the parity no longer matches the data. Modern RAID systems use various techniques to prevent this problem, such as

  • Write-ahead logging
  • Battery-backed cache
  • Special “atomic write” operations

This issue is similar to some of the consistency issues we'll see when we get to filesystems!
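
Here's a small Python sketch of how the write hole can bite. Everything in it is a simplified assumption: one stripe with two data blocks and a parity block, blocks that are only two bytes long, and a `crashed` flag standing in for a power loss between the two writes.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


# One stripe on a three-drive array: two data blocks plus their parity.
d0, d1 = b"\x11\x11", b"\x22\x22"
parity = xor_blocks(d0, d1)

# Updating d0 takes two separate writes, because data and parity
# live on different drives.
d0 = b"\x55\x55"                  # step 1: the new data reaches its drive
crashed = True                    # simulated power loss before step 2
if not crashed:
    parity = xor_blocks(d0, d1)   # step 2: the parity is brought up to date

# After the reboot, the stored parity is stale.
consistent = xor_blocks(d0, d1) == parity
print(consistent)

# If the drive holding d1 fails later, rebuilding from d0 and the stale
# parity silently produces the wrong contents for d1.
rebuilt_d1 = xor_blocks(d0, parity)
print(rebuilt_d1 == d1)
```

Notice that the damage lands on `d1`, a block that wasn't even being written when the crash happened. That's what makes the problem nasty, and it's why techniques like logging the intended update before performing it are worth the extra work.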

Modern systems might use more complex RAID levels or software-defined storage that's even more flexible. But the basic idea remains: combining multiple drives to improve some combination of performance, reliability, and capacity.

Network Block Devices

Sometimes we want to access block storage that isn't physically attached to our computer. There are several ways to do this:

  • iSCSI (Internet SCSI): Makes a network-attached drive appear local
    • Sends SCSI commands over IP networks
    • Operating system sees it as a normal block device
    • Common in enterprise environments
  • Network Attached Storage (NAS)
    • Usually provides file-level access (we'll cover this more when we get to filesystems!)
    • Some NAS systems can also provide block-level access
    • Popular for home and small business use
  • Storage Area Networks (SAN)
    • Dedicated high-speed networks for storage
    • Can use various protocols (Fibre Channel, iSCSI, etc.)
    • Common in data centers and enterprise environments
  • Duck speaking

    So the NAS I set up in my room for all my media is kind of like a baby version of what big companies use?

  • L-Floppy speaking

    Sort of! The principles are similar, but enterprise storage systems add a lot more complexity for reliability and performance.

  • Duck speaking

    I bet the network adds some latency though?

  • Horse speaking

    Hay! That's why they have those special storage networks!

Challenges of Network Storage

Accessing storage over a network introduces several challenges:

  • Latency: Every request has to cross the network and come back, so network storage is slower than local storage
  • Reliability: Network failures can cause storage to become unavailable
  • Security: Network storage must be secured against unauthorized access
  • Performance: Throughput is limited by the network's bandwidth, which is one reason storage traffic often gets its own fast, dedicated network
  • Cat speaking

    It's kind of amazing that applications don't need to know about any of this—they just see a normal block device.

  • Rabbit speaking

    Yes! The operating system handles all the complexity. Though sometimes applications do need to be “storage aware” for better performance.
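
To get a feel for how much the network round trip can matter, here's a back-of-the-envelope model in Python. The function name and both latency figures are illustrative assumptions, not measurements of any real drive or network. The model also assumes the simplest case, where each request waits for the previous one to finish.

```python
def requests_per_second(device_latency_s: float, network_rtt_s: float = 0.0) -> float:
    """Requests completed per second when only one request is in flight at a time."""
    return 1.0 / (device_latency_s + network_rtt_s)


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
device_latency = 100e-6   # 100 microseconds for the drive to service a request
network_rtt = 400e-6      # 400 microseconds for the network round trip

local = requests_per_second(device_latency)
remote = requests_per_second(device_latency, network_rtt)

print(f"local:  {local:,.0f} requests/s")
print(f"remote: {remote:,.0f} requests/s")
print(f"slowdown: {local / remote:.1f}x")
```

With these made-up numbers, the round trip costs more than the drive itself, so one-at-a-time access drops to a fifth of its local rate. This is one place where being “storage aware” can pay off: an application that keeps several requests in flight at once can overlap those round trips instead of waiting out each one in turn.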

Now that we understand both the physical media (disks and SSDs) and how we can combine and access them in different ways, we're ready to look at how filesystems use these block devices to store and organize our data! We'll do that next time!
