Free-Space Management
We've seen how filesystems store their files, but an important part of a filesystem is managing free space on the disk. When a file is created, the filesystem needs to find space to store it. When a file is deleted, the filesystem needs to mark that space as free. And when a file is extended, the filesystem needs to find more space to store the additional data.
Free-Space Lists
One way to manage free space is exactly the same as we manage files. Just as a file is a sequence of blocks, we can have a sequence of blocks that represent free space—a free-space list.
One issue with a free-space list is ordering and coalescing. Let's imagine we have an extents-style filesystem, where files are stored in contiguous blocks where possible. Let's suppose our free-space list looks like
[1-3, 5-7, 9-11, 18-23]
Now let's suppose a file that has the extents [13-16, 4-4] is deleted. We could just add these blocks onto the end of our free-space list, like
[1-3, 5-7, 9-11, 18-23, 13-16, 4-4]
But now our free list doesn't have the nice property of having the blocks in order, nor does it capture the fact that the range 1-7 is now entirely free because block 4 joins together blocks 1-3 and 5-7. We could fix this issue by sorting the list and then coalescing adjacent blocks, but doing so is expensive as things will shift around a lot on a busy filesystem.
Free-Space Bitmaps
A more common way to manage free space is with a bitmap. A bitmap is a sequence of bits where each bit corresponds to a block on the disk. If the bit is 1, the block is free; if the bit is 0, the block is in use. For example, if we have a disk with 64 blocks, we might represent our original free-space list as
0000000000000000000000000000000000000000111111000000111011101110
(Note that bit 0 (corresponding to block zero) is on the right of the binary number and bit 63 is on the left!)
Now if we free up blocks 13–16 and 4, we can just set these bits to 1 (using an or operation):
0000000000000000000000000000000000000000000000011110000000010000
resulting in the final free-space bitmap:
0000000000000000000000000000000000000000111111011110111011111110
But we'd need a bit for each block, right? Won't that add up to a lot of bits?
For a disk with a 4 KB logical block size, it will take up 1 / (4096 × 8) ≈ 0.003% of the disk space. That's not too bad!
And it seems pretty easy to code, so I'm on board!
Free-Space Zones
One final trick we can apply is not lumping all our free space together. We can create little worlds of free space, each one serving the files that live in that zone. If two files are growing, but they're in different zones, each one will get contiguous blocks from its own zone. So this method can help reduce fragmentation.
Marshall Kirk McKusick's “fast file system” (FFS) for BSD used “cylinder groups” to create different zones with their own sets of inodes and data blocks. (As disk sizes grew, the previous file system (FS) had problems with “thrashing” as the drives had to keep moving the heads back and forth across the disk between the grouped inodes and the data blocks they referred to on other parts of the disk.) Other filesystems such as NTFS and ext4 have employed similar ideas.
How do you decide how many zones to have? And how do you know which zone to put a new file in?
And what if a file grows and needs MORE space than its zone has?
Good questions!
How many zones you need comes down to empirical tuning. Like Goldilocks, you want to have not too many and not too few for the typical workloads the system will have. A single-user system running only a few programs might not need many zones, whereas a server with many users and many programs running at once might need lots. Usually, a default value is chosen that works well for most cases, but might need to be adjusted for specific workloads.
Allocating Files to Zones
The Berkeley FFS system used the following strategy to allocate files to zones:
- For files, use the same zone as the file's directory entry (assumes that files in the same directory usually aren't all growing at once, or, if they are, they're likely to be related and thus accessed together).
- When creating a new directory, use the zone that has the most free space, so long as it hasn't been chosen recently.
Small Files vs. Large Files
Because Unix filesystems typically have many, many small files, using full blocks results in a lot of wasted space. FFS allows filesystem blocks to be broken into fragments, which allow multiple small files to share the same block, saving space.
On the flip side, large files being added to a cylinder block can both require additional allocations in other, nearby cylinder blocks if they need more space than is available in a cylinder group, potentially slowing reads and writes, and, by using lots of space in an existing cylinder block, can make growing other, smaller files end up having to span cylinder blocks as well. FFS proactively creates large new files in nearby cylinder groups to keep some free space available in the existing cylinder block, and similarly redirects to another cylinder group for each 1 MB written to the file.
Meh. Cool history lesson. Are we done yet?
I think we are, actually!
(When logged in, completion status appears here.)