An Introduction to MVS Volume and File Structure

Mark Kampe markk@cs.ucla.edu

1. Introduction

The basic disk volume structure that IBM uses in its server class systems is remarkably unchanged since it was first introduced in the 1960s. We can liken it to a shark. Sharks are relatively primitive creatures, who have been swimming our oceans since 100 million years before the first dinosaurs. How have such "primitive" creatures survived the changes and cataclysms that have destroyed countless thousands of other species? Simple: they are very good at what they do.

Since the dawn of disks, performance has been a simple matter of minimizing seeks, and doing the largest possible transfers. Much of the complexity in modern file systems is found in mechanisms to automatically allocate space and schedule requests in ways that minimize seeks and increase transfer sizes. IBM had a much simpler solution: give the users control over how space is allocated to files, and make the users responsible for I/O performance. The resulting volume/file structure is somewhat difficult to use, but it achieves I/O throughput that other file systems can only use as a theoretical maximum.

There are several reasons to study the MVS volume structure:

2. Structural Overview

All file systems include a few basic types of data structures:

In MVS volumes, each of these data structures is stored in a Data Set Control Block (or more commonly a DSCB). Each DSCB is 140 bytes long, and they all reside in a single contiguous area called the Volume Table of Contents (or more commonly the VTOC). The starting address (cylinder, head, record) of the VTOC can be found at the beginning of the third record of the boot track. The size of the VTOC is described in its first DSCB.

Other than the first DSCB (which is always the volume descriptor), most of the remaining DSCBs in the VTOC can be used for any purpose (e.g. file descriptors and free space descriptors). All of the file system meta-data is confined to the DSCBs in the VTOC. Other than the boot track and the VTOC, the entire remainder of the volume is available to be allocated to user files.

Since all of the meta-data in an MVS volume resides in DSCBs, a study of MVS volume structure inevitably turns into a discussion of the various types of DSCBs. For our purposes there are four important types of DSCBs (there are a few others, but they have rather specialized uses). We can tell type a particular DSCB describes by looking at its format byte. The most interesting types are:

Each of these will be discussed in the following sections.

3. Volume Descriptors (Format 4 DSCB)

All file systems need a volume descriptor that describes the size and layout of the file system. MVS stores this information in a format 4 DSCB. There can only be one format 4 DSCB in a volume, and it must be the first DSCB in the VTOC.

The information included in the volume descriptor DSCB includes:

4. Free Space Descriptors (Format 5 DSCBs)

MVS files are made up of a few (e.g. not hundreds) of chunks of space. Each chunk is contiguous space, and can be read or written with little-or-no head motion. Successive chunks may or may not be close to one another. Each of these separately allocated "chunks" is called an extent. Different files can have different extent sizes. The size of each extent, as well as its alignment (e.g. on a track or cylinder boundary) is chosen by the programmer who creates the file.

If I had a database that I expected to occupy somewhere in the range of 5-25 cylinders, I might allocate space to it in single cylinder extents. When processing the entire database, I might then read a whole cylinder at a time (transferring several megabytes, at very high speed, with only a single seek at the start of the operation). This would enable me to process my database very efficiently.

Because different files can use different extent sizes, MVS volumes support variable partition allocation (with coalescing). They keep track of the (currently) free space with a free list comprised of format 5 DSCBs.

The second DSCB in the VTOC is a format 5 DSCB, which is the start of the free space list on that volume. A format 5 DSCB contains up to 26 free extent descriptions. A free extent description includes the address of the next free extent, and its size (in cylinders and tracks). A free extent can be of arbitrary size, but must be contiguous. If there are more than 26 free extents, there is also a pointer to the next format 5 DSCB in the free list.

When a Format 5 DSCB is empty (no longer points to any free extents) it is recycled by changing its format to 0 ... making it available to be reallocated for some other use. If it becomes necessary to add new nodes to the free list (because the appropriate format 5 DSCB is already full) a new (type 0) DSCB is found, turned into a type 5 DSCB, and inserted into the free list.

5. File Descriptors (Format 1 and Format 3 DSCBs)

Each file on an MVS volume is described by a type 1 DSCB. The contents of a type 1 DSCB includes:

MVS doesn't really have directories. Each file has a unique, true name, which is recorded in the file's type 1 DSCB. These names are all 44 characters long. To find a particular file, it is often necessary to read through all of the type 1 DSCBs in the VTOC until the desired file name is found.

A Format 1 DSCB includes only three extent pointers. If a file has more than three extents of space allocated to it, one or more Format 3 DSCBs will also be associated with the file (to describe the additional extents). Each Format 3 DSCB describes up to 13 additional extents (giving the extent number within the file, and its starting and ending addresses). There is also a pointer to the next Format 3 DSCB ... making it possible to allocate as many extents to a file as desired.

When a new file is created, an unallocated (type 0) DSCB is located, and turned into a type 1 DSCB. As extents are added to the file, they are recorded in the Format 1 DSCB. When the fourth extent is added, another unallocated DSCB is found, and initialized into a Format 3 DSCB to describe the next 13 extents.

When a file is deleted, all of its extents are returned to the free list (coalescing as necessary), and the Format 1 DSCB (and any associated Format 3 DSCBs) are cleared (making them available for reallocation).

6. Volume Index

As mentioned previously, MVS does not really have directories. Filenames may be structured in a way that suggests directories (e.g. USERS.KAMPE.CS111.WEB.DOCS.MVS.HTML) but this is only a naming convention.

DSCBs are allocated in a First-Come-First-Served fashion, so there is no guarantee that the Format 1 DSCBs for related files will be anywhere near one another. Since the order of DSCB allocations is unrelated to file names, finding a desired file by searching through all of the Format 1 DSCBs can be a very slow process. Such searches could be made much faster if we prepared a sorted list of file names.

MVS file systems often include an index or catalog to optimize file opens and other common searches. An MVS volume can be designated (in the volume descriptor) as having an index. If there is an index, it is always named "SYS1.VTOCIX". The index (which can be recreated at any time) contains:

The Volume Index Entry Records (or VIERs) make up the majority of the Index, and these are used to create a sorted list of file names. The VIERs exist in a hierarchy that is very much like directories:

There are, however, a few key differences between the Volume Index Entries in the catalog and true directories:

The Volume Index can be deleted at any time, with no harm to the volume. It can be reconstructed any time, by scanning the VTOC and enumerating all of the Format 1 and Format 5 DSCBs.

7. Summary

From the user's perspective, the big difference between MVS volumes and any other file system is the number of decisions that the user is required to make when creating a file. The user has to declare what the file's record structure will be, choose the most appropriate standard access method, estimate the eventual size of the file, and decide what units should be used for space allocation and disk I/O. Very few other operating systems require users to provide so much input into how the file should be created and used ... but the result (of good choices) is file I/O performance that very few other operating systems can match.

In memory allocation, the primary purpose of variable partition allocation was to minimize internal fragmentation. In the case of MVS volumes, the primary purpose is to give users more control over space allocation ... specifically to make it possible to allocate space in large contiguous extents that can be read and written very efficiently.

In memory allocation, we found that variable partition allocation almost inevitably led to external fragmentation. Over short periods, coalescing can counter this trend ... but ultimately external framgentation is likely to win. The same thing is true of MVS volumes. For this reason, MVS data administrators have to back-up and rebuild their file systems occasionally ... to reallocate space more efficiently, and coalesce all of the available free space.

Because the MVS volume structure is primitive (invented in the 1960s) it is simple. All file system meta-data is stored in a pool of fixed size DSCBs, and file names are directly stored in the file descriptors. If we look at the optional index/catalogs (which were added later to enhance performance) we see data structures that are actually quite similar to the bit-map free lists and hierarchical directories of other file systems. IBM found a way to obtain enhanced functionality and performance, while maintaining upwards compatibility with older on-disk data structures. This is a problem that all file system vendors face.

8. References

IBM MVS/DFP V3R3 System Programming Reference
Document SC26-4567-02. IBM Library Server, book IGG3S300