An Introduction to MVS Volumes and File Structure

An Introduction to MVS Volume and File Structure

Mark Kampe markk@cs.ucla.edu

1. Introduction

The basic disk volume structure that IBM uses in its server class systems is remarkably unchanged since it was first introduced in the 1960s. We can liken it to a shark. Sharks are relatively primitive creatures, who have been swimming our oceans since 100 million years before the first dinosaurs. How have such "primitive" creatures survived the changes and cataclysms that have destroyed countless thousands of other species? Simple: they are very good at what they do.

Since the dawn of disks, performance has been a simple matter of minimizing seeks, and doing the largest possible transfers. Much of the complexity in modern file systems is found in mechanisms to automatically allocate space and schedule requests in ways that minimize seeks and increase transfer sizes. IBM had a much simpler solution: give the users control over how space is allocated to files, and make the users responsible for I/O performance. The resulting volume/file structure is somewhat difficult to use, but it achieves I/O throughput that other file systems can only use as a theoretical maximum.

There are several reasons to study the MVS volume structure:

It is still used on many large server-class systems.
It is very different from the file systems used by the other major operating systems.
It is one of the only major file systems to use variable partition allocation.
It illustrates performance/ease-of-use trade-offs.
It is a good case study in generality, how a very simple file system can provide the building blocks for applications to implement exactly the types of files they need.

2. Structural Overview

All file systems include a few basic types of data structures:

bootstrap
code to be loaded into memory and executed when the computer is powered on. MVS volumes reserve the entire first track of the first cylinder for the boot strap.
volume descriptors
information describing the size, type, and layout of the file system ... and in particular how to find the other key meta-data descriptors.
file descriptors
information that describes a file (ownership, protection, time of last update, etc.) and points where the actual data is stored on the disk.
free space descriptors
lists of blocks of (currently) unused space that can be allocated to files.
file name descriptors
data structures that user-chosen names with each file.

In MVS volumes, each of these data structures is stored in a Data Set Control Block (or more commonly a DSCB). Each DSCB is 140 bytes long, and they all reside in a single contiguous area called the Volume Table of Contents (or more commonly the VTOC). The starting address (cylinder, head, record) of the VTOC can be found at the beginning of the third record of the boot track. The size of the VTOC is described in its first DSCB.

Other than the first DSCB (which is always the volume descriptor), most of the remaining DSCBs in the VTOC can be used for any purpose (e.g. file descriptors and free space descriptors). All of the file system meta-data is confined to the DSCBs in the VTOC. Other than the boot track and the VTOC, the entire remainder of the volume is available to be allocated to user files.

Since all of the meta-data in an MVS volume resides in DSCBs, a study of MVS volume structure inevitably turns into a discussion of the various types of DSCBs. For our purposes there are four important types of DSCBs (there are a few others, but they have rather specialized uses). We can tell type a particular DSCB describes by looking at its format byte. The most interesting types are:

0xF1: format 1 DSCBs describe files
0xF3: format 3 DSCBs describe additional extents in files
0xF4: format 4 DSCBs describe an MVS volume
0xF4: format 5 DSCBs describe free space

Each of these will be discussed in the following sections.

3. Volume Descriptors (Format 4 DSCB)

All file systems need a volume descriptor that describes the size and layout of the file system. MVS stores this information in a format 4 DSCB. There can only be one format 4 DSCB in a volume, and it must be the first DSCB in the VTOC.

The information included in the volume descriptor DSCB includes:

The number of DSCBs in the VTOC
The size and location of the alternate tracks area
These are spare tracks to be held in reserve until they are needed. In the event that a bad-spot develops on the disk, alternate tracks can be given to any affected files in place of the tracks that have gone bad.
File system type and status information.
Key format information (for indexed volumes)
VSAM data base parameters (for VSAM volumes)

4. Free Space Descriptors (Format 5 DSCBs)

MVS files are made up of a few (e.g. not hundreds) of chunks of space. Each chunk is contiguous space, and can be read or written with little-or-no head motion. Successive chunks may or may not be close to one another. Each of these separately allocated "chunks" is called an extent. Different files can have different extent sizes. The size of each extent, as well as its alignment (e.g. on a track or cylinder boundary) is chosen by the programmer who creates the file.

If I had a database that I expected to occupy somewhere in the range of 5-25 cylinders, I might allocate space to it in single cylinder extents. When processing the entire database, I might then read a whole cylinder at a time (transferring several megabytes, at very high speed, with only a single seek at the start of the operation). This would enable me to process my database very efficiently.

Because different files can use different extent sizes, MVS volumes support variable partition allocation (with coalescing). They keep track of the (currently) free space with a free list comprised of format 5 DSCBs.

The second DSCB in the VTOC is a format 5 DSCB, which is the start of the free space list on that volume. A format 5 DSCB contains up to 26 free extent descriptions. A free extent description includes the address of the next free extent, and its size (in cylinders and tracks). A free extent can be of arbitrary size, but must be contiguous. If there are more than 26 free extents, there is also a pointer to the next format 5 DSCB in the free list.

When a Format 5 DSCB is empty (no longer points to any free extents) it is recycled by changing its format to 0 ... making it available to be reallocated for some other use. If it becomes necessary to add new nodes to the free list (because the appropriate format 5 DSCB is already full) a new (type 0) DSCB is found, turned into a type 5 DSCB, and inserted into the free list.

5. File Descriptors (Format 1 and Format 3 DSCBs)

Each file on an MVS volume is described by a type 1 DSCB. The contents of a type 1 DSCB includes:

the file name (44 bytes long)
the number of extents allocated to the file
the number of valid data bytes in the last extent
the extent size to be used when new extents are allocated
pointers to the first three extents allocated to this file
(optional) pointer to an additional Format 3 DSCB.
creation, last referenced, and expiration dates for the file
the "organization" and "record format" of the file
Since all of the file structure (below the level of the extents) is controlled by the application, this field allows applications to record a note (for themselves) on how this file has been organized and what size the logical records have..
parameters for the appropriate "access method"
Most programmers do not directly program their own disk I/O. Rather, they choose from a set of standardized "access methods" (e.g. VSAM). In order to know how to read and write a file, several access method specific parameters may also have to be specified by the programmer who creates the file.

MVS doesn't really have directories. Each file has a unique, true name, which is recorded in the file's type 1 DSCB. These names are all 44 characters long. To find a particular file, it is often necessary to read through all of the type 1 DSCBs in the VTOC until the desired file name is found.

A Format 1 DSCB includes only three extent pointers. If a file has more than three extents of space allocated to it, one or more Format 3 DSCBs will also be associated with the file (to describe the additional extents). Each Format 3 DSCB describes up to 13 additional extents (giving the extent number within the file, and its starting and ending addresses). There is also a pointer to the next Format 3 DSCB ... making it possible to allocate as many extents to a file as desired.

When a new file is created, an unallocated (type 0) DSCB is located, and turned into a type 1 DSCB. As extents are added to the file, they are recorded in the Format 1 DSCB. When the fourth extent is added, another unallocated DSCB is found, and initialized into a Format 3 DSCB to describe the next 13 extents.

When a file is deleted, all of its extents are returned to the free list (coalescing as necessary), and the Format 1 DSCB (and any associated Format 3 DSCBs) are cleared (making them available for reallocation).

6. Volume Index

As mentioned previously, MVS does not really have directories. Filenames may be structured in a way that suggests directories (e.g. USERS.KAMPE.CS111.WEB.DOCS.MVS.HTML) but this is only a naming convention.

DSCBs are allocated in a First-Come-First-Served fashion, so there is no guarantee that the Format 1 DSCBs for related files will be anywhere near one another. Since the order of DSCB allocations is unrelated to file names, finding a desired file by searching through all of the Format 1 DSCBs can be a very slow process. Such searches could be made much faster if we prepared a sorted list of file names.

MVS file systems often include an index or catalog to optimize file opens and other common searches. An MVS volume can be designated (in the volume descriptor) as having an index. If there is an index, it is always named "SYS1.VTOCIX". The index (which can be recreated at any time) contains:

a free track bit-map for the entire volume
a free DSCB bit-map for the VTOC
a structured and sorted list of file names
a bit-map for free entries in the index file

The Volume Index Entry Records (or VIERs) make up the majority of the Index, and these are used to create a sorted list of file names. The VIERs exist in a hierarchy that is very much like directories:

the first VIER is for the top level index, and contains a sorted list of entries.
if there are more file names (at this level) than can fit in a single VIER, there is a pointer to a continuation VIER.
if a VIER entry describes a file, the entry includes a pointer to the format 1 DSCB for the file.
if a VIER entry describes another sub-index, it includes a pointer to the first VIER for that sub-index.

There are, however, a few key differences between the Volume Index Entries in the catalog and true directories:

A directory is usually a file in its own right, whereas all of the VIERs on a volume are records in a single file.
On file systems with directories, the directories are essential to locating a file. In MVS volumes, the catalog is not necessary to open a file ... it merely speeds up the process.
The VIERs do not assign names to files. They merely record (in a more efficiently searchable form) the file names that are assigned in Format 1 DSCBs).

The Volume Index can be deleted at any time, with no harm to the volume. It can be reconstructed any time, by scanning the VTOC and enumerating all of the Format 1 and Format 5 DSCBs.

7. Summary

From the user's perspective, the big difference between MVS volumes and any other file system is the number of decisions that the user is required to make when creating a file. The user has to declare what the file's record structure will be, choose the most appropriate standard access method, estimate the eventual size of the file, and decide what units should be used for space allocation and disk I/O. Very few other operating systems require users to provide so much input into how the file should be created and used ... but the result (of good choices) is file I/O performance that very few other operating systems can match.

In memory allocation, the primary purpose of variable partition allocation was to minimize internal fragmentation. In the case of MVS volumes, the primary purpose is to give users more control over space allocation ... specifically to make it possible to allocate space in large contiguous extents that can be read and written very efficiently.

In memory allocation, we found that variable partition allocation almost inevitably led to external fragmentation. Over short periods, coalescing can counter this trend ... but ultimately external framgentation is likely to win. The same thing is true of MVS volumes. For this reason, MVS data administrators have to back-up and rebuild their file systems occasionally ... to reallocate space more efficiently, and coalesce all of the available free space.

Because the MVS volume structure is primitive (invented in the 1960s) it is simple. All file system meta-data is stored in a pool of fixed size DSCBs, and file names are directly stored in the file descriptors. If we look at the optional index/catalogs (which were added later to enhance performance) we see data structures that are actually quite similar to the bit-map free lists and hierarchical directories of other file systems. IBM found a way to obtain enhanced functionality and performance, while maintaining upwards compatibility with older on-disk data structures. This is a problem that all file system vendors face.