CS 134

File Descriptors in POSIX Systems

  • BlueRobot speaking

    Correctly implementing file descriptors is one component of Assignment 4, so it's worth having a really clear picture of how they work before you start coding.

Let's review what we know so far about file descriptors, from what we saw in Assignment 2 (where you used them to read from and write to files), and from the preceding section on resource identifiers.

  • A file descriptor is a small, non-negative integer that refers to an open file.
  • The open system call returns a file descriptor, or -1 on error.
    • open always uses the lowest available file descriptor.
  • The close system call ends the life of a file descriptor, and can usually be thought of as closing the file.
  • For regular files, file descriptors have an offset that indicates the current position in the file where the next read or write will occur.
    • The lseek system call can be used to change the file offset.
  • dup/dup2 duplicates a file descriptor, returning a new file descriptor that refers to the same open file.
    • dup/dup2 is not the same as opening a new file descriptor to the same file, because the new file descriptor shares the same file offset as the original.
    • dup is like open in that it uses the lowest available file descriptor.
    • dup2 allows you to specify the file descriptor number to use; if you specify a file descriptor that is already open, dup2 closes the existing file descriptor first. If you give the same file descriptor number as the original, dup2 is a no-op.
  • File descriptors are local to a process; so, for example, file descriptor 3 in one process is not the same as file descriptor 3 in another process, except that…
  • The fork system call creates a new process that is an exact copy of the calling process, including all open file descriptors.
    • So file descriptor 3 in the parent process is the same as file descriptor 3 in the child process!
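    • Because the two descriptors refer to the same open file, they also share one file offset: if, after the fork, either the parent or the child changes the offset (with lseek, read, or write), the other process sees the change. Closing the descriptor in one process, however, does not close it in the other.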

Let's take a look at a program that demonstrates these concepts. Take some time to look it over and think about what it will do.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <string.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <sys/stat.h>

#define NO_STR  "NO!"
#define YES_STR "YES"
#define BYE_STR "BYE"

#define TEMP_FILENAME "descriptor-demo.tmp"

int
main(void)
{
        int fd, dup_fd;
        pid_t pid;
        char buffer[5] = "";  /* Zeroed so it is a valid string even if read fails */

        /* Create and write to temporary file */
        fd = open(TEMP_FILENAME, O_CREAT | O_RDWR | O_TRUNC | O_EXCL, 0644);
        if (fd == -1) {
                perror("open");
                exit(1);
        }
        dup_fd = dup(fd);
        /* Each write is 4 bytes: three characters plus the terminating '\0' */
        write(fd, NO_STR, 4);
        write(dup_fd, YES_STR, 4);
        close(dup_fd);
        write(fd, BYE_STR, 4);

        /* Move back to zero offset */
        lseek(fd, 0, SEEK_SET);

        pid = fork();
        if (pid == -1) {
                perror("fork");
                exit(1);
        } else if (pid == 0) {
                /* Child process */
                usleep(250000);  /* Sleep for 0.25 seconds to let parent seek */
                printf("Child  - Reading from the file\n");
                read(fd, buffer, 4);
                printf("Child  - Did child see parent's seek: '%s'\n", buffer);
        } else {
                /* Parent process */
                printf("Parent - Seeking to offset 4\n");
                lseek(fd, 4, SEEK_SET);

                usleep(500000);  /* Sleep for 0.5 seconds to let child read */
                read(fd, buffer, 4);
                printf("Parent - Read: '%s'\n", buffer);
                if (strcmp(buffer, BYE_STR) == 0) {
                        printf("Parent - Thus, child's read changed offset!\n");
                } else {
                        printf("Parent - Thus, offset did not change.\n");
                }
                waitpid(pid, NULL, 0); /* Wait for child to finish */
        }

        close(fd);
        unlink(TEMP_FILENAME);  /* Remove temporary file */
        return 0;
}

This program

  • Creates a temporary file, writes some data to it, and then forks.
  • In the parent process, seeks to offset 4 in the file, waits, and then reads from the file and prints the data it read.
  • In the child process, waits, reads from the file, and then prints the data it read.

Think about these questions:

  • When we create dup_fd with dup(fd), what's happening? Do fd and dup_fd share the same underlying “open file” object in the kernel and thus have the same file offset?
    • If they do, YES_STR will be written after NO_STR in the file.
    • If they don't, YES_STR will overwrite NO_STR.
  • What will the child process read from the file?
  • What will the parent process read from the file?

When you think you have answers to these questions, compile and run the program and compare its output with your predictions.

What did the child read?

What did the parent read?

Were those results what you'd expected? Can you see why the program gave these results when it ran?

  • Cat speaking

    One thing I hadn't really realized until I saw this code was that close doesn't actually close the file. It just ends the life of the file descriptor. The file itself is still open until all file descriptors that refer to it are closed. We can see that in the code, where we close dup_fd and then continue to write to and read from fd because the file is still open.

  • Duck speaking

    So are you going to tell us exactly how to design the file descriptor part of Assignment 4?

  • PinkRobot speaking

    No. With what you've seen in both this page and the previous ones, you should have all the information you need.

  • Dog speaking

    And we can just set a fixed limit on the number of file descriptors a process can have open, right?

  • PinkRobot speaking

    Yes.
