CS 134

Processes and Threads

  • PinkRobot speaking

    CS 105 touched on the basics of processes and threads, but we'll review them here to make sure we're all on the same page.

Processes

A process represents an instance of a program in execution. It's essentially a running program with its own dedicated resources and execution context. When you launch an application on your computer, the operating system creates one or more processes to manage and execute that application.

Another way of viewing a process is as an idealized machine that exists just to run one program. The machine comes into being, runs the program, and then goes away. And, whereas actual machines aren't very regular, with various different pieces of hardware that work in different ways, our idealized machine is much neater and simpler.

The operating system creates the illusion that each process is its own world (sometimes called isolation), with its own memory, its own CPU, and so on. In reality, the processes are all running on the same physical machine, so the operating system has to manage how the CPU and other hardware are shared among them.

Key characteristics of a process include

  1. Program Code: The executable instructions of the program (read-only memory).
  2. Data: The variables and data structures used by the program (read–write memory).
  3. Resources: System resources allocated to the process, such as memory, CPU time, and file handles.
  4. Processor state: The state of the CPU registers and program counter.

In addition, the operating system itself may store additional information about the process, including

  1. Id: A unique identifier assigned to the process to allow us to refer to specific processes.
  2. State: The current condition of the process, which can be
    • New: The process is being created
    • Ready: The process is waiting to be assigned to a CPU
    • Running: Instructions are being executed
    • Waiting: The process is waiting for some event to occur
    • Terminated: The process has finished execution
  3. Priority: The relative importance of the process in relation to other processes.

Inside the operating system, many of the details of a process are tracked in a data structure called a process control block (PCB). The PCB contains information about the process, such as the process ID, the state of the process, the program counter, the CPU registers, and information about the process's memory.

Processes operate independently of each other and are isolated by the operating system. This isolation ensures that one process cannot directly access or modify the memory or resources of another process, enhancing system stability and security.

The operating system manages processes through various mechanisms:

  1. Process Scheduling: Determines which process runs on the CPU at any given time.
  2. Context Switching: Saves the state of a running process and loads the state of another process when switching between them.
  3. Inter-Process Communication (IPC): Allows processes to communicate and synchronize their actions.
  4. Memory Management: Allocates and deallocates memory for processes.

Here's some code showing process creation on a Unix-like system:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main() {
    pid_t pid;
    int status;

    printf("Parent process (PID: %d) is about to fork...\n", getpid());
    // Flush so the child does not inherit a copy of any buffered output
    fflush(stdout);

    pid = fork();

    if (pid < 0) {
        // Fork failed
        perror("fork");
        exit(1);
    } else if (pid == 0) {
        // Child process
        printf("Child process (PID: %d) is about to execute /bin/echo...\n",
               getpid());
        // Flush before exec, which discards any unwritten stdio buffer
        fflush(stdout);

        char *args[] = {"/bin/echo", "Hello, World!", NULL};
        execv("/bin/echo", args);

        // If execv returns, it must have failed
        perror("execv");
        exit(1);
    } else {
        // Parent process
        printf("Parent process is waiting for child (PID: %d) to complete...\n",
               pid);

        pid_t wait_result = waitpid(pid, &status, 0);

        if (wait_result == -1) {
            perror("waitpid");
            exit(1);
        }

        if (WIFEXITED(status)) {
            printf("Child process exited with status %d\n",
                   WEXITSTATUS(status));
        } else {
            printf("Child process did not exit normally\n");
        }
    }

    return 0;
}

The code is basically a “Hello World” program, but it says hello by running the /bin/echo program as a child process.

You can run this code via

If you run the program multiple times, what changes in the output?

Threads

In the same way that a physical machine can have multiple cores, a process can have multiple threads. Each thread is a separate flow of control within the process that can run independently of the others. Threads share the same memory space and resources within a process, allowing them to communicate and coordinate more easily than separate processes.

In CS 105, you saw the POSIX threads library. Here's some example code using multiple threads:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

#define NUM_BAKERS 3
#define CAKES_TO_WIN 5

pthread_mutex_t oven_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t oven_cond = PTHREAD_COND_INITIALIZER;
int oven_in_use = 0;
int winner = -1;

void* baker(void* arg) {
    int id = *(int*)arg;
    int cakes_baked = 0;

    // Loop until a winner is seen; winner is only read while holding the mutex
    for (;;) {
        usleep(rand() % 5000 + 1000);  // Random cake-making time
        pthread_mutex_lock(&oven_mutex);
        while (oven_in_use && winner == -1) {
            printf("Baker %d is waiting for the oven...\n", id);
            pthread_cond_wait(&oven_cond, &oven_mutex);
        }

        if (winner != -1) {
            pthread_mutex_unlock(&oven_mutex);
            break;
        }

        oven_in_use = 1;
        pthread_mutex_unlock(&oven_mutex);

        // Baking a cake
        printf("Baker %d is baking a cake...\n", id);
        usleep(rand() % 3000 + 1000);  // Random baking time

        pthread_mutex_lock(&oven_mutex);
        oven_in_use = 0;
        cakes_baked++;
        printf("Baker %d finished baking! Total cakes: %d\n", id, cakes_baked);

        if (cakes_baked == CAKES_TO_WIN) {
            winner = id;
            printf("Baker %d wins the competition!\n", id);
        }

        // Wake every waiting baker: the oven is free, and if the competition
        // is over they all need to notice and stop waiting
        pthread_cond_broadcast(&oven_cond);
        pthread_mutex_unlock(&oven_mutex);

        // Small delay to allow other bakers a chance
        usleep(100000);  // 100ms
    }

    return NULL;
}

int main() {
    pthread_t bakers[NUM_BAKERS];
    int baker_ids[NUM_BAKERS];

    srand(time(NULL));

    printf("Welcome to the Pthreads Baking Competition!\n");
    printf("First baker to bake %d cakes wins!\n\n", CAKES_TO_WIN);

    for (int i = 0; i < NUM_BAKERS; i++) {
        baker_ids[i] = i + 1;
        pthread_create(&bakers[i], NULL, baker, &baker_ids[i]);
    }

    for (int i = 0; i < NUM_BAKERS; i++) {
        pthread_join(bakers[i], NULL);
    }

    printf("\nCompetition ended. Baker %d is the winner!\n", winner);

    return 0;
}

You can try running this code using

In this code, oven_in_use and winner are shared variables that are protected by a mutex. The baker function simulates a baker baking cakes, and the main function creates multiple threads to run the baker function. Each baker thread has its own stack and registers, but all the threads share the same memory space and global variables. (Technically, each baker's cakes_baked variable is in memory shared by all threads, but each thread only knows where to find data on its own stack, not the stacks of other threads, so the variable is essentially private to each thread.)

Does the same baker always win the competition?
