Harvey Mudd College

Parallel and Real-Time Computation

Fall 2009

Instructor:

Robert Keller 253 Olin (office hours 4-5 p.m. M-W, or as found in the office, which is most of the time), keller AT cs DOT hmc DOT edu, x 18483

Trailer:

 

It is a thrill to make a computer solve an interesting problem. It is a bigger thrill to get N >> 1 computers to cooperate in solving a single problem. As single-processor speeds approach a physical brick wall (ultimately set by the speed of light), it is increasingly important to be able to use multiple computers, or multiple processors ("cores") within a single computer, working together effectively for speedup and reliability. We study this subject from three angles: algorithms, architecture, and programming languages, and we will construct some programs on actual parallel processors. We then discuss a related issue: models and languages for real-time computation, where deadlines, rather than mere speedup, are important.

Catalog Description:

Characteristics and applications of parallel and real-time systems. Specification techniques, architectures, languages, design, and implementation. 3 credit hours.

Prerequisites: CS 105 and 140 (131 recommended), or permission of instructor.

Requirements and Grading:

For grading, we will be using a "Community of Scholars" model. This assumes that you are in the course primarily to learn, and thus that you will be a regular and diligent participant. You will take part in discussions, give presentations, and bring in new discoveries and questions. Students who participate fully will receive a grade of "A"; students who participate less will receive a lesser grade. There might be an exam about 2/3 of the way through, but if all is going well, perhaps we can do without it. Activities will include:

Lecture Slides (posted as we go):

 

Topic Outline:
(I reserve the right to modify the order and emphasis. For our mutual convenience, I have put in links for a few key words, mostly to Wikipedia articles.)

 

Historical overview

Concurrent computing | Parallel computing

Top 500 Supercomputer Sites

Added dimensions over serial/sequential computing

Computation Structures and Flavors of Parallelism

Data parallelism

Task parallelism

Threads

Fibers, coroutines

Processes

Pipelining, Instruction-Level Parallelism

Bit-Level Parallelism

Models and Architectures

Cellular Automata

Flynn's Taxonomy | Multiprocessing

SIMD (Single Instruction Stream, Multiple Data Stream)

SMP, Shared-Memory MIMD (Multiple I-Stream, Multiple D-Stream)

ASMP

Distributed MIMD

SPMD (Single-Program, Multiple Data)

Barrel Processor, Hyper-threading, HEP

Vector Processor, Array Processor, Systolic Arrays

VLIW

Associative Processor Array

Actor Model

Logic Programming: AND vs. OR Parallelism

PRAMs: CREW, CRCW, EREW

XMT (Explicit Multi-Threading): PRAM-on-Chip Vision

Uzi Vishkin's comments on PRAMs

VRAM (basis of NESL)

LogP machine

BSP (Bulk Synchronous Parallel)

NUMA, ccNUMA, COMA, SN (Shared Nothing)

Speedup, Amdahl's Law, Gustafson's Mitigation

Efficiency, Karp-Flatt Metric

Brent's Theorem: Reducing the number of processors
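Amdahl's Law above can be made concrete in a few lines. This sketch (plain Python, not course-provided code) computes the ideal speedup for a program whose parallelizable fraction is p, run on n processors:

```python
def amdahl_speedup(p, n):
    """Ideal speedup under Amdahl's Law: a fraction p of the work
    parallelizes perfectly over n processors; the rest stays serial."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with unlimited processors, a 5% serial fraction caps speedup
# at 1/(1-p) = 20x, no matter how large n grows.
print(amdahl_speedup(0.95, 8))      # modest speedup on 8 processors
print(amdahl_speedup(0.95, 10**6))  # approaches the 20x ceiling
```

Gustafson's observation mitigates this: in practice the problem size often grows with n, so the serial fraction of the *scaled* workload shrinks.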

Overheads

Context-switching

Latency-hiding

Contention

Load-Balancing, Load-Sharing, Load-Stealing

Correctness Issues

Determinacy, Races

Sequential Consistency

Serializability

Deadlock/Starvation/Livelock

Software Lockout

Software Transactional Memory

Priority Inversion

Validation

Simulation

Correctness proofs

Model-checking

Monte-Carlo

Architectures

Transputer

nCUBE

Connection Machine

Multi-core

Communication Structures

Network topologies

Crossbar

Clos Network

Granularity considerations

Caching issues

Scalability, Iso-Efficiency

Data Structures and Mapping

Domain Decomposition

Patterns

Synchronization

Fork&Join

Barrier

Rendezvous, CSP (Communicating Sequential Processes)

Producer/Consumer

Locking, Mutual Exclusion, Test-and-Set, Fetch-and-Add

Semaphores (The Little Book of Semaphores)

Spinlocks, contention, queuing

Time stamping

Bakery Algorithm

Readers/Writers

Monitors

Serializers

Blackboard, Linda, Tuple spaces, Object Spaces, JavaSpaces, etc.

Non-blocking synchronization

Range-Splitting, Map/Reduce (slides)

Prefix Sum or Scan, Pointer jumping, List Ranking

Butterfly
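Several of the synchronization topics above (semaphores, mutual exclusion, producer/consumer) combine in the classic bounded-buffer pattern. A minimal sketch in Python (illustrative only, not from the course materials): two counting semaphores track empty and full slots, and a lock protects the buffer itself.

```python
import threading
from collections import deque

BUFFER_SIZE = 4
buffer = deque()
empty = threading.Semaphore(BUFFER_SIZE)  # counts free slots
full = threading.Semaphore(0)             # counts filled slots
mutex = threading.Lock()                  # guards the shared buffer

def producer(items):
    for item in items:
        empty.acquire()                   # block if the buffer is full
        with mutex:
            buffer.append(item)
        full.release()                    # signal one filled slot

def consumer(n, results):
    for _ in range(n):
        full.acquire()                    # block if the buffer is empty
        with mutex:
            results.append(buffer.popleft())
        empty.release()                   # free the slot

results = []
p = threading.Thread(target=producer, args=(range(10),))
c = threading.Thread(target=consumer, args=(10, results))
p.start(); c.start(); p.join(); c.join()
print(results)  # items arrive in FIFO order
```

Note the ordering discipline: acquiring the counting semaphore *before* the mutex avoids the deadlock that would occur if a thread held the lock while waiting for a slot.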

Libraries

MPI (Message Passing Interface)

TBB (Intel Threading Building Blocks)

OpenMP

Charm++

STAPL (Standard Template Adaptive Parallel Library)

HPC++ (Parallel Standard Template Library)

Titanium (Java Dialect)

Programming Language Models

Java Threads, Posix Threads, C# Threads

UPC (Unified Parallel C)

Erlang

Star-P (for Matlab and NumPy)

Scala

Distributed Models

Clusters, Grid Computing, Globus

Cloud computing

Compilation Issues

Register renaming

Memory Issues

Cache Coherence

Memory Reclamation

Transactional Memory

Algorithms/Applications

Search

Sorting

Databases

Graph algorithms

Graphics: Ray tracing

Matrix computations: Jacobi, SOR, Red-Black

N-body problems

QuadTrees, Barnes-Hut, Fast Multipole Method

Particle-In-Cell Computations

Multigrid Method

Ant Colony Optimization, Swarm Intelligence

Timewarp Simulation
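Many of the algorithms above follow the range-splitting, map/reduce shape listed earlier: split the input into chunks, map over the chunks in parallel, then reduce the partial results. A minimal sketch (illustrative, not course code) using Python's standard concurrent.futures pool:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, workers=4):
    """Range-split data into chunks, sum each chunk in its own task,
    then reduce the partial sums into one total."""
    chunk = (len(data) + workers - 1) // workers   # ceiling division
    ranges = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum, ranges)           # map phase
    return sum(partials)                           # reduce phase

print(parallel_sum(list(range(1001))))  # 500500, same as the serial sum
```

The same skeleton applies whenever the combining operator is associative (sum, max, sort-then-merge); only the map and reduce functions change.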

Fault Tolerance

Real-Time

Additional correctness measures

Scheduling Algorithms

EDF (Earliest Deadline First) Scheduling

Rate-Monotonic Scheduling

Real-Time Operating Systems

Priority Protocols

Temporal Logic

Model-Checking
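EDF scheduling above can be illustrated with a tiny non-preemptive simulation (a sketch with made-up task parameters, not course material): whenever the processor is free, run the ready job whose deadline is nearest.

```python
import heapq

def edf_schedule(jobs):
    """Non-preemptive earliest-deadline-first scheduling.
    jobs: list of (release_time, run_time, deadline) tuples.
    Returns the execution order, or raises if a deadline is missed."""
    jobs = sorted(jobs)                         # by release time
    ready, order, t, i = [], [], 0, 0
    while i < len(jobs) or ready:
        # admit every job released by time t
        while i < len(jobs) and jobs[i][0] <= t:
            r, c, d = jobs[i]
            heapq.heappush(ready, (d, r, c))    # heap keyed on deadline
            i += 1
        if not ready:                           # idle until next release
            t = jobs[i][0]
            continue
        d, r, c = heapq.heappop(ready)          # nearest deadline first
        t += c
        if t > d:
            raise RuntimeError(f"missed deadline {d} at time {t}")
        order.append((r, c, d))
    return order

# Three jobs; EDF runs the tightest deadline first among the ready jobs.
print(edf_schedule([(0, 2, 10), (0, 1, 3), (1, 2, 6)]))
```

Preemptive EDF, which is optimal on a uniprocessor, differs only in that a newly released job with an earlier deadline would interrupt the running one.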

 

Reading Resources

 

It is not required that you buy a book. However, if you wish to do so, the first one listed is most recommended, followed by the second.

 

Patterns for Parallel Programming, by Timothy G. Mattson, et al., Addison-Wesley (2004).

 

The Art of Multiprocessor Programming, by Maurice Herlihy & Nir Shavit, Morgan Kaufmann (2008).

Using MPI, 2nd Edition, by William Gropp, et al., MIT Press (1999).

Intel Threading Building Blocks, by James Reinders, O'Reilly (2007).

 

Java Concurrency in Practice, by Brian Goetz, Addison-Wesley (2006).

 

 

Web Resources

 

These are just a few starting points. You'll easily find a lot more on the web.

 

I will also post some reading assignments from literature on the web.

 

Parallel Computing Works (on-line book)

MPI (Message Passing Interface) | MPI Tutorial (html)

 

Notre Dame MPI Tutorial Pt. 1 | Pt. 2 | Pt. 3

 

HMC CS Tealia Cluster | Rocks Overview | (HMC Bio Altanius Cluster)

 

Minicourse on Multithreaded Programming | The Cilk Project

 

Posix Threads

Parallel Patterns

High Performance Computing and Communications Glossary

Designing and Building Parallel Programs (Ian Foster book)

Slides for the Grama, Gupta, Karypis, and Kumar book