Harvey Mudd College
Parallel and Real-Time Computation
Fall 2009
Robert Keller, 253 Olin (office hours 4-5 p.m. M-W, or as found in the office, which is most of the time), keller AT cs DOT hmc DOT edu, x18483
Trailer:
It is a thrill to make a computer solve an interesting problem. It is a bigger thrill to get N >> 1 computers to cooperate in solving a single problem. As the speed of processors is now reaching an ultimate brick wall (as determined by the speed of light), it is increasingly important to be able to use multiple computers, or multiple processors ("cores") within a single computer, to work together effectively, for purposes of speedup and reliability. We study this issue from three angles: algorithms, architecture, and programming languages. We will construct some programs on actual parallel processors. Then we discuss a related issue: models and languages for computation in real time, where deadlines, rather than mere speedup, are important.
Characteristics and applications of parallel and real-time systems. Specification techniques, architectures, languages, design, and implementation. 3 credit hours.
Prerequisites: CS 105 and 140 (131 recommended), or permission of instructor.
For grading, we will be using a "Community of Scholars" model. This assumes that you are in the course primarily to learn, and thus you will be a regular and diligent participant. Activities will include taking part in discussions, giving presentations, and bringing in new discoveries and questions. Students who participate fully will receive a grade of "A". Students who participate less will receive a lesser grade. There might be an exam about 2/3 of the way through, but if all is going well, maybe we can do without it.
Topic Outline:
(I reserve the right to modify the order and emphasis. For our mutual convenience, I have put in links for a few key words, mostly to Wikipedia articles.)
Concurrent computing | Parallel computing
Added dimensions over serial/sequential computing
Computation Structures and Flavors of Parallelism
Processes
Pipelining, Instruction-Level Parallelism
Models and Architectures
Flynn's Taxonomy | Multiprocessing
SIMD (Single Instruction Stream, Multiple Data Stream)
SMP, Shared-Memory MIMD (Multiple I-Stream, Multiple D-Stream)
Distributed MIMD
SPMD (Single-Program, Multiple Data)
Barrel Processor, Hyper-threading, HEP
Vector Processor, Array Processor, Systolic Arrays
Associative Processor Array
Logic Programming: AND vs. OR Parallelism
PRAMs: CREW, CRCW, EREW
XMT (Explicit Multi-Threading): PRAM-on-Chip Vision
Uzi Vishkin's comments on PRAMs
VRAM (basis of NESL)
BSP (Bulk Synchronous Parallel)
NUMA, ccNUMA, COMA, SN (Shared Nothing)
Speedup, Amdahl's Law, Gustafson's Mitigation (formulas sketched after this group)
Efficiency, Karp-Flatt Metric
Brent's Theorem: Reducing number of processors
Overheads
Contention
Load-Balancing, Load-Sharing, Load-Stealing
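For quick reference, the standard textbook statements of these measures can be sketched as follows (this notation is the usual one, not necessarily what we will use in class: f is the serial fraction of the work, p the number of processors, and ψ a measured speedup):

    % Amdahl's Law: speedup bound for a fixed problem size.
    \[ S(p) = \frac{1}{f + (1-f)/p} \]
    % Gustafson's Law: scaled speedup when the problem grows with p.
    \[ S(p) = f + p\,(1-f) \]
    % Karp-Flatt metric: serial fraction estimated from a measured speedup.
    \[ e = \frac{1/\psi - 1/p}{1 - 1/p} \]

For example, with f = 0.1 and p = 100, Amdahl's bound gives S ≈ 9.2, while Gustafson's scaled speedup is 90.1; the gap between the two is the "mitigation" referred to above.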
Correctness Issues
Determinacy, Races
Validation
Simulation
Correctness proofs
Model-checking
Monte-Carlo
Architectures
Communication Structures
Network topologies
Granularity considerations
Caching issues
Data Structures and Mapping
Domain Decomposition
Patterns
Fork&Join
Barrier
Rendezvous, CSP (Communicating Sequential Processes)
Locking, Mutual Exclusion, Test-and-Set, Fetch-and-Add (see the sketch after this group)
Semaphores (The Little Book of Semaphores)
Spinlocks, contention, queuing
Time stamping
Bakery Algorithm
Readers/Writers
Serializers
Blackboard, Linda, Tuple Spaces, Object Spaces, JavaSpaces, etc.
Range-Splitting, Map/Reduce (slides)
Prefix Sum or Scan, Pointer Jumping, List Ranking
Butterfly
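To make the locking entries above concrete, here is a minimal test-and-set spinlock sketched in Java. This is an illustration, not course-provided code; it uses AtomicBoolean's compareAndSet in place of a raw hardware test-and-set instruction, and the class names are my own:

    import java.util.concurrent.atomic.AtomicBoolean;

    // Minimal test-and-set spinlock. AtomicBoolean.compareAndSet plays
    // the role of the hardware test-and-set primitive.
    class SpinLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        void lock() {
            // Spin until we succeed in flipping held from false to true.
            while (!held.compareAndSet(false, true)) {
                // Busy-wait: under contention this burns cycles, which is
                // why queuing and backoff also appear on the topic list.
            }
        }

        void unlock() {
            held.set(false);
        }
    }

    class SpinLockDemo {
        static int counter = 0;  // shared state protected by the lock

        public static void main(String[] args) throws InterruptedException {
            final SpinLock lock = new SpinLock();
            Runnable task = new Runnable() {
                public void run() {
                    for (int i = 0; i < 100000; i++) {
                        lock.lock();
                        try { counter++; } finally { lock.unlock(); }
                    }
                }
            };
            Thread t1 = new Thread(task), t2 = new Thread(task);
            t1.start(); t2.start();
            t1.join(); t2.join();
            System.out.println(counter);  // 200000 with the lock; racy without it
        }
    }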
Libraries
MPI (Message Passing Interface)
TBB (Intel Threading Building Blocks)
STAPL (Standard Template Adaptive Parallel Library)
HPC++ (Parallel Standard Template Library)
Titanium (Java Dialect)
Programming Language Models
Java Threads, Posix Threads, C# Threads (see the sketch after this group)
Star-P (for Matlab and NumPy)
Distributed Models
Clusters, Grid Computing, Globus
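As a minimal illustration of the Java thread model (and of the Fork&Join pattern listed earlier), the following sketch forks two java.lang.Thread workers to sum halves of an array and joins them. The class and field names are my own choices, not from course materials:

    // Fork/join with plain java.lang.Thread: each worker sums half of an
    // array; the main thread joins both and combines the partial results.
    class PartialSum extends Thread {
        private final int[] data;
        private final int lo, hi;  // half-open range [lo, hi)
        long sum;                  // partial result, read after join()

        PartialSum(int[] data, int lo, int hi) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
        }

        public void run() {
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
        }

        public static void main(String[] args) throws InterruptedException {
            int[] data = new int[1000];
            for (int i = 0; i < data.length; i++) data[i] = i + 1;

            int mid = data.length / 2;
            PartialSum left = new PartialSum(data, 0, mid);
            PartialSum right = new PartialSum(data, mid, data.length);
            left.start(); right.start();  // fork
            left.join(); right.join();    // join
            System.out.println(left.sum + right.sum);  // prints 500500
        }
    }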
Compilation Issues
Memory Issues
Memory Reclamation
Search
Sorting
Databases
Graph algorithms
Graphics: Ray tracing
Matrix computations: Jacobi, SOR, Red-Black
N-body problems
QuadTrees, Barnes-Hut, Fast Multipole Method
Particle-In-Cell Computations
Ant Colony Optimization, Swarm Intelligence
Fault Tolerance
Real-Time
Additional correctness measures
EDF (Earliest Deadline First) Scheduling (see the sketch below)
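As a small sketch of EDF, the following Java fragment orders ready jobs by absolute deadline with a priority queue. A real real-time kernel would also handle preemption and admission tests; this only shows the ordering rule, and the class names are hypothetical:

    import java.util.Comparator;
    import java.util.PriorityQueue;

    // EDF dispatch sketched as a priority queue ordered by absolute
    // deadline: the ready job with the earliest deadline runs first.
    class EdfDemo {
        static class Job {
            final String name;
            final long deadline;  // absolute deadline, e.g. in milliseconds

            Job(String name, long deadline) {
                this.name = name;
                this.deadline = deadline;
            }
        }

        public static void main(String[] args) {
            PriorityQueue<Job> ready = new PriorityQueue<Job>(16,
                new Comparator<Job>() {
                    public int compare(Job a, Job b) {
                        return a.deadline < b.deadline ? -1
                             : a.deadline > b.deadline ? 1 : 0;
                    }
                });
            ready.add(new Job("logger", 300));
            ready.add(new Job("control", 100));
            ready.add(new Job("display", 200));

            // Dispatch in deadline order: control, display, logger.
            while (!ready.isEmpty()) {
                Job next = ready.poll();
                System.out.println("run " + next.name
                                   + " (deadline " + next.deadline + ")");
            }
        }
    }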
Reading Resources
It is not required that you buy a book. However, if you wish to do so, the first one listed is most recommended, followed by the second.
Patterns for Parallel Programming, by Timothy G. Mattson et al., Addison-Wesley (2004).
The Art of Multiprocessor Programming, by Maurice Herlihy & Nir Shavit, Morgan Kaufmann (2008).
Intel Threading Building Blocks, by James Reinders, O'Reilly (2007).
Java Concurrency in Practice, by Brian Goetz, Addison-Wesley (2006).
Web Resources
These are just a few starting points. You'll easily find a lot more on the web. I will also post some reading assignments from literature on the web.
Parallel Computing Works (on-line book)
MPI (Message Passing Interface) | MPI Tutorial (html)
Notre Dame MPI Tutorial Pt. 1 | Pt. 2 | Pt. 3
HMC CS Tealia Cluster | Rocks Overview | (HMC Bio Altanius Cluster)
Minicourse on Multithreaded Programming | The Cilk Project
High Performance Computing and Communications Glossary
Designing and Building Parallel Programs (Ian Foster book)