Computer Systems Performance Analysis

CS 147 is intended to teach you how to evaluate the performance of computer systems in various ways. The concentration will be on experimental methods, but I am not promising to restrict myself to experimentation. Topics that will certainly be covered in the course include:

- Types of measurement techniques and how to select them
- Types of workloads and how to select and characterize them
- Common mistakes in performance measurement and benchmarking
- Review of probability and statistics
- Methods of statistical analysis
- Experimental design techniques
- Methods of data presentation
- Methods of measuring data in real computer systems

- Discrete-event simulation techniques
- Random-number generation
- Matching simulation models to real computer systems
- Basic queueing theory
- Markov processes
- M/M/k queues

When you complete this course, you should be able to:

- Identify mistakes in popular benchmarks
- Design a robust and meaningful experimental measurement
- Analyze the validity of someone else's performance claims
- Statistically analyze the results of an experiment or simulation
- Present data in a clear and cohesive fashion

The prerequisites listed in the catalog are Math 62 and CS 70. However, I won't be enforcing the Math 62 prereq because it's a new course. On the other hand, I recommend CS 110, CS 131, and CS 140 as reasonable corequisites, since in your project you will be working with complex software systems. If you try to take CS 147 having had only CS 70 and with no other CS courses in the same term, the project will probably give you trouble.

The textbook for this course is Raj Jain, *The Art of Computer Systems Performance Analysis*, Wiley, 1991. Even if you don't take the course, I strongly recommend this book to anyone who does experimental computer science.

There will be a small number of homework assignments early in the course, to provide you with practice in statistical analysis. There will also be either one or two exams (probably both a midterm and a final).

The primary graded component of the course will be a project in performance analysis. Depending on the number of students, the projects will probably be done by small teams of 2-4 people. I expect that most people will do some sort of experimental measurement, though other options can be proposed.

Each team will present their project to the class at the end of the term, and will critique the other teams' projects.

Examples of typical projects might include:

- Measure and characterize the load on the CS network as a function of assignment due dates.
- Measure and contrast the performance of several Linux file systems.
- Compare the performance of an Apple laptop to a PC-based one.
- Identify the impact of an MP3 player on the behavior of a compiler.
- Explain (experimentally) why g++3 is so much slower than g++2.
- Characterize how Turing's scheduler responds in practice to various `nice` values.
- Characterize the performance of several database systems.
- Measure the performance of a Web site under various conditions.
- Implement several heuristics for the Traveling Salesman Problem and compare their performance.
- (If simulation is covered) Create a simulation of the serving lines at Platt, and propose solutions to the crowding problems.

*© 2003, Geoff Kuenning*

This page is maintained by Geoff Kuenning.