Clinic Projects

Since joining the department, I've supervised five clinic projects. The two most recent projects have been strongly research oriented. I'll supervise my sixth project in the fall of 2006.


“Clinic” is Harvey Mudd College's capstone experience in computer science and engineering. A team of students spends a year working on a project defined by a company or organization. Each team has a faculty advisor and a liaison from the sponsoring organization to provide information and resources. The results of the project become the property of the sponsor.

The Clinic concept was invented by the college's engineering department in the early 1960s. Since that time, over a thousand clinic projects have been conducted at the college by students in various departments. The computer science department's tenth anniversary of Clinic participation occurred in 2003.

2005–2006: Document Classification

I hope to get a chance to post the details here soon...

2004–2005: Text Classification

I hope to get a chance to post the details here soon...

2003–2004: Text Classification

The sponsor for this project was NeonGecko, a small startup company in Seattle. NeonGecko's core businesses are e-commerce sites, but the company is also interested in providing chat sites. NeonGecko needed assistance in the automated management of these sites, particularly in the area of automated content classification.

The NeonGecko clinic team developed two tools to aid NeonGecko in its management of thousands of community websites: unmoderated websites filled almost entirely with user-supplied content. The team created a post-classification system that uses supervised-learning techniques to determine the topic (or topics) of a given post. Using this system, NeonGecko administrators can identify misplaced posts and learn where these misplaced posts likely belong. The team also created a topic-discovery system, powered by unsupervised-learning techniques, that NeonGecko administrators can use to search for new and emerging topics among their corpus of posts.
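To give a feel for what supervised post classification involves, here is a minimal sketch. It is not the team's system: it assumes a plain naive Bayes classifier with add-one smoothing, and the function names, topics, and sample posts are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a post and split it on whitespace."""
    return text.lower().split()


def train(labeled_posts):
    """Count words per topic from (text, topic) pairs."""
    word_counts = defaultdict(Counter)  # topic -> word -> count
    topic_counts = Counter()            # topic -> number of posts
    vocabulary = set()
    for text, topic in labeled_posts:
        words = tokenize(text)
        word_counts[topic].update(words)
        topic_counts[topic] += 1
        vocabulary.update(words)
    return word_counts, topic_counts, vocabulary


def classify(text, model):
    """Return the most probable topic under naive Bayes with add-one smoothing."""
    word_counts, topic_counts, vocabulary = model
    total_posts = sum(topic_counts.values())
    best_topic, best_score = None, -math.inf
    for topic in topic_counts:
        # Log prior: fraction of training posts with this topic.
        score = math.log(topic_counts[topic] / total_posts)
        total_words = sum(word_counts[topic].values())
        for word in tokenize(text):
            count = word_counts[topic][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


def find_misplaced(posts, model):
    """Yield (text, posted_in, suggested) for posts whose predicted topic differs."""
    for text, posted_in in posts:
        suggested = classify(text, model)
        if suggested != posted_in:
            yield text, posted_in, suggested


# Hypothetical training data: posts already filed under the right topic.
training = [
    ("my cat sleeps all day", "pets"),
    ("best food for a puppy", "pets"),
    ("the dog chased the cat", "pets"),
    ("new engine for my car", "cars"),
    ("car tires and brakes", "cars"),
    ("engine oil change tips", "cars"),
]
model = train(training)

# Two posts that both appeared on the "pets" site; one of them is misplaced.
incoming = [
    ("engine oil for my car", "pets"),
    ("my dog and my cat", "pets"),
]
for text, posted_in, suggested in find_misplaced(incoming, model):
    print(f"'{text}' was posted in {posted_in} but looks like {suggested}")
```

The same comparison between where a post was filed and where the classifier would put it is what lets an administrator both spot a misplaced post and see a suggested destination for it.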

2002–2003: Audio Classification

The sponsor for this project was Auditude, a small startup company founded by a Mudd alumnus. Auditude's key asset is in the field of music recognition. From a short sample of a piece of music, they can generate a fingerprint that can be used to identify the recording (provided that the recording has previously been fingerprinted and that fingerprint stored in their database).

Auditude came to the college wishing to go beyond their existing technology. Their existing fingerprint scheme was specific to a particular recording: recordings that sounded virtually identical from a human perspective could have vastly different fingerprints. Thus, they wanted to investigate other aspects of music similarity. For example, if a computer system can automatically determine whether two recordings are similar, it can assist in managing a music collection and make recommendations for possible new acquisitions.

The Auditude clinic team investigated content-based similarity relationships between recorded musical performances. Because similarity is a complex concept involving many judgments, the team focused its attention on a combination of rhythm, timbre, and apparent loudness. They developed software that extracts these features from a recording and uses them to organize music. The system can be used to categorize music, to make recommendations, and to generate playlists that arrange music in a sequence with smooth transitions between songs.
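One simple way to picture playlist generation from extracted features is sketched below. It is an illustration under stated assumptions, not the team's algorithm: it assumes each song has already been reduced to a short numeric feature vector, measures similarity with Euclidean distance, and orders songs greedily. The song names and feature values are invented.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_playlist(features, start):
    """Order songs greedily: always play the unplayed song closest to the current one."""
    playlist = [start]
    remaining = set(features) - {start}
    while remaining:
        current = features[playlist[-1]]
        # Sorting the candidates first makes tie-breaking deterministic.
        nearest = min(
            sorted(remaining),
            key=lambda name: distance(current, features[name]),
        )
        playlist.append(nearest)
        remaining.remove(nearest)
    return playlist


# Made-up (rhythm, timbre, loudness) values, each scaled to the range 0..1.
songs = {
    "ballad": (0.2, 0.3, 0.2),
    "folk": (0.3, 0.4, 0.3),
    "pop": (0.6, 0.6, 0.6),
    "rock": (0.7, 0.7, 0.8),
    "metal": (0.9, 0.9, 1.0),
}

print(build_playlist(songs, "ballad"))
```

A greedy ordering keeps each individual transition small but does not minimize the total change over the whole playlist; a real system would also need to weight the features, since rhythm, timbre, and loudness are not naturally on the same scale.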

2001–2002: Distributed Computing

The sponsor for this project was United Devices, a company founded by the creators of both and the SETI@home project. They wished to show how their MetaProcessor Platform could provide a powerful distributed computing solution for the bioinformatics community.

The clinic team ported two bioinformatics applications, RepeatMasker and Phylip, to the MetaProcessor Platform. Both applications are widely used in the bioinformatics community and are very computationally intensive. In the first semester the team familiarized themselves with the MetaProcessor API and made significant progress on porting RepeatMasker to the MetaProcessor Platform. In the second semester they completed the RepeatMasker port and ported some of the programs in the Phylip package.
