Harvey Mudd College
Computer Science 153
Assignment 5
Due Sunday, November 9, by 11:59pm
Eigenfaces for Recognition -- and maybe detection...
Thanks to S. Lazebnik for the idea and scaffolding for this assignment
The goal of the assignment is to implement a nearest-neighbor face recognition algorithm based on Turk and Pentland's Eigenfaces for Recognition. That paper is not as
straightforward a guide to implementation as some prior assignments' source material. There is also an online write-up by one
of the original authors (now at UCSB): Scholarpedia's
Eigenfaces entry.
We will be using a database consisting
of 10 images each of 40 different people. The image resolution is 32x32, corresponding to an extrinsic appearance-space
dimension of 1024. 7 images of each of the 40 people will be used for training; the other 3 will be used for testing.
The task is to correctly identify each of the test images using a low-dimensional
eigenface representation. As with other assignments, you should choose a direction to extend this assignment:
face-detection in images is one possibility. It's described below.
Homework materials
Here are the materials, including the face database
(from here), training and test indices,
and sample code.
For this assignment, I would encourage you to work in Matlab, because it has a single call to
extract principal components from a data set. Those principal components are simply another name for the
eigenfaces: a basis along which the variance of the data is maximized.
If you do choose to do this hw with OpenCV, that alone would certainly count as an extra feature
(on top of the assignment itself).
Directions
- Read over the provided code (named hw5_template.m) - or, at least, the first part of it.
It begins by loading the face database and partitioning it into training and test sets. As mentioned,
we will be using 7 images per person for training and 3 images for testing.
- Run that script -- you should see the 100 faces that are the result of the display_faces
call near the top of the file. Modify this call so that you see all of the faces in the
training and test sets (combined, they're named fea).
- Perform PCA on the training faces and extract the top K components. Use the MATLAB
princomp function. (In OpenCV, you'll want to use
cvCalcCovarMatrix.)
- Find the mean face of the 280 training images (Matlab's mean function computes this).
You can subtract the mean from all 280 images at once by replicating the mean into
a matrix of repeated copies with repmat.
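If it helps to see the centering and PCA steps spelled out, here is a rough NumPy sketch of what the Matlab calls above (mean, repmat, princomp) compute. The data here is a random placeholder, not the actual face database:

```python
import numpy as np

# Stand-in data: 280 training faces as rows of length 1024 (32x32 flattened).
# (Random placeholders -- in the real assignment these come from the database.)
rng = np.random.default_rng(0)
train = rng.random((280, 1024))

# Mean face: the average over the 280 rows (what Matlab's mean computes column-wise).
mean_face = train.mean(axis=0)

# Subtract the mean from all 280 images at once; NumPy broadcasting plays the
# role that repmat plays in Matlab.
centered = train - mean_face

# PCA via SVD of the centered data: the rows of Vt are the principal
# components (the eigenfaces), ordered by decreasing variance -- the same
# basis princomp returns as its columns.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = Vt   # shape (280, 1024): at most 280 components for 280 samples
```

Note that with 280 training images you get at most 280 meaningful components, even though the ambient dimension is 1024.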
- Use display_faces to show the top 5, 25, and 100 eigenfaces. (Include these in your write-up.)
- Choose a value for K (later this will vary -- I started with 20).
Compute the K-dimensional projection of the training images (after removing the mean!) onto the eigen-face space.
For each training face you will obtain K coefficients: one for each of the top K eigenfaces.
Then, reconstruct the faces using only the K top eigenfaces. (Remember to add back the mean face!)
Display the first 100 of these reconstructed faces (and include that visualization in your write-up).
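The projection-and-reconstruction step above can be sketched in NumPy as follows (again on random stand-in data; the real version would use the database faces and your princomp output):

```python
import numpy as np

# Stand-in training data (random placeholders for the real face database).
rng = np.random.default_rng(1)
train = rng.random((280, 1024))
mean_face = train.mean(axis=0)
centered = train - mean_face
_, _, Vt = np.linalg.svd(centered, full_matrices=False)

K = 20
top = Vt[:K]                      # the top-K eigenfaces, one per row

# K coefficients per face: project the mean-subtracted images onto the basis.
coeffs = centered @ top.T         # shape (280, K)

# Reconstruct each face as a linear combination of the K eigenfaces,
# remembering to add the mean face back.
recon = coeffs @ top + mean_face  # shape (280, 1024)
```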
- Next, run the same procedure on the test faces. That is, project them onto the K eigenfaces of
face space. With those coefficients, reconstruct the test faces as a linear combination of the K
top eigenfaces. Include a snapshot of at least some of these reconstructed test faces in your write-up.
- For each test image, find the training image that is "closest" (in the sense of Euclidean distance)
to the test image in the face space, and assign the label (person index) of the training image
to the test image. Use the names nn_ind and estimated_label for the column vectors whose
i-th entries hold, for the test image with row index i, the index of its nearest training
neighbor and that neighbor's label.
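The nearest-neighbor step might look like the following NumPy sketch. The coefficients here are made up for illustration, but the names nn_ind and estimated_label match the ones requested above:

```python
import numpy as np

# Made-up projections: 280 training and 120 test faces, K = 20 coefficients each.
rng = np.random.default_rng(2)
K = 20
train_coeffs = rng.random((280, K))
test_coeffs = rng.random((120, K))
train_label = np.repeat(np.arange(40), 7)  # person index for each training image

# Squared Euclidean distance between every test and every training projection.
d = ((test_coeffs[:, None, :] - train_coeffs[None, :, :]) ** 2).sum(axis=2)

nn_ind = d.argmin(axis=1)              # index of each test image's nearest neighbor
estimated_label = train_label[nn_ind]  # label carried over from that neighbor
```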
- The provided code will print a classification rate for the chosen value of K (the number of
eigenfaces used).
- The code at the bottom of hw5_template.m will produce a number of batch-visualizations in which
each misclassified face is labeled as such. You may want to turn a copy of this script (with that
visualization removed) into a function whose input is K and whose output
is the classification rate as a percentage. You should plot classification success vs. K for at least
10 values of K (though not necessarily uniformly chosen -- see the write-up guidelines, below).
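One possible shape for that function, sketched in NumPy (the signature and variable names are my invention, not part of the provided template):

```python
import numpy as np

def classification_rate(K, centered_train, centered_test, Vt, train_label, test_label):
    """Percentage of test faces whose nearest training neighbor, measured in
    the K-dimensional eigenface coefficient space, has the same label."""
    top = Vt[:K]                               # top-K eigenfaces as rows
    tr = centered_train @ top.T                # (n_train, K) coefficients
    te = centered_test @ top.T                 # (n_test, K) coefficients
    d = ((te[:, None, :] - tr[None, :, :]) ** 2).sum(axis=2)
    estimated = train_label[d.argmin(axis=1)]  # label of the nearest neighbor
    return 100.0 * (estimated == test_label).mean()
```

Calling this in a loop over your chosen K values gives exactly the data needed for the plot.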
What to hand in
- The picture of the mean face and the top 5, 25, and 100 eigenfaces computed by PCA (principal component analysis).
- A plot of the nearest-neighbor classification rate as a function of K. You can choose any sampling of at least 10
values of K from 1 to 1024,
as long as it captures the trend of how classification performance changes as a function of K (i.e., we expect
performance to be poor for extremely low K, but then to rise very rapidly and level off at some point).
- Include at least two pictures of incorrectly classified faces and their nearest neighbors from the
training set: one at K = 10, one at K = 100.
The sample code can be easily adapted to do this.
- Include a short description of any problems you encountered along the way.
Extensions -- face detection
- Extract a few 32x32 patches from non-face grayscale images and
investigate their behavior when they are projected onto the ``face space.'' Project these patches
onto the face space for different values of K, and display the reconstructed versions of these patches,
along with the reconstruction errors. Compare the reconstruction errors to those for face images from
the database. You can also try it with 32x32 face patches not from the database. Try to decide what
threshold on the reconstruction error can distinguish between face and non-face images (for a K
of your choice).
- Either in Matlab or OpenCV, write a program that loads an image and then searches for
faces, showing where the top three most "face-like" regions are in the image. You
will want to scale any image you use so that it matches the 32x32 face size of the
database you built above. Alternatively, you could use Matlab's imresize
function in a loop and find the best three faces at any scale.
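One way to structure that search is a single-scale sliding window, sketched here in NumPy (the function name and step size are hypothetical, and the multi-scale loop is left out):

```python
import numpy as np

def face_likeness_map(image, mean_face, top, step=4):
    """Scan 32x32 windows across a grayscale image and score each one by its
    reconstruction error in face space (rows of `top` are K eigenfaces);
    small error suggests a face-like region. Returns the best three windows
    as (error, row, col) tuples."""
    H, W = image.shape
    scores = []
    for r in range(0, H - 32 + 1, step):
        for c in range(0, W - 32 + 1, step):
            patch = image[r:r + 32, c:c + 32].reshape(-1) - mean_face
            recon = (patch @ top.T) @ top
            scores.append((np.linalg.norm(patch - recon), r, c))
    scores.sort()       # smallest reconstruction error first
    return scores[:3]   # the three most "face-like" windows
```

Rescaling the image (imresize in Matlab) before each call and keeping the best three scores over all scales handles the multi-scale variant.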
- Other ideas are welcome, too -- perhaps adapt your face recognition to handle other objects
from a database you create or find available online... Or, you could build a face-database-creation
interface that lets you click on the nose of a face and stores a square region around
that click in a suitable format. If you'd like to use the images we took in class, they
are available from this link.
But only one feature is required to earn the credit
for writing an extension.
- Whichever path you choose, be sure to share your progress and results in your write-up.
Good luck with this hw #5!