Harvey Mudd College
Computer Science 153
Assignment 5
Due Sunday, November 9, by 11:59pm
Eigenfaces for Recognition -- and maybe detection...
Thanks to S. Lazebnik for the idea and scaffolding for this assignment
The goal of the assignment is to implement a nearest-neighbor face recognition algorithm based on Turk and Pentland's Eigenfaces for Recognition. That paper is not as
straightforward a guide to implementation as some prior assignments' source material. There is also an online write-up by one
of the original authors (now at UCSB): Scholarpedia's
Eigenfaces entry.
We will be using a database consisting
of 10 images each of 40 different people. The image resolution is 32x32, corresponding to an extrinsic appearance-space
dimension of 1024. 7 images of each of the 40 people will be used for training; the other 3 will be used for testing.
The task is to correctly identify each of the test images using a low-dimensional
eigenface representation. As with other assignments, you should choose a direction to extend this assignment:
face-detection in images is one possibility. It's described below.
Homework materials
Here are the materials, including the face database
(from here), training and test indices,
and sample code.
For this assignment, I would encourage you to work in Matlab, because it has a single call to
extract principal components from a data set. Those principal components are simply another name for the
eigenfaces: a basis along which the variance of the data is maximized.
If you do choose to do this hw with OpenCV, that alone would certainly count as an extra feature
(on top of the assignment itself).
Directions
- Read over the provided code (named hw5_template.m) - or, at least, the first part of it.
It begins by loading the face database and partitioning it into training and test sets. As mentioned,
we will be using 7 images per person for training and 3 images for testing.
- Run that script -- you should see the 100 faces that are the result of the display_faces
call near the top of the file. Modify this call so that you see all of the faces in the
training and test sets (combined, they're named fea).
- Perform PCA on the training faces and extract the top K components. Use the MATLAB
princomp function. (In OpenCV, you'll want to use
cvCalcCovarMatrix.)
- Find the mean face of the 280 training images (Matlab's mean function computes this).
You can subtract the mean from all 280 images at once by replicating the mean into
a matrix of repeated copies with repmat.
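If it helps to see the centering and PCA steps spelled out, here is a rough NumPy sketch of what the Matlab calls above (mean, repmat, princomp) compute. The data here is a random placeholder, not the actual face database:

```python
import numpy as np

# Stand-in data: 280 training faces as rows of length 1024 (32x32 flattened).
# (Random placeholders -- in the real assignment these come from the database.)
rng = np.random.default_rng(0)
train = rng.random((280, 1024))

# Mean face: the average over the 280 rows (what Matlab's mean computes column-wise).
mean_face = train.mean(axis=0)

# Subtract the mean from all 280 images at once; NumPy broadcasting plays the
# role that repmat plays in Matlab.
centered = train - mean_face

# PCA via SVD of the centered data: the rows of Vt are the principal
# components (the eigenfaces), ordered by decreasing variance -- the same
# basis princomp returns as its columns.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = Vt   # shape (280, 1024): at most 280 components for 280 samples
```

Note that with 280 training images you get at most 280 meaningful components, even though the ambient dimension is 1024.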
- Use display_faces to show the top 5, 25, and 100 eigenfaces. (Include these in your write-up.)
- Choose a value for K (later this will vary -- I started with 20).
Compute the K-dimensional projection of the training images (after removing the mean!) onto the eigen-face space.
For each training face you will obtain K coefficients: one for each of the top K eigenfaces.
Then, reconstruct the faces using only the K top eigenfaces. (Remember to add back the mean face!)
Display the first 100 of these reconstructed faces (and include that visualization in your write-up).
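The projection-and-reconstruction step above can be sketched in NumPy as follows (again on random stand-in data; the real version would use the database faces and your princomp output):

```python
import numpy as np

# Stand-in training data (random placeholders for the real face database).
rng = np.random.default_rng(1)
train = rng.random((280, 1024))
mean_face = train.mean(axis=0)
centered = train - mean_face
_, _, Vt = np.linalg.svd(centered, full_matrices=False)

K = 20
top = Vt[:K]                      # the top-K eigenfaces, one per row

# K coefficients per face: project the mean-subtracted images onto the basis.
coeffs = centered @ top.T         # shape (280, K)

# Reconstruct each face as a linear combination of the K eigenfaces,
# remembering to add the mean face back.
recon = coeffs @ top + mean_face  # shape (280, 1024)
```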
- Next, run the same procedure on the test faces. That is, project them onto the K eigenfaces of
face space. With those coefficients, reconstruct the test faces as a linear combination of the K
top eigenfaces. Include a snapshot of at least some of these reconstructed test faces in your write-up.
- For each test image, find the training image that is "closest" (in the sense of Euclidean distance)
to the test image in the face space, and assign the label (person index) of the training image
to the test image. Use the names nn_ind and estimated_label for the column vectors whose
i-th entries hold, for the test image with row index i, the index of its nearest training
neighbor and that neighbor's label.
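The nearest-neighbor step might look like the following NumPy sketch. The coefficients here are made up for illustration, but the names nn_ind and estimated_label match the ones requested above:

```python
import numpy as np

# Made-up projections: 280 training and 120 test faces, K = 20 coefficients each.
rng = np.random.default_rng(2)
K = 20
train_coeffs = rng.random((280, K))
test_coeffs = rng.random((120, K))
train_label = np.repeat(np.arange(40), 7)  # person index for each training image

# Squared Euclidean distance between every test and every training projection.
d = ((test_coeffs[:, None, :] - train_coeffs[None, :, :]) ** 2).sum(axis=2)

nn_ind = d.argmin(axis=1)              # index of each test image's nearest neighbor
estimated_label = train_label[nn_ind]  # label carried over from that neighbor
```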
- The provided code will print a classification rate for the chosen value of K (the number of
eigenfaces used).
- The code at the bottom of hw5_template.m will produce a number of batch-visualizations in which
each misclassified face is labeled as such. You may want to turn a copy of this script (with that
visualization removed) into a function whose input is K and whose output
is the classification rate as a percentage. You should plot classification success vs. K for at least
10 values of K (though not necessarily uniformly chosen -- see the write-up guidelines, below).
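One possible shape for that function, sketched in NumPy (the signature and variable names are my invention, not part of the provided template):

```python
import numpy as np

def classification_rate(K, centered_train, centered_test, Vt, train_label, test_label):
    """Percentage of test faces whose nearest training neighbor, measured in
    the K-dimensional eigenface coefficient space, has the same label."""
    top = Vt[:K]                               # top-K eigenfaces as rows
    tr = centered_train @ top.T                # (n_train, K) coefficients
    te = centered_test @ top.T                 # (n_test, K) coefficients
    d = ((te[:, None, :] - tr[None, :, :]) ** 2).sum(axis=2)
    estimated = train_label[d.argmin(axis=1)]  # label of the nearest neighbor
    return 100.0 * (estimated == test_label).mean()
```

Calling this in a loop over your chosen K values gives exactly the data needed for the plot.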
What to hand in
- The picture of the mean face and the top 5, 25, and 100 eigenfaces computed by PCA (principal component analysis).
- A plot of the nearest-neighbor classification rate as a function of K. You can choose any sampling of at least 10
values of K from 1 to 1024,
as long as it captures the trend of how classification performance changes as a function of K (i.e., we expect
performance to be poor for extremely low K, but then to rise very rapidly and level off at some point).
- Include at least two pictures of incorrectly classified faces and their nearest neighbors from the
training set: one at K = 10, one at K = 100.
The sample code can be easily adapted to do this.
- Include a short description of any problems you encountered along the way.
Extensions -- face detection
- Extract a few 32x32 patches from non-face grayscale images and
investigate their behavior when they are projected onto the ``face space.'' Project these patches
onto the face space for different values of K, and display the reconstructed versions of these patches,
along with the reconstruction errors. Compare the reconstruction errors to those for face images from
the database. You can also try it with 32x32 face patches not from the database. Try to decide what
threshold on the reconstruction error can distinguish between face and non-face images (for a K
of your choice).
- Either in Matlab or OpenCV, write a program that loads an image and then searches for
faces, showing where the top three most "face-like" regions are in the image. You
will want to scale any image you use so that it matches the 32x32 face size of the
database you built above. Alternatively, you could use Matlab's imresize
function in a loop and find the best three faces at any scale.
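One way to structure that search is a single-scale sliding window, sketched here in NumPy (the function name and step size are hypothetical, and the multi-scale loop is left out):

```python
import numpy as np

def face_likeness_map(image, mean_face, top, step=4):
    """Scan 32x32 windows across a grayscale image and score each one by its
    reconstruction error in face space (rows of `top` are K eigenfaces);
    small error suggests a face-like region. Returns the best three windows
    as (error, row, col) tuples."""
    H, W = image.shape
    scores = []
    for r in range(0, H - 32 + 1, step):
        for c in range(0, W - 32 + 1, step):
            patch = image[r:r + 32, c:c + 32].reshape(-1) - mean_face
            recon = (patch @ top.T) @ top
            scores.append((np.linalg.norm(patch - recon), r, c))
    scores.sort()       # smallest reconstruction error first
    return scores[:3]   # the three most "face-like" windows
```

Rescaling the image (imresize in Matlab) before each call and keeping the best three scores over all scales handles the multi-scale variant.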
- Other ideas are welcome, too -- perhaps adapt your face recognition to handle other objects
from a database you create or find available online... Or, you could build a face-database-creation
interface that lets you click on the nose of a face and stores a square region around
that click in a suitable format. If you'd like to use the images we took in class, they
are available from this link.
But only one feature is required to earn the credit
for writing an extension.
- Whichever path you choose, be sure to share your progress and results in your write-up.
Good luck with this hw #5!