Harvey Mudd College
Computer Science 153
Assignment 4
Demo due Friday, October 24, by 4:30pm
Write-up due Sunday, October 26th, by 11:59pm
Automatic image mosaicking
Thanks to A. Efros for the inspiration (and details!) of this assignment
Goals
For this assignment, you will create a system that will automatically
extract and match features between two images. Then, you'll draw on your work
from assignment 3 to create a mosaic from those images. As an additional feature,
you might extend your system to handle more than two images automatically. Other
bells and whistles are suggested below.
The end goal of this assignment is to create an interface that allows the
user to specify two images. The system should then
- extract Harris corners (see the Matlab code provided below)
- determine a subset of the Harris corners to use
- compute a feature descriptor for each of those corners
- match those features between the two images
- use a method robust to outliers (RANSAC) to compute the best resulting homography
- create the resulting mosaic
The paper
This project is an implementation of part of the paper Multi-Image Matching using Multi-Scale Oriented Patches by Brown et al. (2005). However, we will make several simplifications. Read the description below and then look over the paper, making sure you understand the sections this project asks you to implement! We will also discuss some of these techniques in more detail in class.
Tasks
- extracting corners: Start with the Harris Interest Point (corner) Detector
(Section 2). We won't worry about multi-scale processing; rather, we will use only the highest-resolution scale
of the initial image. Also, don't worry about sub-pixel accuracy. Re-implementing Harris is a thankless task, so you can use Alyosha Efros's sample code (for those using Matlab):
harris.m. OpenCV has an API call cvCornerHarris that populates an output image with each pixel's Harris corner strength. Use a neighborhood size of 7 and the default block size of 3. You will need to keep only local maxima: pixels where the corner strength is greater than at any of the eight neighbors. Also, omit corners found in the outermost 22 columns and rows of the image.
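If it helps to see the local-maximum step in one place, here is a minimal sketch in Python/NumPy (the course code is Matlab/OpenCV; the function name here is made up, and only the 22-pixel border and eight-neighbor test come from the assignment text):

```python
import numpy as np

def local_maxima(strength, border=22):
    """Keep only pixels whose Harris corner strength is strictly greater
    than all eight neighbors, ignoring a 'border'-pixel margin."""
    h, w = strength.shape
    corners = []
    for y in range(max(1, border), min(h - 1, h - border)):
        for x in range(max(1, border), min(w - 1, w - border)):
            v = strength[y, x]
            patch = strength[y - 1:y + 2, x - 1:x + 2]
            # only the center pixel may equal or exceed v
            if v > 0 and np.sum(patch >= v) == 1:
                corners.append((x, y, v))
    return corners
```

A brute-force double loop like this is fine at this image scale; in Matlab the same test can be vectorized with ordfilt2 or imdilate.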
- culling the features: Implement Adaptive Non-Maximal Suppression (Section 3). Keep
the 500 features with the greatest radii of support.
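The suppression-radius computation can be sketched as follows (Python/NumPy for illustration; the robustness constant c_robust = 0.9 follows the paper, and the function name is just for this sketch):

```python
import numpy as np

def anms(corners, keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression: for each corner, find the distance
    to the nearest corner that is sufficiently stronger, then keep the
    'keep' corners with the largest such suppression radii.
    'corners' is a list of (x, y, strength) tuples."""
    pts = np.array([(x, y) for x, y, s in corners], dtype=float)
    strengths = np.array([s for x, y, s in corners])
    radii = np.full(len(corners), np.inf)
    for i in range(len(corners)):
        # corners that dominate corner i: s_i < c_robust * s_j
        stronger = strengths > strengths[i] / c_robust
        if np.any(stronger):
            d2 = np.sum((pts[stronger] - pts[i]) ** 2, axis=1)
            radii[i] = np.sqrt(d2.min())
    order = np.argsort(-radii)[:keep]
    return [corners[i] for i in order]
```

The globally strongest corner gets an infinite radius, so it always survives; weak corners crowded next to stronger ones get small radii and are culled first.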
- describing each feature: Compute a 64-value descriptor for each feature that remains
after the adaptive non-maximal suppression. Don't worry about rotation invariance; just extract axis-aligned
8x8 patches. Note that it's extremely important to sample these patches from a surrounding 40x40 window to get a nice
big blurred descriptor. I used the pixel
values with horizontal coordinates [ x-21, x-15, x-9, x-3, x+3, x+9, x+15, and x+21 ], along with the same spacing vertically, to
gather the 64 values. Your system should handle changes in intensity (sometimes called "bias/gain normalization") by
subtracting the mean and dividing by the standard deviation of the 64 values. This will result in 64-vectors whose mean
is 0.0 and whose variance is 1.0. (The Matlab function for finding the standard deviation of a
vector v is std(v,1).) Also, just use pixel values; we won't implement the wavelet-indexing approach.
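The sampling and normalization above can be condensed into a few lines; here is an illustrative Python/NumPy sketch (the offsets and normalization come from the assignment text; the function name is made up, and this skips any pre-blur of the 40x40 window):

```python
import numpy as np

def describe(img, x, y, spacing=6):
    """Sample an axis-aligned 8x8 patch at offsets -21, -15, ..., +21
    around (x, y), then bias/gain-normalize the 64 values to mean 0,
    variance 1.  Assumes (x, y) is >= 22 pixels from every border."""
    offsets = np.arange(-3.5 * spacing, 3.5 * spacing + 1, spacing).astype(int)
    patch = img[np.ix_(y + offsets, x + offsets)].astype(float)
    desc = patch.flatten()
    # population standard deviation, i.e. std(v,1) in Matlab
    return (desc - desc.mean()) / desc.std()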
- matching features: Implement Feature Matching (Section 5). That is, you will need to find pairs of features that look
similar and are thus likely to be good matches. If you're using Matlab, you may find dist2.m useful for fast distance
computations. For thresholding, use the simpler approach due to Lowe of thresholding on the ratio between the first
and the second nearest neighbors; consult Figure 6b in the paper when picking the threshold. Section 6 does not
need to be part of your implementation. Finally, you should implement the 4-point RANSAC as described in class to compute a robust homography estimate between the two input images.
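Lowe's ratio test amounts to comparing the best and second-best distances for each feature; a minimal Python/NumPy sketch (the ratio value 0.6 here is just a placeholder — pick yours from Figure 6b):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.6):
    """Match descriptors by squared distance, keeping a match only when
    the best distance is below 'ratio' times the second-best distance.
    desc1, desc2 are (N1, d) and (N2, d) arrays, N2 >= 2; returns (i, j)
    index pairs into desc1 and desc2."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.sum((desc2 - d) ** 2, axis=1)
        j, j2 = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[j2]:
            matches.append((i, j))
    return matches
```

The ratio test rejects ambiguous features: when the two closest candidates look almost equally similar, neither can be trusted, even if the best distance is small.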
Note on homographies: If RANSAC chooses four points in which three (or all four) are very close to being collinear, the
homography created from their correspondences will be (almost) singular - and the results will become numerically too
large/infinite. There are many ways to handle this, but one reasonable way is to check the area of the four triangles that can
be formed from the four points that RANSAC chooses. If the smallest of those four areas is less than a threshold, simply throw
out that set of four points. To be as robust as possible, your code should check these areas in both of the two images being
matched, though it almost always suffices to check only one.
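One way to write the degeneracy check described above, as a Python/NumPy sketch (the area threshold of 10 square pixels is an arbitrary illustrative value, not one specified by the assignment):

```python
import numpy as np
from itertools import combinations

def degenerate(pts, min_area=10.0):
    """Return True if any of the four triangles formed from the four
    RANSAC sample points has area below min_area, i.e. at least three
    of the points are (nearly) collinear.  pts is a (4, 2) array."""
    for a, b, c in combinations(range(4), 3):
        v1 = pts[b] - pts[a]
        v2 = pts[c] - pts[a]
        # triangle area via the 2D cross product
        area = 0.5 * abs(v1[0] * v2[1] - v1[1] * v2[0])
        if area < min_area:
            return True
    return False
```

Inside the RANSAC loop, a sample for which this returns True (in either image, if you check both) is simply discarded and redrawn.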
Note to OpenCV users: OpenCV has some idiosyncrasies in how it computes homographies and uses homographies to transform
individual points, as you will want to do in this case. (It's less idiosyncratic in how it uses them to warp images.) For
computing homographies from four or more pairs of corresponding points, use cvFindHomography. For applying the
resulting homography to individual points, you can use cvPerspectiveTransform. The code at this link demonstrates how to create, assign, and use the data structures needed for
these API calls.
Note to Matlab users: Matlab's maketform function only allows 4 corresponding points in creating a homography. You may use
the script at this link in order to
create the best homography from more than four points (best in the
least-squared-error sense). You will also need
this helper function in the same directory . An example of how to use
these two matlab functions appears at
this link.
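For reference, the least-squared-error homography from four or more correspondences is usually computed with the direct linear transform: stack two linear equations per point pair into a matrix A and take the singular vector of the smallest singular value as the nine entries of H. A Python/NumPy sketch (function names are made up for illustration; this is the plain DLT without the coordinate normalization that production code would add):

```python
import numpy as np

def fit_homography(src, dst):
    """Estimate the 3x3 homography H mapping src -> dst from N >= 4
    point pairs by solving A h = 0 in the least-squares sense via SVD.
    src and dst are (N, 2) arrays."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows))
    H = vt[-1].reshape(3, 3)   # right singular vector of smallest sigma
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map (N, 2) points through H, including the perspective divide."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]
```

This plays the same role as the linked Matlab script or OpenCV's cvFindHomography/cvPerspectiveTransform pair.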
- creating the mosaic: Finally, use your work from the previous assignment to output a mosaicked image that includes
the pixels of both input images.
Write-up
In your write-up, be sure to include pictures of the intermediate results for
two overlapping images of your choice (perhaps two from hw #3), as well as the
final mosaic created. The intermediate results include
- the Harris corners
- the 500 corners preserved after adaptive non-maximal suppression
- the top 50 matches in each of the two images (just show the corners; it's OK
to leave which corners matched which to the imagination)
- the top 5 matches in each of the two images (please use different colors for
corresponding points so that the matches are clear!)
- four of the inlier matches after RANSAC has run
- the resulting mosaic
In addition, you should include a brief description of how you approached the pieces
of the algorithm and where any difficulties arose.
You should include an archive file of your code, as well as
the results from at least two successful runs and one unsuccessful run,
along with an explanation of why the unsuccessful run did not work.
Extra features
For full credit, you should include at least one extra feature for this week's assignment - it can be of
your own choosing or a variant on one of the ideas below. If you do, be sure to include an
example in your write-up and an explanation of the results!
- Enable your system to handle more than two images at once. Here, it will have to
decide which images match with which and will need to choose one coordinate system
to warp the other images into.
- Create a system that determines the "odd image out." That is, when provided with four
images -- three of the same scene and one of a different scene, your system should
be able to cluster them into the appropriate groups of 3 and 1. Credit for this
idea to Sesame Street's "Which one of these is not like the others?"
- Add multiscale processing for corner detection and feature description.
- Add rotation invariance to the descriptors.
- Implement panorama-stitching or panorama-recognition, in which the images wrap around to form a loop.
Demo and write-up
You should demo your system by 4:30pm on Friday, October 24.
Be sure to complete your write-up and link it into
the CS 153 wiki by 11:59pm on Sunday, October 26.