CS182a Spring 2011 Assignment 3

Harvey Mudd College
Computer Science 182a
Assignment 3
Due Friday, March 11, by 11:59pm

Image stitching and auto-mosaicking: 2d visual geometry

Thanks to A. Efros for the inspiration for this assignment!

Goals

For this assignment, you will create image mosaics and composites, first from hand-selected features and then from automatically-selected features. There are many opportunities for extensions -- either using robots and/or the Kinect or in exploring additional capabilities with the 2d image geometry that is at the heart of this project.

Part 1: hand-crafted homographies

This part of the assignment is, in part, data-collection and groundwork for the automated image-stitching application in part 2. You may use OpenCV, Matlab, or PIL - or another system, if you'd like.

Image-gathering: For this hw, you should take (or find online) at least five images and post them on your write-up webpage. In order to best illustrate the 2d geometry of images, there are a few constraints on the images you take:
- a predominantly planar image, at an angle: One of the images should be of a predominantly planar scene, e.g., a building facade, ground plane, interior wall, floor, or ceiling, or some other mostly planar subject. Many times we naturally take such pictures so that the plane being imaged is parallel to the camera's image plane, i.e., we "look directly at it." However, be sure you avoid this here -- the imaged plane should be facing in a different direction than the camera. Your goal will then be to rectify the image so that it looks as if you'd taken it head-on.
- four overlapping images from a single point: The other four images may be of any scene, but should have substantial (~30-50%) pairwise overlap. In addition, they should be taken from the same place -- the camera will rotate from shot to shot, but you should try to keep the camera's translational motion near zero. Because the ultimate goal will be to auto-stitch these images together, it may help to turn off autofocus and autoexposure, if that is possible on your camera. If not, you might get very interesting image mosaics -- and this is OK, too! Your four images should be taken from four points of view that form, roughly, the vertices of a rectangle. That is, two of them should be "above" the other two. Also, make sure that the four images overlap both horizontally and vertically: this will ensure there are enough features to match from image to image.
- You may certainly take more images than only four! In particular, an extension of your auto-stitching system could include building a full panorama, which would require many more images. Also, you may want to take lots of images and choose the "best" four for this and the next assignment.

Tasks for Part 1

Planar warping of a known quadrilateral: The first image-warping piece of this assignment will use your predominantly planar image. Choose the four corners of a square or rectangle in the scene; because the scene was viewed "at an angle" as mentioned above, the image coordinates of those corners will not form a rectangle. Then, create a 3x3 homography H that maps the pixel coordinates of those corners into a rectangle of appropriate aspect ratio. Finally, warp the entire image according to that homography H. In your write-up you should include the raw image, the 3x3 H, and the resulting "unwarped" image.
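To make the idea of "a 3x3 homography from four corner correspondences" concrete, here is a minimal pure-Python sketch. It is not the OpenCV or Matlab routine, only an illustration of the computation such a routine performs. It assumes the bottom-right entry of H can be fixed to 1 (true unless the source origin maps to infinity), and the names solve_linear, homography_from_4, and apply_h are hypothetical helpers invented for this example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def homography_from_4(src, dst):
    """Return a 3x3 H (with H[2][2] = 1) mapping each src (x, y) to dst (u, v)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), and likewise for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_h(H, x, y):
    """Map one point through H, dividing by the homogeneous coordinate."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Made-up corner coordinates of a skewed quadrilateral and a 300x200 target
corners = [(100, 50), (400, 80), (420, 300), (90, 350)]
rectangle = [(0, 0), (300, 0), (300, 200), (0, 200)]
H = homography_from_4(corners, rectangle)
```

The corner coordinates above are invented; in your own work they would be the hand-selected pixel coordinates, and the rectangle's aspect ratio should match the real object.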

Image compositing: In addition, you should create a composite image that warps one image into a quadrilateral within a second image. This warping should use an appropriate homography so that the composite "looks right" geometrically in the target image. You might use your source image from above or other images entirely. The choice of the subject and the target is entirely up to you: there are a number of interesting composites you might create. For example, you could put fake graffiti on buildings or chalk drawings on the ground (taken from other images) -- or you could replace a road sign with a personal portrait - or spam!

You should use the capabilities built in to OpenCV or Matlab to do this. In OpenCV,

cvGetPerspectiveTransform will take in correspondences and produce the 3x3 homography between the two sets of points. A perspective transformation is another name for a homography.
cvWarpPerspective will apply a homography (such as obtained from the above function) that warps a source image to a destination version.
Details on both are available at this reference page and some source code to use as a starting point is available here.

and in Matlab, you should use the built-in help routine to investigate the functions

cp2tform, imtransform, tformarray, tformfwd, tforminv, and cpselect. In particular, cpselect and imtransform will do most of the work here.

A two-image, hand-stitched mosaic: From your set of four overlapping images taken from the same point, choose two of the images and hand-select the pixel coordinates of four corresponding points between the two images. Then, create at least two mosaics from the two images:

One that remaps the second image into the coordinate system of the first -- and on top of the first.
One that remaps the first image into the coordinate system of the second -- and on top of the second.

The key here is creating a function or sequence of functions that places the images into the same pixel plane, i.e., coordinate system. The three images at the top of this page (on the left) are an example of this hand-built mosaicking run on two images of Sprague.
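One way to think about "placing the images into the same pixel plane" is to compute a canvas large enough for both images and a shift that keeps all coordinates non-negative. The sketch below assumes H maps pixels of the second image into the first image's coordinate system and that pixel coordinates run from 0 to width-1 and 0 to height-1. The function names are hypothetical, not part of any library.

```python
import math


def apply_h(H, x, y):
    """Map one point through a 3x3 homography given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def image_corners(size):
    """Corner pixel coordinates of an image with size = (width, height)."""
    w, h = size
    return [(0, 0), (w - 1, 0), (w - 1, h - 1), (0, h - 1)]


def mosaic_canvas(H, size1, size2):
    """Return (canvas_width, canvas_height, (tx, ty)).

    (tx, ty) is the shift that moves image-1 coordinates onto the canvas.
    """
    pts = image_corners(size1)
    pts += [apply_h(H, x, y) for x, y in image_corners(size2)]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    min_x, min_y = math.floor(min(xs)), math.floor(min(ys))
    max_x, max_y = math.ceil(max(xs)), math.ceil(max(ys))
    return max_x - min_x + 1, max_y - min_y + 1, (-min_x, -min_y)


def shift_homography(H, tx, ty):
    """Return T * H, where T translates by (tx, ty)."""
    return [[H[0][k] + tx * H[2][k] for k in range(3)],
            [H[1][k] + ty * H[2][k] for k in range(3)],
            list(H[2])]
```

With this layout, the first image would be pasted at offset (tx, ty) and the second image warped with shift_homography(H, tx, ty) into a canvas of the returned size. Swapping the roles of the two images (and using the inverse homography) gives the other required mosaic.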

There are a number of interesting features you might include in your mosaic:

You might create a mosaic by spatially blending images taken at different times (day vs. night) or during different seasons -- presumably ones you already have or find elsewhere!
You could create a mosaic by spatially blending a historic photograph with a modern picture of the same place.
Or, try building another interesting/bizarre mosaic, e.g., one with multiple copies of the same person at different locations.

Write-up

In your write-up for this part, be sure to include

The original images -- please scale them down to a reasonable display size, but make sure that the entire images are available for download.
The planar homography example, including a visualization of the points chosen, the estimated 3x3 homography, and the resulting fronto-parallel image.
The hand-stitched mosaics from the pair of images you chose, along with a visualization of the corresponding points chosen in each of the two images. The mosaics are (1) the second image remapped to the first's coordinate system and (2) the first image remapped to the second's coordinate system.

Part 2: automatically-matched mosaics

This part of the assignment automates the procedure that was human-driven above. In particular, you'll build a system that takes in two images. With those images the system then should

extract Harris corners (see the code provided below)
determine a subset of the Harris corners to use
compute a feature descriptor for each of those corners
match those feature descriptors between the two images
use a method robust to outliers (RANSAC) to compute the best resulting homography
create the resulting mosaic

Our approach will follow the "MOPs" paper, i.e., Multi-Image Matching using Multi-Scale Oriented Patches by Brown et al. (2005). Our implementation will make a few simplifications. Read the description below and then look over the paper, making sure you understand the sections this project asks you to implement! We will also discuss some of these techniques in more detail in class.

Tasks for Part 2

extracting corners: Start with the Harris Interest Point (corner) Detector (Section 2). We won't worry about multi-scale - rather, we will use only the highest-resolution scale on the initial image. Also, don't worry about sub-pixel accuracy. Re-implementing Harris is a thankless task - so you can use Alyosha Efros's sample code (for those using matlab): harris.m. OpenCV has an API call cvCornerHarris that populates an output image with each pixel's Harris corner strength. I used a "block" size of 7 (the size of the patch with which the Harris matrix is computed) and an aperture size of 3 (the size of the derivative operator's patch). Feel free to adjust as necessary. This note provides some example code that might be helpful in setting up your cvCornerHarris call. You will need to keep only local maxima, i.e., pixels where the corner strength is greater than at any of the eight neighbors. Also, omit corners found in the outermost 22 columns and rows of the image.

culling the features: Implement Adaptive Non-Maximal Suppression (Section 3 in the paper). Keep the 500 features with the greatest radii of support.
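A straightforward (quadratic-time) sketch of the suppression-radius idea: each corner's radius is its distance to the nearest corner that is sufficiently stronger, and the corners with the largest radii are kept. The robustness factor c_robust = 0.9 is an assumption here; check Section 3 of the paper for the value it recommends. The function name is hypothetical.

```python
def anms(corners, keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    corners: list of (x, y, strength) tuples.
    Returns up to `keep` corners, largest suppression radius first.
    """
    scored = []
    for i, (xi, yi, si) in enumerate(corners):
        radius_sq = float("inf")  # the strongest corners are never suppressed
        for j, (xj, yj, sj) in enumerate(corners):
            if j != i and si < c_robust * sj:
                d_sq = (xi - xj) ** 2 + (yi - yj) ** 2
                radius_sq = min(radius_sq, d_sq)
        scored.append((radius_sq, (xi, yi, si)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [corner for _, corner in scored[:keep]]
```

Note the effect: a weak but isolated corner can outrank a stronger corner that sits right next to an even stronger one, which spreads the kept features more evenly across the image.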

describing each feature: Compute a 64-value descriptor for each feature that remains after the adaptive non-maximal suppression. Don't worry about rotation-invariance - just extract axis-aligned 8x8 patches. Note that it's extremely important to sample these patches from a surrounding 40x40 window to have a nice big, blurred descriptor. I used the pixel values with horizontal coordinates [ x-21, x-15, x-9, x-3, x+3, x+9, x+15, and x+21 ] along with the same spacing vertically to gather the 64 values. Your system should be made insensitive to changes in intensity (sometimes called "bias/gain-normalization") by subtracting the mean and dividing by the standard deviation of the 64 values. This will result in 64-component vectors whose mean is 0.0 and whose variance is 1.0. (The matlab function for finding the standard deviation of a vector v is std(v,1).) Also, just use pixel values; we won't implement the paper's wavelet-indexing approach.
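The sampling pattern and the bias/gain normalization can be sketched like this, assuming a grayscale image stored as a list of rows and a corner at least 21 pixels from every edge. It uses the population standard deviation (dividing by 64), matching std(v,1). Any blurring is assumed to have been applied to the image beforehand; the function name and the None return for flat patches are choices made for this example.

```python
OFFSETS = [-21, -15, -9, -3, 3, 9, 15, 21]


def describe(image, x, y):
    """Return a 64-value descriptor with mean 0 and variance 1, or None.

    image: list of rows of grayscale values; (x, y) is the corner location.
    """
    values = [float(image[y + dy][x + dx])
              for dy in OFFSETS
              for dx in OFFSETS]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    if std == 0.0:
        return None  # perfectly flat patch: nothing to normalize
    return [(v - mean) / std for v in values]
```

Because of the normalization, multiplying every pixel by a constant gain or adding a constant bias leaves the descriptor unchanged, which is what makes matching tolerant of exposure differences between shots.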

matching features: Implement Feature Matching (Section 5). That is, you will need to find pairs of features between the two images that look similar and are thus likely to be good matches. If you're using matlab, you may find dist2.m useful for fast distance computations. For thresholding, use the simpler approach due to Lowe of thresholding on the ratio between the first and the second nearest neighbors -- consult Figure 6b in the paper for picking the threshold. (Section 6 does not need to be part of your implementation.) Finally, you should implement the 4-point RANSAC as described in class to compute a robust homography estimate between the two input images.
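A brute-force sketch of the nearest-neighbor ratio test follows. It compares squared Euclidean distances, so the ratio is a ratio of squared distances; take square roots first if you want to threshold plain distances. The default max_ratio is only a placeholder, not a recommended value (pick yours from Figure 6b), and the function name is hypothetical.

```python
def ratio_matches(desc_a, desc_b, max_ratio=0.5):
    """Match descriptors from image A to image B with the ratio test.

    Returns a list of (ratio, index_in_a, index_in_b), best ratio first.
    """
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted(
            (sum((p - q) ** 2 for p, q in zip(da, db)), j)
            for j, db in enumerate(desc_b)
        )
        if len(dists) < 2:
            continue  # the ratio needs a second-nearest neighbor
        (best, j), (second, _) = dists[0], dists[1]
        # accept only when the best match is clearly better than the runner-up
        if second > 0 and best / second < max_ratio:
            matches.append((best / second, i, j))
    matches.sort()
    return matches
```

Sorting by ratio makes it easy to display the top 50 or so matches for the write-up, since the smallest ratios are the most distinctive matches.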

Note on homographies: If RANSAC chooses four points in which three (or all four) are very close to being collinear, the homography created from their correspondences will be (almost) singular - and the results will become numerically too large/infinite. There are many ways to handle this, but one reasonable way is to check the area of the four triangles that can be formed from the four points that RANSAC chooses. If the smallest of those four areas is less than a threshold, simply throw out that set of four points. To be as robust as possible, your code should check these areas in both of the two images being matched, though it almost always suffices to check only one.
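The triangle-area check can be sketched as below. The area threshold is left as a parameter because a sensible value depends on image size; the function names are hypothetical.

```python
from itertools import combinations


def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r (each an (x, y) pair)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) / 2.0


def is_well_conditioned(points, min_area):
    """True if every triangle formed from the four points has enough area."""
    return min(triangle_area(*tri)
               for tri in combinations(points, 3)) >= min_area
```

To follow the more robust variant described above, call is_well_conditioned on the four sampled points in each of the two images and discard the sample if either call returns False.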

Note to OpenCV users: OpenCV has some idiosyncrasies in how it computes homographies and uses homographies to transform individual points, as you will want to do in this case. (It's less odd in how it uses them to warp images.) For computing homographies from four or more pairs of corresponding points, use cvFindHomography. For applying the resulting homography to individual points, you can use cvPerspectiveTransform. The code at this link demonstrates how to create, assign, and use the data structures needed for these API calls.

Note to Matlab users: Matlab's maketform function only allows 4 corresponding points in creating a homography. You may use the script at this link in order to create the best homography from more than four points (best in the least-squared-error sense). You will also need this helper function in the same directory. An example of how to use these two matlab functions appears at this link.
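Putting the pieces together, the 4-point RANSAC loop can be sketched as follows. The homography fit is passed in as a function so that you can plug in whichever routine you are using (your own solver, or a wrapper around the library call); it should return None for a rejected sample such as a nearly collinear one. The iteration count, pixel tolerance, and all names here are assumptions for illustration, not prescribed values.

```python
import random


def project(H, x, y):
    """Map (x, y) through H; return None if the point maps to infinity."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        return None
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def ransac_homography(pairs, fit, iterations=1000, tol=3.0, seed=0):
    """Return (best_H, inlier_pairs).

    pairs: list of ((x, y), (u, v)) candidate matches.
    fit:   function taking 4 pairs and returning a 3x3 H, or None.
    tol:   maximum reprojection error, in pixels, for an inlier.
    """
    rng = random.Random(seed)
    best_H, best_inliers = None, []
    if len(pairs) < 4:
        return best_H, best_inliers
    for _ in range(iterations):
        H = fit(rng.sample(pairs, 4))
        if H is None:
            continue  # degenerate sample, try again
        inliers = []
        for (x, y), (u, v) in pairs:
            p = project(H, x, y)
            if p is not None and (p[0] - u) ** 2 + (p[1] - v) ** 2 <= tol * tol:
                inliers.append(((x, y), (u, v)))
        if len(inliers) > len(best_inliers):
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```

A common final step is to refit the homography to all of the returned inliers in the least-squares sense, rather than keeping the H from the winning 4-point sample.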

Creating the mosaic: Finally, combine these steps with your work from Part 1 in order to output a mosaicked image that includes the pixels of both of the input images: auto-mosaicking!

Write-up

In your write-up for this part, be sure to include pictures of the intermediate results for two overlapping images of your choice, as well as the final mosaic created. The intermediate results include

the Harris corners
the 500 corners preserved after adaptive non-maximal suppression
the top 50 (or so) ratio-distance matches in each of the two images (just show the corners, it's OK to leave it to the imagination which corners matched which)
at least four of the inlier matches after RANSAC has run
your resulting mosaic!

In addition, you should include a brief description of any detours or personalized design decisions you made -- and where any difficulties arose, if any.

Please include an archive file of all of your code, as well as the results from at least one successful run and one unsuccessful run, along with an explanation of why the unsuccessful run did not work!

Extensions

For full credit, you should include at least one extension for this project's assignment - it can be of your own choosing or a variant on one of the ideas below. If you do, be sure to include an example in your write-up and an explanation of the results!

Make further progress with a longer-term project you are considering for this course, e.g.,
- A robot-based system you're working on (or starting...)
- A kinect-based system you're working on (again, you could start one as well)

Alternatively, if you're interested in extending the auto-mosaicking itself, that is certainly a possibility, e.g.,
- Enable your system to handle more than two images at once. Here, it will have to decide which images match with which and will need to choose one coordinate system to warp the other images into.
- Create a system that determines the "odd image out." That is, when provided with four images -- three of the same scene and one of a different scene, your system should be able to cluster them into the appropriate groups of 3 and 1. Credit for this idea to Sesame Street's "Which one of these is not like the others?"
- Add multiscale processing for corner detection and feature description
- Add rotation invariance to the descriptors
- Implement panorama-stitching or panorama-recognition, in which the images wrap around to form a loop.
- Other possibilities - or combinations with previous projects - are possible: mosaic-carving, perhaps?
