CS153 Fall 2000 Assignment 4

Harvey Mudd College
Computer Science 153
Assignment 4
Due Monday, November 13, by midnight

Back to Assignment 4, top-level page

Section 1: Hough Transform

The Hough transform is a general-purpose method for finding contours with known parameterizations within an image. In this problem, you will be seeking out circles, rather than lines. (Although the extensions suggest more possibilities.)

Part 1: Warm-up with synthetic circles

In the directory

/cs/cs153/Images/a4/synthetic

are several synthetic images of circles with varying amounts of noise added. The name of each image indicates the contents of that image with the following conventions: the name circ3.40.tif is an image containing 3 circles, each of which has radius 40 pixels. (The presence or absence of noise is not indicated in the name.)

Using the Hough transform, design a system that identifies the circles in these synthetic images and overlays a ring of colored pixels around the edges of those circles. In creating your system, consider the following requirements:

You will need to choose an edge detector. The m-file edg.m offers several different edge detectors. Use help edg for information on each of those edge detectors and its parameters.

You will need to decide how to accumulate evidence in parameter space. This includes dividing that space into bins and determining which bins to increment (and by how much) for each edge pixel in the original image.

Once the accumulation is complete, you need to get the locations of the maxima. The matlab built-in functions max and findmax may be useful for this.

For suggestions of approaches on many of these points, see the section "Things to Consider," below.

Results:

Part 2: CD ID

In the directory

/cs/cs153/Images/a4/cds

are several images of CDs taken under various states of occlusion by other objects and each other. Since they are real images, noise is present, and edge detection will not be perfect.

Adapt your circle-finding and highlighting code from the previous section to identify CDs from images. Your code should estimate the number of CDs in an input image, as well as the locations of their centers. You should add some kind of graphical annotation (a dot in the center, a radius from center to edge, a ring encircling the CDs) to indicate where and how many objects were found in the image.

Keep in mind that there will likely be far more circles than CDs in any given image. Also, the centers of some of the CDs may be outside the image. Again, see the list near the end of the assignment for suggestions on handling the parameter space effectively.

Results:

Part 3: Accounting for Change

Imagine building a vision system to handle transactions at a newsstand or a lunch counter. Among the many problems facing your system is the ability to identify and count change. In particular, the change may be partially overlapping (to handle the fully overlapping case, you'd need an arm or an actuator of some sort) and it may be on a surface with considerable distractions in the background (a magazine or newspaper, for instance).

This problem asks you to extend your previous two Hough transforms to output the value of the change visible in an image. In the /cs/cs153/Images/a4/change directory there are several pictures including coins -- try your algorithm out on them. You can assume that the pictures will only have to handle pennies, nickels, dimes, and quarters. However, you can't be sure how far from the camera the coins will be, so that their sizes will not necessarily be known beforehand.

Certainly consider using more information than just the contours provided by the Hough transform for this problem, e.g., color would be an important cue for distinguishing nickels from pennies. Using the relative sizes of the coins (which is known and fixed) is also fair game.

Results:

Things to Consider

As discussed in class, there are a number of factors that contribute to the succes of a Hough transform.

To gain additional accuracy in your localization of circles (or any contour, in general), consider backprojecting the hypothesized curve and then "tweaking" it a bit by fitting it to the edge pixels. For a line, that would mean finding the least-squares (or least-inertia) line that fits the edge pixels near the hypothesized line. For a circle, it would mean simply averaging the coordinates of the pixels near the hypothesized circle to find a better estimate of its center.

Because of error introduced by discretizing parameter space, it often helps to blur the values that are being accumulated there. This, in effect, lets an edge feature contribute evidence not only for curves (circles) going directly though that point, but also for curves passing nearby. This can be done either as evidence is being accrued (by adding to neighboring bins) or afterward, by smoothing accumulator space. The latter might be done by convolving with a Gaussian or another smoothing mask. Matlab's function conv2 does arbitrary 2d convolution. Matlab's fspecial function can create Gaussian masks of any size and any standard deviation.

The distribution of "votes" in parameter space can be complex enough that multiple responses can arise from a single image curve. There are a couple of methods to handle this that you may want to try:
- Backmapping refers to performing the Hough transform twice. First, a parameter space is filled with evidence, as usual, with each edge pixel (or feature) contributing votes along some curve in parameter space. The second transform uses a new parameter space. This time each edge pixel (or feature) only contributes votes to one bin, rather than to all of the bins along some curve in parameter space. What happens is that the bins along that curve are searched in the first parameter space and in order to find the bin with the maximum evidence. In the new parameter space, that bin receives all of the votes of the edge pixel under consideration.
- Alternatively, the Hough transform can be performed once. When the global maximum is found in parameter space, that curve is backprojected into the original image. Then, all of the edge responses on or near (within some threshold) of that curve remove their votes from parameter space. This can be done by recomputing where they would vote and subtracting, rather than adding. This is faster (at the expense of more memory) if each pixel keeps a record of how it distributed its votes; then, the subtraction is immediate.
Both of these techniques lead to sharper peaks in the accumulator, and thus easier and more robust contour-finding.

The discussion so far has been of binary feature points casting binary votes. Edge images usually have binary feature points because of the thresholding done during edge detection. In many cases it is more effective to use the strength of the gradient magnitude to weight the votes cast in parameter space. You can find the gradient magnitude by estimating the partial derivative of the intensity image in the x- and y-directions and finding the length of the resulting 2x1 vector.

Possible Extensions

RoboRef
Try to specialize your change-counting system to determine whether the coins are heads or tails. This might be useful for starting football games, for instance.

Line finding
Using the alternative parameterization ("normal" form, r = x*cos(theta) + y*sin(theta) that we discussed in class, find and display the dominant straight-line edges of the images in /cs/cs153/Images/a4/lines. Then, use the edge image (and/or the intensity image) to crop those lines so that they are only drawn where they have visual support (rather than across the entire intensity image). This will provide a mathematically concise description of the (straight-line) edges of the objects in the scenes; those edges, in turn, can be used for reconstructing the 3d geometry of the scene (more on that later, however).

Generalized Hough transform
The Hough transform requires that the curves being sought have a known parameterization. However, it is possible to use it for consistent 2d shapes even if their parameterization is unknown, by constructing an implicit representation through its boundary points.

This is known as the generalizes Hough transform. Basically, given a 2d shape that you are seeking in images, choose any arbitrary reference point on that shape. Find the vector to that reference point from each point on the boundary of the shape and store all of those vectors in a table, called an R table.

Then, assuming translation but no scaling and no rotation, an edge pixel contributes a vote to each parameter space bin that is a displacement of that pixel by some vector in the R table. If scaling and rotation are allowed, the R-table vectors contribute votes for each orientation and scale considered.
With a shape of your own choosing, implement the generalized Hough transform (at least for translation-only) -- for example, this would be an alternative method for finding the "squiggle" shapes in the game of Set.

The Hough transform -- in action
As always, I'm interested in display tools for showing the process behind visual algorithms, for example in a tool that shows the sequential changes to parameter space -- say, as the pixels in the edge image changed color to indicate that they have been used.

Keep in mind, too, that an extension need not be any of these things. In fact, it need not have to relate to the Hough transform. If you'd like, you can hand in as an extension some preliminary coding/results on your final project. (See the next section for some project possibilities.)

Harvey Mudd College Computer Science 153 Assignment 4 Due Monday, November 13, by midnight

Back to Assignment 4, top-level page

Section 1: Hough Transform

Part 1: Warm-up with synthetic circles

Part 2: CD ID

Part 3: Accounting for Change

Things to Consider

Possible Extensions

On to Section 2: Final project proposal

Harvey Mudd College
Computer Science 153
Assignment 4
Due Monday, November 13, by midnight