Harvey Mudd College
Computer Science 153
Assignment 4
Due Monday, November 13, by midnight
Section 1: Hough Transform
The Hough transform is a general-purpose method for finding
contours with known parameterizations within an image.
In this problem, you will be seeking out circles, rather than
lines. (Although the extensions suggest more possibilities.)
Part 1: Warm-up with synthetic circles
In the directory
/cs/cs153/Images/a4/synthetic
are several synthetic images of circles with varying amounts of
noise added. The name of each image indicates the contents of
that image with the following conventions: the name circ3.40.tif
is an image containing 3 circles, each of which has radius 40
pixels. (The presence or absence of noise is not indicated in the name.)
Using the Hough transform, design a system that identifies
the circles in these synthetic images and overlays a ring of colored pixels
around the edges of those circles. In creating your system,
consider the following requirements:
- You will need to choose an edge detector. The m-file
edg.m offers several different edge detectors. Use help edg
for information on each of those edge detectors and its parameters.
- You will need to decide how to accumulate evidence in parameter
space. This includes dividing that space into bins and determining which
bins to increment (and by how much) for each edge pixel in the original image.
- Once the accumulation is complete, you need to get the
locations of the maxima. The matlab built-in functions max
and findmax
may be useful for this.
For suggestions of approaches on many of these points, see the
section "Things to Consider," below.
Results:
Part 2: CD ID
In the directory
/cs/cs153/Images/a4/cds
are several images of CDs taken under various states of
occlusion by other objects and each other. Since they are real
images, noise is present, and edge detection will not be perfect.
Adapt your circle-finding and highlighting code from the previous
section to identify CDs from images. Your code should estimate the number
of CDs in an input image, as well as the locations of their centers.
You should add some kind of graphical annotation (a dot in the center,
a radius from center to edge, a ring encircling the CDs) to indicate
where and how many objects were found in the image.
Keep in mind that there will likely be far more circles than CDs
in any given image. Also, the centers of some of the CDs may be outside
the image. Again, see the list near the end of the assignment
for suggestions on handling the parameter space effectively.
Results:
Part 3: Accounting for Change
Imagine building a vision system to handle transactions at
a newsstand or a lunch counter. Among the many problems facing
your system is the ability to identify and count change. In particular,
the change may be partially overlapping (to handle the fully overlapping
case, you'd need an arm or an actuator of some sort) and it may be on
a surface with considerable distractions in the background (a magazine
or newspaper, for instance).
This problem asks you to extend your previous two Hough transforms to
output the value of the change visible in an image. In the /cs/cs153/Images/a4/change
directory there are several pictures including coins -- try your
algorithm out on them. You can assume that the pictures will only have
to handle pennies, nickels, dimes, and quarters. However, you can't be sure
how far from the camera the coins will be, so that their sizes will not
necessarily be known beforehand.
Certainly consider using more information than just the contours provided
by the Hough transform for this problem, e.g., color would be an important
cue for distinguishing nickels from pennies. Using the relative sizes of
the coins (which is known and fixed) is also fair game.
Results:
Things to Consider
As discussed in class, there are a number of factors that contribute
to the succes of a Hough transform.
-
To gain additional accuracy in your localization of circles (or any
contour, in general), consider backprojecting the hypothesized curve
and then "tweaking" it a bit by fitting it to the edge pixels.
For a line, that would mean finding the least-squares (or least-inertia)
line that fits the edge pixels near the hypothesized line. For a circle,
it would mean simply averaging the coordinates of the pixels near the
hypothesized circle
to find a better estimate of its center.
- Because of error introduced by discretizing parameter space,
it often helps to blur the values that are being accumulated there.
This, in effect, lets an edge feature contribute evidence not
only for curves (circles) going directly though that point, but
also for curves passing nearby. This can be done either as evidence
is being accrued (by adding to neighboring bins) or afterward, by
smoothing accumulator space. The latter might be done by
convolving with a Gaussian or another smoothing mask. Matlab's
function conv2 does arbitrary 2d convolution.
Matlab's fspecial function can create Gaussian masks
of any size and any standard deviation.
- The distribution of "votes" in parameter space can be complex enough
that multiple responses can arise from a single image curve. There are
a couple of methods to handle this that you may want to try:
- Backmapping refers to performing the Hough transform
twice. First, a parameter space is filled with evidence, as usual,
with each edge pixel (or feature) contributing votes along
some curve in parameter space.
The second transform uses a new parameter space. This time each edge pixel
(or feature) only contributes votes to one bin,
rather than to all of the bins along some curve in parameter space.
What happens is that the bins along that curve are
searched in the first parameter space and in order to find the bin
with the maximum
evidence. In the new parameter space, that bin receives all
of the votes of the edge pixel under consideration.
- Alternatively, the Hough transform can be performed once. When the
global maximum is found in parameter space, that curve is backprojected
into the original image. Then, all of the edge responses on or near
(within some threshold) of that curve remove their votes from
parameter space. This can be done by recomputing where they
would vote and subtracting, rather than adding. This is
faster (at the expense of more memory) if each pixel keeps a record
of how it distributed its votes; then, the subtraction is immediate.
Both of these techniques lead to sharper peaks in the accumulator, and thus
easier and more robust contour-finding.
-
The discussion so far has been of binary feature points casting binary votes.
Edge images usually have binary feature points because of the thresholding
done during edge detection. In many cases it is more effective
to use the strength of the gradient magnitude to weight the votes
cast in parameter space. You can find the gradient magnitude by estimating
the partial derivative of the intensity image in the x- and y-directions
and finding the length of the resulting 2x1 vector.
Possible Extensions
- RoboRef
Try to specialize your change-counting
system to determine whether the coins are heads or tails.
This might be useful for starting football games, for instance.
- Line finding
Using the alternative parameterization
("normal" form, r = x*cos(theta) + y*sin(theta) that
we discussed in class, find and display the dominant straight-line
edges of the images in /cs/cs153/Images/a4/lines.
Then, use the edge image (and/or the intensity image) to
crop those lines so that they are only drawn where they have
visual support (rather than across the entire intensity image).
This will provide a mathematically concise description of the
(straight-line) edges of the objects in the scenes; those edges,
in turn, can be used for reconstructing the
3d geometry of the scene (more on that later, however).
- Generalized Hough transform
The Hough transform requires
that the curves being sought have a known parameterization.
However, it is possible to use it for consistent 2d shapes
even if their parameterization is unknown, by constructing
an implicit representation through its boundary points.
This is known as the generalizes Hough transform.
Basically, given a 2d shape that you are seeking in images,
choose any arbitrary reference point on that shape. Find the
vector to that reference point from each point on the boundary of
the shape and store all of those vectors in a table, called an
R table.
Then,
assuming translation but no scaling and no rotation, an
edge pixel contributes a vote to each parameter space bin
that is a displacement of that pixel by some vector in the R table.
If scaling and rotation are allowed, the R-table vectors contribute
votes for each orientation and scale considered.
With a shape of your own choosing, implement the generalized Hough
transform (at least for translation-only) -- for example, this
would be an alternative method for finding the "squiggle" shapes in
the game of Set.
- The Hough transform -- in action
As always, I'm interested in display tools for showing the
process behind visual algorithms, for example in a tool that shows the
sequential changes to parameter space -- say, as the pixels in
the edge image changed color to indicate that they have been
used.
- Keep in mind, too, that an extension need not be any of these things. In fact,
it need not have to relate to the Hough transform.
If you'd like, you can hand in as an extension some preliminary coding/results
on your final project. (See the next section for some project possibilities.)