Alan Davidson, Mac Mason, Susanna Ricco, and Ben Tribelhorn
Weeks 8-9: Pictures and Movies
We began the second half of the semester by working on implementing
arrow detection and following ("Go that way, little robot!"). We also
wrote a program that plays Set as
an exercise in vision. Our code can be found in
setVideoTool.cpp. For the game of Set,
we had to define methods for finding the color, shape, texture, and number
of shapes on each card. Finding color was simply a matter of finding the correct
thresholds (one each for red, green, and purple) and counting
the number of pixels that match each threshold. To
determine shape, we "draw" three horizontal lines across the top part
of the image and compare the slopes at the first intersection points with
the shape. For the number of shapes, we look for a shape on specific
vertical lines. Given these same vertical lines, we look at a segment
of each line to determine texture.
We count the number of changes in color between neighboring pixels
and the number of pixels of the expected color. If there are many changes,
then we know the symbol is striped; if there are no changes (or too few), it is
either solidly colored or not (which can be determined by the number of colored pixels).
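As a concrete illustration of the texture test, here is a minimal sketch; the function name, the Texture labels, and the two tuning knobs are invented for this example and are not the actual code in setVideoTool.cpp:

```cpp
#include <vector>

// Hypothetical texture labels for a Set card symbol.
enum Texture { Solid, Striped, Open };

// Classify texture from a single segment of pixels through a symbol.
// isCardColor[i] is true when pixel i matched the card's color
// threshold. minChanges and minColored are illustrative thresholds.
Texture classifyTexture(const std::vector<bool>& isCardColor,
                        int minChanges, int minColored) {
    int changes = 0, colored = 0;
    for (std::size_t i = 0; i < isCardColor.size(); ++i) {
        if (isCardColor[i]) ++colored;
        if (i > 0 && isCardColor[i] != isCardColor[i - 1]) ++changes;
    }
    if (changes >= minChanges) return Striped;      // many transitions
    return (colored >= minColored) ? Solid : Open;  // few transitions
}
```

The same segment thus answers two questions at once: alternation frequency separates striped from the rest, and the raw colored-pixel count separates solid from open.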
Following arrows has been broken up into several stages:
Two videos of the robot approaching an arrow and turning to match it are
available in the week9 directory.
The newest versions of our tools can be found here: the client script and the VideoTool.
Weeks 10+: Pictures and Movies
- The first step, of course, is to decide "we're looking at an arrow!".
As of yet, we haven't done this. Most likely, it will be done by switching
into the "I see an arrow" state when more than a certain percentage of the
pixels on the screen are the "arrow" color. (See the discussion of
segmentation below.)
- Once an arrow has been found, we need to figure out where it is, and
which direction it's pointing.
- Actually deciding that "this pixel is part of an arrow" and "this
pixel isn't part of an arrow" is done by opening the image, and then
closing it several times. At the moment, we're working on a black arrow on
the floor in the hall, which has a large number of black specks, making
the opening especially important. An example of our black-detection in
progress is available here.
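A from-scratch sketch of one opening pass (erosion followed by dilation with a 3x3 element); the real tool presumably uses the Intel library's morphology routines, so everything named here is illustrative:

```cpp
#include <vector>

using Binary = std::vector<std::vector<int>>;  // 1 = "arrow" pixel

// One 3x3 morphological pass. Erosion keeps a pixel only if all nine
// neighbors are set (out-of-bounds counts as unset); dilation sets a
// pixel if any neighbor is set.
static Binary morph(const Binary& img, bool erodePass) {
    int h = (int)img.size(), w = (int)img[0].size();
    Binary out(h, std::vector<int>(w, 0));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            bool all = true, any = false;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    int ny = y + dy, nx = x + dx;
                    bool v = ny >= 0 && ny < h && nx >= 0 && nx < w
                             && img[ny][nx] != 0;
                    all = all && v;
                    any = any || v;
                }
            out[y][x] = erodePass ? all : any;
        }
    return out;
}

// Opening = erosion then dilation: isolated specks (like the black
// flecks on the hall floor) vanish, while larger blobs survive.
Binary opening(const Binary& img) {
    return morph(morph(img, true), false);
}
```

Repeating the pass several times, as we do, removes progressively larger specks at the cost of eroding fine detail on the arrow itself.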
- Once the arrow has been segmented out (oddly enough, by painting it
red and everything else black, even though the arrow is black, and
everything else isn't), we find the centroid of the arrow pixels. We use
the centroid in several ways:
- Centering: it's necessary that the arrow be directly in front of the
robot before we try to drive over it. With that in mind, we compare the
centroid's x-value to the center of the screen, and turn the robot a
smidge in the correct direction. Once
the centroid is within a threshold of the center, we consider it centered.
- Next, we use the centroid to make sure the robot is close enough to
the arrow. If the arrow is too far away, the foreshortening causes it to
look wider than it is tall, messing up our direction-finding routine.
Code for this is in place, but the "close enough" threshold still needs to be tuned.
- Lastly, we use the centroid to help determine the direction the
arrow is pointing, which deserves its own item in the outer list...
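Before moving on, the centroid and centering steps above can be sketched roughly as follows (the tolerance and the -1/0/+1 return convention are invented for this example):

```cpp
#include <vector>

struct Pixel { double x, y; };

// Mean position of all arrow-colored pixels.
Pixel centroid(const std::vector<Pixel>& arrowPixels) {
    Pixel c = {0.0, 0.0};
    for (const Pixel& p : arrowPixels) { c.x += p.x; c.y += p.y; }
    c.x /= arrowPixels.size();
    c.y /= arrowPixels.size();
    return c;
}

// Returns -1 to turn left, +1 to turn right, and 0 once the
// centroid's x-value is within tol pixels of the image center.
int centeringCommand(const Pixel& c, double imageWidth, double tol) {
    double offset = c.x - imageWidth / 2.0;
    if (offset < -tol) return -1;
    if (offset > tol) return 1;
    return 0;
}
```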
- Once we've found the centroid, we need to figure out which way the
arrow is pointing. At the moment, we make the assumption that the arrow is
pointing along the line defined by the longest connected radius of arrow
pixels, starting at the centroid. This is by no means a particularly good
heuristic, but it gets pretty close, and is much easier to implement
(simply by trying 100 different radii) than any of our more-complicated ideas.
- Once the longest diameter has been found, we construct the
perpendicular bisector to that line and count the arrow pixels on both
sides. The "heavier" of the two sides gets to be the direction the arrow
is pointing. This means that lollipops, wedges, and other such things get
to have a direction, too. (Not always the direction you'd expect, either)
An example of an arrow, with direction and (part of) the longest radius
plotted, is here.
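The two direction-finding steps above (pick the longest connected radius, then weigh the two sides of its perpendicular bisector) might be sketched like this; the pixel predicate, angle count, and radius cap are all assumptions for the example:

```cpp
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

// Try numAngles evenly spaced directions from the centroid (cx, cy)
// and return the angle whose connected run of arrow pixels is longest.
double longestRadiusAngle(const std::function<bool(int, int)>& inArrow,
                          double cx, double cy,
                          int numAngles, double maxR) {
    const double pi = std::acos(-1.0);
    double bestAngle = 0.0, bestLen = -1.0;
    for (int i = 0; i < numAngles; ++i) {
        double theta = 2.0 * pi * i / numAngles;
        double len = 0.0;
        // Walk outward until the ray leaves the arrow (or hits the cap).
        while (len < maxR &&
               inArrow((int)std::lround(cx + len * std::cos(theta)),
                       (int)std::lround(cy + len * std::sin(theta))))
            len += 1.0;
        if (len > bestLen) { bestLen = len; bestAngle = theta; }
    }
    return bestAngle;
}

// Count arrow pixels on each side of the perpendicular bisector
// through (cx, cy): if most of the mass lies behind the direction
// (dx, dy), flip it. Returns +1 (keep direction) or -1 (flip it).
int headSign(const std::vector<std::pair<int, int>>& pixels,
             double cx, double cy, double dx, double dy) {
    int ahead = 0, behind = 0;
    for (const auto& p : pixels) {
        double dot = (p.first - cx) * dx + (p.second - cy) * dy;
        (dot >= 0 ? ahead : behind)++;
    }
    return ahead >= behind ? 1 : -1;
}
```

This is also why lollipops and wedges get a direction: headSign only asks which half is heavier, not whether the blob looks like an arrowhead.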
- By setting the camera to a known angle relative to the floor, we can
use the y-coordinate of the centroid (and some fancy trig involving the
arrow pixels) to figure out both how far away the arrow is and in what
direction it is pointing. All that's left is to drive the robot that far forward
(remember that we've already centered), turn it that far, and repeat from the top.
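A sketch of the distance part of that trig, under the simplifying assumptions that the camera is a pinhole at height camHeight, tilted down by tiltRad, with a vertical field of view fovRad spread over imageHeight rows, and that the centroid lies on the floor (all names here are ours, not the code's):

```cpp
#include <cmath>

// Ground distance to a floor point imaged at pixel row yRow.
// The ray through that row points below the horizontal by the tilt
// angle plus the row's offset from the image center; a floor point
// on that ray is camHeight / tan(angle) away from the robot.
double groundDistance(double yRow, double imageHeight,
                      double tiltRad, double fovRad, double camHeight) {
    double radiansPerRow = fovRad / imageHeight;
    double angleBelowHorizon =
        tiltRad + (yRow - imageHeight / 2.0) * radiansPerRow;
    return camHeight / std::tan(angleBelowHorizon);
}
```

The small-angle-per-row approximation is fine for a narrow field of view; rows lower in the image come out closer to the robot, as expected.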
We have improved our arrow-finding code to allow different colored arrows to
be followed (although we did have a mishap in which the robot
followed a stuffed dinosaur as if it were a red arrow).
We have added a dynamic set of ranges for different object colors.
At runtime, the program now allows the color ranges to be adjusted, and,
specific to the arrow-following routine, we can change the number of openings and closings we perform.
However, the main functional change was in our client script. We added a new
state for our robot entitled Linear Wandering, so that when the robot hits objects
while following the arrows it will head straight rather than turn around and lose
its sense of directional purpose.
Object recognition has begun with many of the Wal-Mart objects linked to above.
We created color profiles for most of these objects at runtime (which are
saved to file). Much to our dismay, we learned that in the lower lighting
of the corridor, the colors orange, pink, and red are the same to our cameras.
Currently, we have assumed that arrow following implies an object at the end
of the line. So, when we find a bunch of "accepted" pixels, we
check the number of pixels against a threshold. If it meets that criterion, then
we calculate the object's distance in front of the robot and draw a star on the
map. Next, we plan on using shape to
differentiate objects. Eventually, we would like to be generating the map
ourselves, but currently we only have odometry to go by.
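The accept-and-mark step might look like the following sketch (the threshold, the pose and map units, and all the names are illustrative):

```cpp
#include <cmath>

struct MapMark { bool found; double mapX, mapY; };

// If enough "accepted" pixels were seen, estimate the object's map
// position from the robot's pose and the computed distance, so the
// caller can draw its star there.
MapMark markObject(int acceptedPixels, int minPixels,
                   double distance, double headingRad,
                   double robotX, double robotY) {
    if (acceptedPixels < minPixels) return {false, 0.0, 0.0};
    return {true,
            robotX + distance * std::cos(headingRad),
            robotY + distance * std::sin(headingRad)};
}
```

Because the pose comes from odometry alone, the stars drift with the robot's accumulated position error; that is the limitation the planned map-building work would address.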
Finally, we have mounted a second camera on our robot. Unfortunately, we have yet
to find a programmatic way to select the camera we want to use. Instead, at runtime
we are asked which camera to use. Ben thinks that this is a limitation of the Intel
library we are using. The purpose of this camera is to increase the robot's
field of view, because the arrow-following camera is angled downwards and highly
calibrated. We would also like to make our second camera tilt, which would enable us
to look upwards for future identification of people (by tags, shirt color, hat, etc).
Here is our entry submission to the AAAI Conference,
which we sent in on May 2nd.
The latest versions of our tools can be found here: the
client script and the VideoTool.