Alan Davidson, Mac Mason, Susanna Ricco, and Ben Tribelhorn

Weeks 8-9: Pictures and Movies

We began the second half of the semester by working on implementing arrow detection and following ("Go that way, little robot!"). We also wrote a program that plays Set as an exercise in vision; our code can be found at setVideoTool.cpp. For the game of Set, we had to define methods for finding the color, shape, texture, and number of shapes on each card. Finding color was simply a matter of choosing the correct thresholds (one each for red, green, and purple) and counting the number of pixels that match each threshold. To determine shape, we "draw" three horizontal lines across the top part of the image and compare the slopes through the first intersection points with each candidate shape. For the number of shapes, we look for a shape on specific vertical lines. Given those same vertical lines, we examine a segment on each line to determine texture: we count the number of color changes between neighboring pixels and the number of pixels of the expected color. Many changes mean the card is striped; few or none mean it is either solid or open, which the count of colored pixels then decides.
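The texture test above can be sketched as a scan along one thresholded segment (a minimal illustration; the function name and the transition/coverage cutoffs are made up, not our tuned values):

```cpp
#include <string>
#include <vector>

// Walk a horizontal segment of thresholded pixels (true = "card color")
// and count transitions between colored and uncolored neighbors, plus
// total colored pixels, to classify the card's texture.
std::string classifyTexture(const std::vector<bool>& segment) {
    int transitions = 0, colored = 0;
    for (size_t i = 0; i < segment.size(); ++i) {
        if (segment[i]) ++colored;
        if (i > 0 && segment[i] != segment[i - 1]) ++transitions;
    }
    if (transitions > 4) return "striped";            // many changes
    if (colored > (int)segment.size() / 2) return "solid";
    return "open";                                    // outline only
}
```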

Following arrows has been broken up into several stages:

  1. The first step, of course, is to decide "we're looking at an arrow!". As of yet, we haven't done this. Most likely, it will be done by switching into the "I see an arrow" state when more than a certain percentage of the pixels on the screen are the "arrow" color. (See discussion of segmentation below)
  2. Once an arrow has been found, we need to figure out where it is, and which direction it's pointing.
  3. Actually deciding that "this pixel is part of an arrow" and "this pixel isn't part of an arrow" is done by opening the image, and then closing it several times. At the moment, we're working on a black arrow on the floor in the hall, which has a large number of black specks, making the opening especially important. An example of our black-detection in progress is available here.
  4. Once the arrow has been segmented out (oddly enough, by painting it red and everything else black, even though the arrow is black, and everything else isn't), we find the centroid of the arrow pixels. We use the centroid in several ways:
    1. Centering: it's necessary that the arrow be directly in front of the robot before we try to drive over it. With that in mind, we compare the centroid's x-value to the center of the screen, and turn the robot a smidge in the correct direction. Once the centroid is within a threshold of the center, we consider it centered.
    2. Next, we use the centroid to make sure the robot is close enough to the arrow. If the arrow is too far away, the foreshortening causes it to look wider than it is tall, messing up our direction-finding routine. Code for this is in place, but the "close enough" threshold still needs work.
    3. Lastly, we use the centroid to help determine the direction the arrow is pointing, which deserves its own item in the outer list...
  5. Once we've found the centroid, we need to figure out which way the arrow is pointing. At the moment, we make the assumption that the arrow is pointing along the line defined by the longest connected radius of arrow pixels, starting at the centroid. This is by no means a particularly good heuristic, but it gets pretty close, and is much easier to implement (simply by trying 100 different radii) than any of our more-complicated ideas.
  6. Once the longest diameter has been found, we construct the perpendicular bisector to that line and count the arrow pixels on both sides. The "heavier" of the two sides gets to be the direction the arrow is pointing. This means that lollipops, wedges, and other such things get to have a direction, too. (Not always the direction you'd expect, either) An example of an arrow, with direction and (part of) the longest radius plotted, is here.
  7. By setting the camera to a known angle relative to the floor, we can use the y-coordinate of the centroid (and some fancy trig involving the arrow pixels) to figure out both how far away the arrow is pointing, and in what direction. All that's left is to drive the robot that far forward (remember that we've already centered), turn it that far, and repeat from the beginning.
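The centering step above (4.1) boils down to comparing the centroid's x-value to the screen center; a minimal sketch, with illustrative names and thresholds:

```cpp
#include <cstdlib>
#include <vector>

struct Point { int x, y; };

// Returns -1 to turn left, +1 to turn right, or 0 once the arrow's
// centroid is within `tolerance` pixels of the image center.
int centeringTurn(const std::vector<Point>& arrowPixels,
                  int imageWidth, int tolerance) {
    if (arrowPixels.empty()) return 0;
    long sumX = 0;
    for (const Point& p : arrowPixels) sumX += p.x;
    int centroidX = (int)(sumX / (long)arrowPixels.size());
    int offset = centroidX - imageWidth / 2;
    if (std::abs(offset) <= tolerance) return 0;  // centered: drive on
    return offset < 0 ? -1 : +1;                  // turn a smidge
}
```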
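Steps 5 and 6 together give the direction heuristic; here is a sketch on a boolean occupancy grid (true = arrow pixel). The structure follows the description above, but the details (grid representation, exact ray-walking) are illustrative:

```cpp
#include <cmath>
#include <vector>

// Try `numAngles` radii out from the centroid (cx, cy), keep the
// longest run of connected arrow pixels, then weigh the arrow pixels
// on either side of the perpendicular through the centroid; the
// heavier side is the pointing direction. Returns an angle in radians.
double arrowDirection(const std::vector<std::vector<bool>>& grid,
                      double cx, double cy, int numAngles = 100) {
    int h = (int)grid.size(), w = (int)grid[0].size();
    double bestAngle = 0.0, bestLen = -1.0;
    for (int i = 0; i < numAngles; ++i) {
        double a = 2.0 * M_PI * i / numAngles;
        double len = 0.0;
        // walk outward until we leave the arrow or the image
        for (double r = 1.0; ; r += 1.0) {
            int x = (int)std::lround(cx + r * std::cos(a));
            int y = (int)std::lround(cy + r * std::sin(a));
            if (x < 0 || x >= w || y < 0 || y >= h || !grid[y][x]) break;
            len = r;
        }
        if (len > bestLen) { bestLen = len; bestAngle = a; }
    }
    // Count arrow pixels on each side of the perpendicular bisector:
    // project each pixel onto the longest-radius direction.
    long ahead = 0, behind = 0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (grid[y][x]) {
                double dot = (x - cx) * std::cos(bestAngle)
                           + (y - cy) * std::sin(bestAngle);
                (dot >= 0 ? ahead : behind)++;
            }
    return ahead >= behind ? bestAngle : bestAngle + M_PI;
}
```

As noted above, this also assigns a direction to lollipops and wedges, since any heavier side wins.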
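The "fancy trig" in step 7 reduces to one tangent once the camera's height and tilt are known: a pixel row below the image center adds to the downward angle, and the ground distance falls out. A hedged sketch (all camera parameters here are placeholders, not our calibration):

```cpp
#include <cmath>

// With the camera a known height above the floor and tilted down by a
// known angle, map the centroid's pixel row to a ground distance.
double groundDistance(double cameraHeight,   // meters above the floor
                      double tiltDown,       // radians below horizontal
                      double centroidY,      // pixel row of the centroid
                      double imageHeight,    // image height in pixels
                      double verticalFov) {  // vertical field of view, radians
    // angle of this pixel row relative to the optical axis
    double pixelAngle = ((centroidY - imageHeight / 2.0) / imageHeight)
                        * verticalFov;
    return cameraHeight / std::tan(tiltDown + pixelAngle);
}
```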
Two videos of the robot approaching an arrow and turning to match it are available in the week9 directory. Newest versions of our tools can be found here: the client script and the videotool.

Weeks 10+: Pictures and Movies

We have improved our arrow-finding code to allow different colored arrows to be followed (although we did have a mishap in which the robot followed a stuffed dinosaur as if it were a red arrow). We have added a dynamic set of ranges for different object colors. At runtime, the program now allows color ranges to be adjusted, and, specific to the arrow-following routine, we can change the number of openings and closings we perform. However, the main functional change was in our client script. We added a new state for our robot called Linear Wandering, so that when the robot hits objects while following arrows it heads straight rather than turning around and losing its sense of directional purpose.
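The client-script change amounts to one extra transition in the robot's state machine; a minimal sketch (state names other than Linear Wandering are illustrative, not the client's actual identifiers):

```cpp
// On a bump, a robot that was following an arrow switches to straight-
// line wandering instead of the usual turn-around wander, preserving
// its sense of direction.
enum class State { FollowArrow, LinearWander, RandomWander };

State onBump(State current) {
    if (current == State::FollowArrow) return State::LinearWander;
    return State::RandomWander;
}
```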

Object recognition has begun with many of the Wal-Mart objects linked to here. We created color profiles for most of these objects at runtime (which are saved to file). Much to our dismay, we learned that in the lower lighting of the corridor, the colors orange, pink, and red look the same to our cameras. Currently, we assume that arrow following implies an object at the end of the line. So, when we find a bunch of "accepted" pixels, we check the number of pixels against a threshold. If it meets that criterion, then we calculate the object's distance in front of the robot and draw a star on the map. Next, we plan on using shape to differentiate objects. Eventually, we would like to be generating the map, but currently we only have odometry to go by.
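The detection check above is just a pixel count against a threshold; a minimal sketch (the function name and the default threshold are illustrative):

```cpp
#include <vector>

// Count pixels matched by an object's color profile; past a threshold,
// report a detection so the client can project a distance and draw a
// star on the map.
bool objectDetected(const std::vector<bool>& acceptedPixels,
                    int minPixels = 200) {
    int count = 0;
    for (bool p : acceptedPixels) if (p) ++count;
    return count >= minPixels;
}
```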

Finally, we have mounted a second camera on our robot. Unfortunately, we have yet to find a programmatic way to select the camera we want to use. Instead, at runtime we are asked which camera to use; Ben thinks this is a limitation of the Intel library we are using. The purpose of this camera is to widen the robot's field of view, since the arrow-following camera is angled downward and highly calibrated. We would also like to make the second camera tilt, which would enable us to look upward for future identification of people (by tags, shirt color, hat, etc.).

Here is the entry submission to the AAAI Conference which we sent in on May 2nd.
The latest versions of our tools can be found here: the client script and the VideoTool.