Computer Science 154
Short Assignment 8


Visual Tracking

Ryan Gibson, Paul Ruvolo, Conor Sen


Stand-Alone Tracking Code: track.cc
Nomad-Linked Tracking Code: nomadTrack.cc

The first problem we encountered was that, given the RGB values of a pixel, we needed an accurate way to compute that pixel's hue. Because the frisbee is red, its hue values should be somewhere near 0 or 360 degrees. Thus, by calculating the hue of each pixel in the image, we have a means of isolating red pixels. After a bit of searching on Google, we came across a set of code that calculates a hue value from a set of RGB values.
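For illustration, here is a minimal sketch of the standard RGB-to-hue conversion. It is not the code found through the search above. The function name rgbToHue, the 8-bit channel range, and the -1 return value for gray pixels are assumptions made for this example.

```cpp
#include <algorithm>
#include <cmath>

// Returns the hue in degrees, in [0, 360), for 8-bit RGB input.
// Returns -1.0 for gray pixels (r == g == b), where hue is undefined.
// (The -1.0 sentinel is an assumption of this sketch.)
double rgbToHue(int r, int g, int b) {
    const int maxC = std::max({r, g, b});
    const int minC = std::min({r, g, b});
    const int delta = maxC - minC;
    if (delta == 0) return -1.0;

    double hue;
    if (maxC == r) {
        // Red is dominant: hue lies between magenta and yellow.
        hue = 60.0 * (static_cast<double>(g - b) / delta);
    } else if (maxC == g) {
        // Green is dominant: hue lies between yellow and cyan.
        hue = 60.0 * (static_cast<double>(b - r) / delta + 2.0);
    } else {
        // Blue is dominant: hue lies between cyan and magenta.
        hue = 60.0 * (static_cast<double>(r - g) / delta + 4.0);
    }

    // Hues just below red come out negative; wrap them to just under 360.
    if (hue < 0.0) hue += 360.0;
    return hue;
}
```

Pure red maps to 0, and reds with a slight blue tint wrap around to values just under 360, which is why a red test has to accept both ends of the range.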

Once we had this piece of code, we began with a naive algorithm that approximated the center of the frisbee by taking the average x and y values of all the pixels that had hue values near 0 (greater than 357 or less than 3). However, this algorithm clearly runs into trouble when the smaller frisbee (the distractor) is introduced into the image. The red pixels in the distractor tend to pull the approximation in the direction of the distractor.
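The naive centroid step can be sketched roughly as follows. The hue thresholds (greater than 357 or less than 3) come from the text. The row-major hue buffer, the names Centroid, isRedHue, and redCentroid, and the convention that a negative hue means "undefined" are assumptions for this example.

```cpp
#include <cstddef>
#include <vector>

struct Centroid {
    double x;
    double y;
    bool valid;  // false when the image contains no red pixels
};

// Red test from the text: hue greater than 357 or less than 3.
// Negative hues are treated as "undefined" (an assumption of this sketch).
bool isRedHue(double hue) {
    return hue >= 0.0 && (hue > 357.0 || hue < 3.0);
}

// hue holds one value per pixel in row-major order (width * height entries).
Centroid redCentroid(const std::vector<double>& hue, int width, int height) {
    double sumX = 0.0;
    double sumY = 0.0;
    long count = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i =
                static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                static_cast<std::size_t>(x);
            if (isRedHue(hue[i])) {
                sumX += x;
                sumY += y;
                ++count;
            }
        }
    }
    if (count == 0) return Centroid{0.0, 0.0, false};
    return Centroid{sumX / count, sumY / count, true};
}
```

Because every red pixel counts equally, a second red object anywhere in the frame shifts the average toward itself, which is the failure described above.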

Next, we turned our base algorithm into a tracking algorithm by following the frisbee's location from frame to frame. Because the frisbee moves only a finite distance from one frame to the next, we decided that we could safely ignore any red pixels that were outside a given radius of the frisbee's last location. For the most part, this prevented the distractor from pulling the target away from the frisbee.
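A rough sketch of this radius gating is shown below. It assumes the red pixels have already been collected into a list. The names RedPixel, Estimate, and centroidNear are hypothetical, and the radius is a parameter because the text does not state the value used.

```cpp
#include <vector>

struct RedPixel {
    int x;
    int y;
};

struct Estimate {
    double x;
    double y;
    bool found;  // false when no red pixel lies within the radius
};

// Averages only the red pixels within `radius` of the previous location.
// If none qualify, the previous location is returned with found == false
// (a fallback chosen for this sketch).
Estimate centroidNear(const std::vector<RedPixel>& redPixels,
                      double prevX, double prevY, double radius) {
    const double radiusSq = radius * radius;
    double sumX = 0.0;
    double sumY = 0.0;
    int count = 0;
    for (const RedPixel& p : redPixels) {
        const double dx = p.x - prevX;
        const double dy = p.y - prevY;
        // Compare squared distances to avoid a square root per pixel.
        if (dx * dx + dy * dy > radiusSq) continue;
        sumX += p.x;
        sumY += p.y;
        ++count;
    }
    if (count == 0) return Estimate{prevX, prevY, false};
    return Estimate{sumX / count, sumY / count, true};
}
```

Calling this once per frame, with the previous frame's estimate as (prevX, prevY), gives the frame-to-frame tracking behavior described above.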

However, when the distractor is placed in close proximity to the frisbee, the tracking algorithm fails and in some cases identifies the distractor as the frisbee. To take care of this, we weighted the red pixels by their distance from the center of the frisbee's previous location. Thus, red pixels relatively far away from the frisbee's last location have less effect on the new target approximation.
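The text does not say which weighting function was used, so the sketch below assumes a Gaussian falloff with a hypothetical width parameter sigma. The names WeightedPixel, WeightedEstimate, and weightedCentroid are also made up for this example.

```cpp
#include <cmath>
#include <vector>

struct WeightedPixel {
    int x;
    int y;
};

struct WeightedEstimate {
    double x;
    double y;
    bool found;  // false when the total weight is zero
};

// Weighted average of red pixel positions. Each pixel's weight is
// exp(-d^2 / (2 * sigma^2)), where d is its distance from the previous
// location. The Gaussian form is an assumption; any weight that
// decreases with distance fits the description in the text.
WeightedEstimate weightedCentroid(const std::vector<WeightedPixel>& redPixels,
                                  double prevX, double prevY, double sigma) {
    double sumX = 0.0;
    double sumY = 0.0;
    double totalWeight = 0.0;
    for (const WeightedPixel& p : redPixels) {
        const double dx = p.x - prevX;
        const double dy = p.y - prevY;
        const double w = std::exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma));
        sumX += w * p.x;
        sumY += w * p.y;
        totalWeight += w;
    }
    // Weights can underflow to zero when every pixel is very far away.
    if (totalWeight <= 0.0) return WeightedEstimate{prevX, prevY, false};
    return WeightedEstimate{sumX / totalWeight, sumY / totalWeight, true};
}
```

With a plain average, a red pixel 10 units away pulls the estimate halfway toward itself. With this weighting and sigma = 5, the same pixel moves the estimate by only a little over one unit.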

Our final improvement consisted of looking at grids (cells) of pixels and taking the center to be the location of the cell with the best score. For instance, starting at pixel (0,0), look at all pixels from (0,0) to (cellSize, cellSize) and calculate a score for that cell. Next, look at all pixels from (0,1) to (cellSize, cellSize+1), and so on, sliding the cell one pixel at a time. Keep track of the maximum score and its location, and when we're done, draw the cross at that location as our calculated center.
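A brute-force sketch of this sliding-cell search follows. The text does not define the score, so this example assumes it is simply the number of red pixels in the cell. It also assumes a cell covers cellSize by cellSize pixels and that the image is given as a row-major red mask. The names BestCell and findBestCell are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct BestCell {
    int centerX;
    int centerY;
    int score;  // -1 when no cell fits inside the image
};

// redMask holds one entry per pixel in row-major order; nonzero means red.
// Slides a cellSize x cellSize window one pixel at a time and keeps the
// window containing the most red pixels (ties keep the first window found).
BestCell findBestCell(const std::vector<unsigned char>& redMask,
                      int width, int height, int cellSize) {
    BestCell best{-1, -1, -1};
    for (int top = 0; top + cellSize <= height; ++top) {
        for (int left = 0; left + cellSize <= width; ++left) {
            int score = 0;
            for (int y = top; y < top + cellSize; ++y) {
                for (int x = left; x < left + cellSize; ++x) {
                    const std::size_t i =
                        static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                        static_cast<std::size_t>(x);
                    if (redMask[i]) ++score;
                }
            }
            if (score > best.score) {
                best = BestCell{left + cellSize / 2, top + cellSize / 2, score};
            }
        }
    }
    return best;
}
```

A cell sized to match the frisbee will score higher over the frisbee than over the smaller distractor, since the distractor cannot fill the cell with red pixels.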

All in all, we think our algorithm is fairly effective. It definitely tracks the frisbee from frame to frame without getting too distracted by the smaller frisbee.