CS182a Assignment 1

Harvey Mudd College
Computer Science 182a
Project 1
Due Sunday, January 30, by 11:59pm

"If only Minsky had been at Mudd..."

Goals

This assignment simply seeks to familiarize you with the environments in which you may choose to work: OpenCV (a C++ and C based library that is the most commonly used for real-time vision applications) and Matlab (everyone's -- well, at least the computer vision community's -- favorite mathematical and prototyping tool). Another option is the Python Imaging Library, PIL.

Just to keep it interesting, however, the context for getting into these systems will be to solve a crucial piece of the vision problem, i.e., to take in arbitrary images and indicate where the spam is located in that image. Don't worry -- your program will not have to work perfectly to receive full credit -- but it will have to work at least sometimes on images that it has not "seen before." You'll also implement perhaps the very first widely used vision algorithm, greenscreening.

Getting started with Matlab

If you choose Matlab -- the language of choice for the vast majority of computer vision researchers -- you may want to head to this Matlab link for a quick review of Matlab and an introduction to its image-processing support, which is quite substantial. Another link that is worth reviewing is Svetlana Lazebnik's Matlab intro, especially sections 5 and 6.

Want to use OpenCV?

You're in luck! The OpenCV library is in MUCH better shape than in the past, thanks to the remarkable company, Willow Garage. The OpenCV wiki has installation instructions and documentation.

I have a starter project for the Mac: linked here.
The video clip spam.mp4 is also here.

This will work if you unzip it on one of the lab Macs, double-click on the xcodeproj file, and then choose build and run. It has a simple color-thresholding system at the moment; mostly, it's meant to get you started. I have not ported this to Windows, but I suspect it would not be too bad. In fact, for Windows, the easiest way to get started is with the VS2010 version, which comes with all of the libraries already built.

Want to use the Python Imaging Library?

This is OK, too! It's installed on the CS machines and not too bad to install on your own. You can grab it by Googling for PIL. The BioCS7 course in the Spring of 2010 used PIL -- you might start with this link (and certainly the online documentation): Introduction via BioCS7's wiki site problem.

The challenge

First, either get Matlab or OpenCV or PIL working to the point at which you can load, display, and modify images.

Next, choose an image -- perhaps one of the spam images available on the CS machines (knuth) at /cs/cs153/Images/spamImages2011 -- or another image of your own choosing.

Write small functions/programs that alter that image in the following ways:
- Invert all of the pixels in (R,G,B) space, i.e., change the value of each of those three channels for each pixel in the image from x to 255-x. Save the resulting image for inclusion in your write-up.
- Convert your image to (H,S,V) space and then create versions of the "Hue" image, the "Saturation" image, and the "Value" image from your original. Each of these three images should be greyscale and should represent the hue, saturation, and value quantity of each pixel as an intensity from 0 (min) to 255 (max). There are different choices of what the minimum hue is -- any choice is OK, as long as you explain it in your web/wiki page. Save these three images for your write-up.
- Create a three-band color image such that the three color bands are some unusual combination of R, G, B, H, S, and V. Ideally, someone who could see the original image could figure out what you had put in each band -- but not necessarily easily! For example, you could use 255-R in the red channel, S in the green channel, and B in the blue channel (no change). Put this new, composite image in your write-up (and the "solution" of how it was created!)
- Greenscreening: Next, write a function that takes in an image and "greenscreens" it. That is, it removes all of the green pixels by changing them all to black or all to white (or all to another uniform non-green color). Through this process, you will have segmented the figure (or object) away from the background (the green screen). (The greenscreen images are located in /cs/cs153/Images/greenscreen2011.) Put the result into your write-up.
- Finding spam: Last, but not least, write a script/function/program that identifies the spam in an image. To show this, your program should draw a plus at the center of the word "SPAM" -- in addition, your program should print out the location of that point to the console.
  
  For this problem, you may assume the image contains a can of spam, in a roughly frontal view. Thus, your algorithm can choose the "most spammy" spot (even if there is no spam). Gather at least two images in which your algorithm worked, and two in which it did not work, to post to your write-up. Also, show the results on two images (of your choice) where there was no spam to be found, i.e., what does your spam-finder find spammy about those images?
  
  One note: your program should not simply guess where the spam is! It should run a deterministic algorithm that uses the values of the input image's pixels to locate the spam.
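
The per-pixel steps above can be sketched in plain Python, independent of whichever library you chose. This is only a starting point under stated assumptions, not a reference solution: an image is assumed to be a list of rows of (r, g, b) tuples, the names invert, hsv_images, is_green, greenscreen, find_center, and draw_plus are all hypothetical, and the green test and its margin of 40 are guesses that will likely need tuning on the real greenscreen images. The matches argument of find_center stands in for a spam test that you design yourself.

```python
import colorsys

# Assumption for this sketch: an image is a list of rows, each row a list of
# (r, g, b) tuples with values 0-255. Loading and saving are left to whichever
# library you chose (Matlab, OpenCV, or PIL).


def invert(img):
    """Change every channel value x to 255 - x."""
    return [[(255 - r, 255 - g, 255 - b) for r, g, b in row] for row in img]


def hsv_images(img):
    """Return greyscale H, S, and V images, each scaled to 0..255.

    colorsys puts hue 0 at red; that is this sketch's choice of minimum hue.
    """
    hue, sat, val = [], [], []
    for row in img:
        hsv_row = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
                   for r, g, b in row]
        hue.append([round(h * 255) for h, _, _ in hsv_row])
        sat.append([round(s * 255) for _, s, _ in hsv_row])
        val.append([round(v * 255) for _, _, v in hsv_row])
    return hue, sat, val


def is_green(r, g, b, margin=40):
    """Hypothetical green test: G beats both R and B by at least `margin`."""
    return g >= r + margin and g >= b + margin


def greenscreen(img, fill=(0, 0, 0)):
    """Replace every pixel that passes is_green with a uniform fill color."""
    return [[fill if is_green(*p) else p for p in row] for row in img]


def find_center(img, matches):
    """Return the (x, y) centroid of pixels where matches(r, g, b) is true.

    Returns None when no pixel matches.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(img):
        for x, (r, g, b) in enumerate(row):
            if matches(r, g, b):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x // count, sum_y // count


def draw_plus(img, cx, cy, arm=2, color=(255, 255, 255)):
    """Return a copy of img with a plus sign centered at (cx, cy).

    Assumes (cx, cy) lies inside the image; arms are clipped at the edges.
    """
    out = [list(row) for row in img]
    height, width = len(out), len(out[0])
    for d in range(-arm, arm + 1):
        if 0 <= cx + d < width:
            out[cy][cx + d] = color
        if 0 <= cy + d < height:
            out[cy + d][cx] = color
    return out
```

A spam-finder built on this would call find_center with its own pixel test, print the returned point to the console, and pass it to draw_plus. A plain centroid is easily pulled off target by stray matching pixels elsewhere in the image, which is one thing worth discussing in the write-up.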

Deliverables

Write-up: By Sunday night (at 11:59 pm), you should create a web or wiki page with at least the various images mentioned in the description above.

In addition, include a write-up of a paragraph or so explaining your spam-finding algorithm and green-screening algorithm. Note when your approaches work well -- and when they don't. Also, include a suggestion about how your spam-finder could be improved.

Link your write-up to the CS 182 wiki page -- let me know if you run into permission problems. Please do include a zip file of your source files/code along with your write-up. (Only the source -- avoid libraries or executables.)

You may use the CS wiki to post your projects, but other pages are all right, too. Here are some sites that worked well (and whose projects worked well) in the past:
You may certainly refer to earlier projects, both in robotics and in vision, for inspiration and ideas. In fact, if you choose to use a robot that students have used before, I strongly encourage you to read the write-ups of prior years' teams: you will definitely want to build on their progress. However, the vision code that you submit for particular problems, such as the spam-finding and green-screening and the like, should be entirely your own. Here is the page linking to all prior years' projects:
student projects page.

Possible extensions

The creative-extension part of this assignment can be in many forms -- feel free to check with me if you're not sure.

- You might show visualizations of some of the intermediate processing that led to your algorithm's final result.
- Add the capability of inserting the pixels from another image as background for your green-screened image. Put the composited image (with the background taken from another picture) into your write-up. Other variations are welcome.
- Create and explain a method that detects spam, alongside the location mechanism described above. That is, your system will first determine whether there is a can of spam in the image, and, only if it decides there is one, will it mark its center.
- Determine the bounding box for the can of spam, drawing a box rather than a plus sign at the center. You may decide how much of the spam you want to enclose in the box -- but you need to decide that before declaring success!
- Choose a robot platform and get it moving and sensing. Include a short video of the robot moving around and evidence of the sensor data you can access, e.g., from a screenshot or image or an attached file.
- If it was tricky to do, create a wiki page describing how you got OpenCV working -- especially if you got it working with Python -- on your system! Feel free to link to other instructions and include notes on any changes you had to make for your system. This is likely to be a helpful resource to other folks!
- Other ideas welcome :-)
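
Three of the extensions above (background compositing, detection, and the bounding box) can be sketched in plain Python as well. As before, this is a rough illustration rather than a prescribed approach: an image is assumed to be a list of rows of (r, g, b) tuples, the two images passed to composite are assumed to have the same dimensions, and every function name, the green margin of 40, and the 2% detection threshold are hypothetical choices made only for this example.

```python
# Assumption for this sketch: an image is a list of rows, each row a list of
# (r, g, b) tuples with values 0-255.


def is_green(r, g, b, margin=40):
    """Hypothetical green test: G beats both R and B by at least `margin`."""
    return g >= r + margin and g >= b + margin


def composite(foreground, background):
    """Fill the green pixels of foreground with the background's pixels.

    Both images are assumed to have identical width and height.
    """
    return [
        [bg if is_green(*fg) else fg for fg, bg in zip(f_row, b_row)]
        for f_row, b_row in zip(foreground, background)
    ]


def match_fraction(img, matches):
    """Fraction of pixels for which matches(r, g, b) is true."""
    total = sum(len(row) for row in img)
    hits = sum(1 for row in img for p in row if matches(*p))
    return hits / total if total else 0.0


def contains_target(img, matches, min_fraction=0.02):
    """Crude detector: report a target when enough pixels pass the test."""
    return match_fraction(img, matches) >= min_fraction


def bounding_box(img, matches):
    """Return (left, top, right, bottom), inclusive, of the matching pixels.

    Returns None when no pixel matches.
    """
    coords = [(x, y)
              for y, row in enumerate(img)
              for x, p in enumerate(row)
              if matches(*p)]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)
```

Because bounding_box encloses every matching pixel, a single stray match can stretch the box across the whole image; discarding outliers first is one possible refinement.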
