Harvey Mudd College
Computer Science 182a
Project 1
Due Sunday, January 30, by 11:59pm
"If only Minsky had been at Mudd..."
Goals
This assignment simply seeks to familiarize you with the
environments in which you may choose to work: OpenCV (a C and C++
library that is the one most commonly used for real-time vision
applications) and Matlab (everyone's -- well, at least the computer
vision community's -- favorite mathematical and prototyping tool).
Another option is the Python Imaging Library, PIL.
Just to keep it interesting, however, the context for getting
into these systems will be to solve a crucial piece of the vision problem,
i.e., to take in arbitrary images and indicate where the spam
is located in that image. Don't worry -- your program will not have to work perfectly
to receive full credit -- but it will have to work at least sometimes on images
that it has not "seen before." You'll also implement what is perhaps one of the first
widely used vision algorithms: greenscreening.
Getting started with Matlab
If you choose Matlab -- the language of choice for the vast majority
of computer vision researchers -- you may want to
head to this Matlab link for a quick
review of Matlab and an introduction to its image-processing support,
which is quite substantial. Another link that is worth reviewing
is
Svetlana Lazebnik's Matlab intro, especially sections 5 and 6.
Want to use OpenCV?
You're in luck! The OpenCV library is in MUCH better shape than in the past,
thanks to the remarkable company, Willow Garage.
The OpenCV wiki
has installation instructions and documentation.
I have a starter project for the Mac: linked here.
The video clip spam.mp4 is also here.
This will work if you unzip it on one of the lab Macs, double-click on the
xcodeproj file, and then choose build and run. It has a
simple color-thresholding system at the moment; mostly, it's meant to get you
started. I have not ported this to Windows, but I suspect it would not be too bad.
In fact, for Windows, the easiest way to get started is with the VS2010 version,
which comes with all of the libraries already built.
Want to use the Python Imaging Library?
This is OK, too! It's installed on the CS machines and not
too bad to install on your own. You can grab it by Googling for PIL.
The BioCS7 course in the Spring of 2010 used PIL -- you might start with these
links (and certainly the online documentation):
Introduction via BioCS7's wiki site problem.
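As a rough starting point for PIL, the sketch below loads an image, modifies every pixel, saves the result, and opens a viewer. The helper names (dim, dim_image) and the darkening factor are made up for illustration, and the import assumes the PIL package (or its Pillow fork) is installed.

```python
def dim(rgb, factor=0.5):
    """Scale one (R, G, B) pixel toward black; `factor` is an illustrative knob."""
    return tuple(int(c * factor) for c in rgb)


def dim_image(in_path, out_path):
    """Load an image, darken every pixel, save it, and pop up a viewer."""
    from PIL import Image  # imported here so `dim` works even without PIL

    img = Image.open(in_path).convert("RGB")
    width, height = img.size
    pixels = img.load()  # pixel-access object, indexed as pixels[x, y]
    for y in range(height):
        for x in range(width):
            pixels[x, y] = dim(pixels[x, y])
    img.save(out_path)
    img.show()
```

Calling dim_image with an input file of your own and an output file name should be enough to confirm that loading, modifying, and displaying all work on your machine.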
The challenge
- First, get Matlab, OpenCV, or PIL working to the point
at which you can load, display, and modify images.
- Next, choose an image -- perhaps one of the spam images available on the CS
machines (knuth) at /cs/cs153/Images/spamImages2011 --
or another image of your own choosing.
Write small functions/programs that alter that image in the following ways:
- Invert all of the pixels in (R,G,B) space, i.e., change the value of
each of those three channels for each pixel in the image from x
to 255-x. Save the resulting image for inclusion in your write-up.
- Convert your image to (H,S,V) space and then create versions of the
"Hue" image, the "Saturation" image, and the "Value" image
from your original. Each of these three images should be greyscale and should
represent the hue, saturation, and value quantity of each pixel as an
intensity from 0 (min) to 255 (max). There are different choices of
what the minimum hue is -- any choice is OK, as long as you explain it
in your web/wiki page. Save these three images for your write-up.
- Create a three-band color image such that the three color bands are
some unusual combination of R, G, B, H, S, and V. Ideally,
someone who could see the original image could figure out what you
had put in each band -- but not necessarily easily!
For example, you could use 255-R in the red channel, S in the
green channel, and B in the blue channel (no change).
Put this new, composite image in your write-up (and the "solution"
of how it was created!)
- Greenscreening: Next, write a function that takes
in an image and "greenscreens" it. That is, it removes all of the green
pixels by changing them all to black or all to white (or all to another uniform
non-green color). Through this process, you will have segmented
the figure (or object) away from the background (the green screen).
(The greenscreen images are located in
/cs/cs153/Images/greenscreen2011.)
Put the result into your write-up.
- Finding spam: Last, but not least,
write a script/function/program
that identifies the spam in an image. To show this, your program should
draw a plus at the center of the word "SPAM" --
in addition, your program should print out the location
of that point to the console.
For this problem, you may
assume the image contains a can of spam, in a roughly frontal view.
Thus, your algorithm can choose the "most spammy" spot (even if there
is no spam).
Gather at least two images in which your algorithm worked, and two
in which it did not work to post to your write-up.
Also, show the results on two images (of your choice) where there
was no spam to be found, i.e., what does your spam-finder find spammy about
those images?
One note: your program should not simply guess where
the spam is! It should run a deterministic algorithm that uses the
values of the input image's pixels to locate the spam.
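The per-pixel steps in the list above can be sketched in plain Python. This is only one possible approach under stated assumptions, not a reference solution. An image is represented here as a list of rows of (R, G, B) tuples so that the sketch does not depend on any particular library. All function names and thresholds are hypothetical. The spam-finder assumes that the "SPAM" lettering is yellow and simply reports the centroid of the yellow-looking pixels.

```python
import colorsys


def invert(rgb):
    """Map each channel value x to 255 - x."""
    return tuple(255 - c for c in rgb)


def hsv255(rgb):
    """Convert an (R, G, B) pixel to (H, S, V), each scaled to 0..255.

    Hue 0 is red here, which is the convention colorsys uses.
    """
    h, s, v = colorsys.rgb_to_hsv(*(c / 255.0 for c in rgb))
    return tuple(int(round(x * 255)) for x in (h, s, v))


def hue_mask(lo, hi, s_min=80, v_min=60):
    """Build a pixel test for vivid pixels whose hue lies in [lo, hi].

    All thresholds are illustrative guesses to be tuned on real images.
    """
    def test(rgb):
        h, s, v = hsv255(rgb)
        return lo <= h <= hi and s >= s_min and v >= v_min
    return test


is_green = hue_mask(60, 110)   # pure green has hue 85 on this scale
is_yellow = hue_mask(30, 55)   # assumed color of the "SPAM" lettering


def greenscreen(image, fill=(0, 0, 0)):
    """Replace every green pixel with one uniform fill color."""
    return [[fill if is_green(p) else p for p in row] for row in image]


def find_spam(image):
    """Return the (x, y) centroid of yellow pixels, or the image center if none."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, p in enumerate(row):
            if is_yellow(p):
                xs.append(x)
                ys.append(y)
    if not xs:
        return (len(image[0]) // 2, len(image) // 2)
    return (sum(xs) // len(xs), sum(ys) // len(ys))


def draw_plus(image, center, arm=1, color=(255, 0, 0)):
    """Return a copy of the image with a plus sign drawn at `center`."""
    cx, cy = center
    out = [list(row) for row in image]
    for d in range(-arm, arm + 1):
        if 0 <= cx + d < len(out[0]):
            out[cy][cx + d] = color
        if 0 <= cy + d < len(out):
            out[cy + d][cx] = color
    return out
```

A greyscale "Hue" image would put hsv255(p)[0] in all three bands of each pixel p, and the example composite from the list would be (255 - r, hsv255(p)[1], b). Printing the result of find_spam covers the console-output requirement. A centroid of one color is easy to fool, which makes it a convenient source of the failure cases the write-up asks for.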
Deliverables
- Write-up
By Sunday night (at 11:59 pm), you should create a web or wiki page with at least the
various images
mentioned in the description above.
In addition, include a write-up of a paragraph or so explaining your
spam-finding algorithm and green-screening algorithm. Note when your approaches
work well -- and when they don't.
Also, include a suggestion about how your spam-finder could be improved.
Link your write-up to the
CS 182 wiki page - let me know if you run into permission problems.
Please do include a zip file of your source files/code along with
your write-up. (Only the source - avoid libraries or executables.)
You may use the CS wiki to post your projects, but other pages are all right, too.
Here are some sites that worked well (and whose projects worked well) in the past:
You may certainly refer to earlier projects, both in robotics and in vision, for inspiration and ideas.
In fact, if you choose to use a robot that students have used before, I strongly encourage you to
read the write-ups of prior years' teams: you will definitely want to build on their progress.
However, the vision code that you submit for particular problems, such as spam-finding, green-screening,
and the like, should be entirely your own. Here is the page linking to all prior years' projects:
student projects page.
Possible extensions
The creative-extension part of this assignment can be in many forms -- feel free to
check with me if you're not sure.
- You might show visualizations of some of the intermediate processing that
led to your algorithm's final result.
- Add the capability of inserting the pixels from
another image as background for your green-screened image.
Put the composited image
(with the background taken from another picture) into your write-up.
Other variations are welcome.
- Create and explain a method that detects spam, alongside
the location mechanism described above. That is, your system will
first determine whether there is a can of spam in the image, and, only
if it decides there is one, will it mark its center.
- Determine the bounding box for the can of spam, drawing a box rather than
a plus sign at the center. You may decide how much of the spam you want to
enclose in the box -- but you need to decide that before declaring success!
- Choose a robot platform and get it moving and sensing. Include a short video
of the robot moving around and evidence of the sensor data you can access, e.g.,
a screenshot, an image, or an attached file.
- If it was tricky to do, create a wiki page describing how you got OpenCV working --
especially if you got it working with Python --
on your system! Feel free to link to other instructions and include
notes on any changes you had to make for your system.
This is likely to be a helpful resource to other folks!
- Other ideas welcome :-)
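Two of the extensions above, background replacement and the bounding box, can be sketched as follows. This is a minimal illustration, not a prescribed method. Images are again lists of rows of (R, G, B) tuples, both images passed to composite are assumed to be the same size, and the mostly_green test is a hypothetical stand-in for whatever pixel classifier you actually build.

```python
def composite(foreground, background, is_background):
    """Take pixels from `background` wherever `is_background` flags the foreground."""
    return [
        [bg if is_background(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


def bounding_box(image, is_target):
    """Return (left, top, right, bottom) around matching pixels, or None."""
    hits = [(x, y) for y, row in enumerate(image)
            for x, p in enumerate(row) if is_target(p)]
    if not hits:
        return None  # doubles as a crude "nothing detected" answer
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))


def mostly_green(rgb):
    """Hypothetical stand-in test: green clearly dominates red and blue."""
    r, g, b = rgb
    return g > 100 and g > 1.5 * r and g > 1.5 * b
```

Returning None when no pixel matches gives a very simple form of detection. A more convincing detector would also require a minimum number of matching pixels before declaring that spam is present.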