Harvey Mudd College
Computer Science 153
Problem Set 4
Due Friday, Nov 11 11:59pm

Back to Problem Set 4 top-level page

Back to Section 1: Hough Transform

Section 2: Sketch Recognition through Matching

In this section you will write a program that parses simple sketches using relational model matching. Your tasks in this section will be to define the appropriate models for your shapes, detect low-level symbols and relations in the image, and then match the models to the image using either an interpretation tree or discrete relaxation.

Part 1: Modeling your Shapes

Your task is to recognize the shapes in the family tree diagrams you drew in class. That is, you need to be able to represent and detect squares, triangles, arrows and lines. We will do this using relational models.

Your first job is to write a relational model for each of the shapes listed above. Note that your low level shapes are always lines, but that you may need to break those lines down into their lower level components to create accurate shape models (hint: think about the difference between an arrow and a (possibly non-equilateral) triangle).

Do your best to model your shapes before you start trying to recognize diagrams, but you may find you need to change your models based on preliminary recognition results. That's fine. Include in your writeup a a textual description of your final shape models as well as a brief (1 paragraph) description of why you chose to model your shapes as you did.


Results:


Part 2: Detecting parts and relations

Once you have defined your models, you need to detect the low-level shapes and relations in your sketches. Luckily, we are dealing with strokes, so a lot of the ambiguity of edge-finding is removed.

Throughout this whole section of the problem set we are going to use a stroke-based, rather than a pixel-based, representation of images. The file format we will use is .drs and stroke files can be found in:

/cs/cs153/ImagesF05/PS4/drs
Again, you will find simple shapes as well as family trees.

To load stroke-based images into matlab, you have to use this loadStrokes function. This will produce a cell array of strokes, where each stroke is an Nx3 matrix, where N is the number of points in the stroke, and the columns represent the x, y and timestamp for each point in the stroke. (For everything in the required part of this assignment you can ignore the timestamp information). You can view these strokes using this drawStrokes function. It will draw each stroke in a different color to help you see where the boundaries of each stroke are.

Note that some of the data is particularly messy (i.e. people didn't follow the instructions about drawing each line with a single stroke. You should run your algorithm on the cleaner images first. You are not required to handle images that violate our drawing constraints.

Once you have strokes loaded, your task is to detect the lines and the relations between these lines as you defined them above. You should first write a matlab function that turns a stroke into a line. Luckily, since all strokes are (supposedly) lines, turning strokes into lines is simple! You can simply take the first and last point of each stroke and call it a line. You may also choose to do something more complicated if you wish, if you think that will give you better performance (but I would start with this simple approach). I recommend representing a line as a 2x2 matrix, where each row holds one of the endpoints of the line. But you do not have to do it this way.

Next, you should write one relation detecting function for each of the relations (constraints) you specified above. These functions should be boolean, and should take as input one or more of the parts that you specified in your model. For example, you might have a relation parallel(line1, line2). The matlab function to detect this relation would look something like:

linesAreParallel = parallel(line1, line2);
where lines are represented however you choose to represent them above.

Remember that your relations need not be between only lines. Some may be between points, for example.

Write and test all your relation detecting functions before going on. Include the code for detecting these relations below.

Model Matching

In this part you will write a function that matches a set of parts to a given model either by constructing an interpretation tree or by discrete relaxation. Your function should return the best labeling of the given parts, or an empty list if there is no consistent labeling of the given parts using the given model.

So far we haven't talked about how you will represent the model in matlab. That is up to you, but one suggestion is to use strings (or numbers or whatever) to represent each label and each type of relation. Your relations, then, will somehow connect labels through relations, perhaps by putting them into a list. For example, this might be your model for an equals sign:

labels = {'l1', 'l2'};
relations = {{'parallel' 'l1' 'l2'} {'equal-length' 'l1' 'l2'}};
Note that I have kept the labels and the relations separate. This is fine. Note also that I am using cell arrays because strings are variable length.

This is far from the only way you can represent your models. In fact, I can think of a number of improvements that can be made for efficiency. For example, you might want to index relations by label so that you can easily find all of the relations that pertain to a particular label.

To test your algorithm, try running it on the files in the simpleShapes directory. Try matching not only the correct model, but also incorrect models (on subsets of the strokes, if there are too many strokes in the image). E.g. try to match an arrow model to a picture of a triangle.

You should also run it on shapes in a few of the family trees. This presents a difficulty because you have not implemented any segmentation. What you should do is simply pick out some strokes from these images that either do or do not form shapes and run your matching function on these hand-selected strokes. For example, you might know that the 4th through 7th stroke in on of the images is a square. You could select these strokes (in Matlab) and then try matching them to a square model. In addition, you might select the 3rd through 6th stroke and try matching those to a square template (it should fail).

Displaying Your Results

The easiest way to show that you have labeled the parts correctly is to visualize your recognized shapes in place of your strokes. You should use the Matlab functions line, fill and drawArrow to display the shapes you recognize. You will probably want to show the fit on top of a figure that draws the strokes (you can use drawStrokes for this, but you will probably want to modify it so that it doesn't alternate colors for the different strokes).

Show results on several images, both the simple shapes and the shapes extracted (by hand) from the family trees. Show a few successful runs and some not-so successful ones. Also include a description of what worked well and what caused problems for your algorithm. Finally, you should also include links to all of the code you used in this section.


Results:


Optional (but HIGHLY ENCOURAGED) Extension: Segmentation

Address the problem of segmentation. Your task in this part is to write a matlab program that simultaneously segments and recognizes the strokes in the image. Here are some notes that will help you with this task:

Show results on a number of family tree diagrams.


Possible Extensions

Remember that you need to pursue only one extension in one part of the assignment, and it can be anything you'd like to investigate. These are only suggestions.

If you do an extension in this section, I really want you to do the one proposed above, but if you really don't want to do that one, then here are a couple more suggestions.

Return to PS4 Overview