One way to detect objects in an image is to threshold the pixels in the image based on their intensity of color values. Another approach is to group or cluster the pixels based on some affinity measure between them. In this part of the assignment you will explore segmentation via clustering and compare it to the threshold segmentation you did in the last section.
function [labels,means] = my_kmeans(data,k)where data is an n-by-p matrix containing n data points each with p dimensions and k is the desired number of clusters. labels is an n-by-1 matrix containing the final point labels, and means is a k-by-p matrix of cluster centers.
The purpose of having you implement this algorithm is to give you a little more practice with the matrix manipulation functions in matlab. Matlab has been optimized for matrix manipulation, but loops are very slow. Try to avoid using loops in your function (you may need to use one or two, but minimize their use). Instead, use matrix manipulation. For example, if you wanted to select all of the points which have been labeled as "1", instead of writing a for loop you could write:
>> found = find(labels == 1); >> currdata = data(found,:);The find command returns the indices of the elements of labels that match the boolean expression (in this case == 1), and then data(found,:) select those rows whose indices are listed in the vector found.
Test your my_kmeans function on the data in pts.mat using k=2. Use the provided plot3dclusters function to visualize your results. You may also want to compare your results to the built-in kmeans matlab function.
Include a link to your code and an image of your results on the test data:
Now you will use your kmeans function to segment various images. Implement the function:
function segIm = kmeansSegment(im,k);where im is the image (or image feature matrix, see below), k is the number of desired segments. The output image should allow the user to visualize the segmented portions of the image by assigning a different color value to each label and coloring the points in the image accordingly (as is done on p. 316 in Forsyth & Ponce, only you should use colors).
Run your segmentation algorithm on a number of the provided images. Try a number of different values for k. For now, limit your tests to grayscale images. We will deal with color below.
Include a link to your code and illustrate the range of your results as compactly as possible. You probably want to choose two images, at least one of them "natural", for which you will show results. Include results for a few different values of k.
In addition, include a short (1 paragraph) discussion of how the choice for k affects the output. Suggest a few ways in which you might choose k automatically (you do not have to implement them). Speculate on how well you think they would work.
Run your cluster-based segmentation algorithm on either RGB or HSI images (whichever you feel will work better). Note that you will need to write code to "flatten" your images into n-by-3 matrices, where n is the number of points in the image. You will show your results below, after the next task.
When segmentation is run on the RGB and HSI images, essentially what is happening is that each color channel becomes a "feature" of the image point. There is no reason why we must limit the segmentation algorithm to 3 features. We can add additional features simply by extending each point's feature vector to produce an n-by-p matrix where p is the total number of features.
One problem we have seen is that so far our clustering algorithm doesn't include the point's relative positions, so it tends to group points together even when they are far apart spatially. Fix this problem by writing code that extends the feature vector passed into the clustering algorithm by adding a point's x and y positions.
Include examples of your color and color+position results below. You should also include a discussion (about 2 paragraphs) of how cluster-based segmentation compares to the thresholding you did in the last section. What are the advantages and disadvantages of each? How do the results compare? What effect does adding position information have on cluster-based segmentation?
Remember that you need to pursue only one extension in one part of the assignment, and it can be anything you'd like to investigate. These are only suggestions.