# Image Classification: Color Histogram & KNN Classifier

Dennis Tran on 3 May 2015
Commented: Nalini Vishnoi on 5 May 2015
Hello,
I want to classify an image using color histograms and a KNN classifier. I have a dataset of 100 images for each class (butterfly, dog, cat) in a folder. My understanding of the problem is as follows:
1) Read in the images and create a color histogram (RGB) for each one.
2) Find the k-means value for R, G, and B for each image.
3) Cluster the k-means points separately for each class and find the centroid. (For the butterfly class, each image gives me a k-means value for R, G, and B. I plot all the R k-means values and find the centroid, then do the same with G and B.)
4) Read in the test image, create a color histogram, find the k-means value for R, G, and B, then use the Euclidean distance for each value to find the nearest cluster for R, G, and B.
Is this how it is supposed to be done or am I not understanding this correctly?

Nalini Vishnoi on 4 May 2015
Hi Dennis,
The steps in your algorithm seem correct. However, a lot of information is lost when you do k-means clustering, and for practical purposes color histograms may not be strong enough to discriminate between classes (this depends heavily on your data set). It might be useful to add further features, such as texture and shape. Combined, these features would capture information unique to each class that needs to be distinguished from the others.
##### 2 Comments
Nalini Vishnoi on 5 May 2015
If you are just finding the mean value, the algorithm is NN/1-NN (nearest neighbor) rather than K-NN. The mean and the centroid should be the same. You may find this link useful.
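To illustrate the point about means, here is a minimal sketch of what the scheme amounts to once every class is reduced to a single mean vector: the query is assigned to the class whose mean is closest in Euclidean distance. It is plain Python for illustration only, not MATLAB code from this thread; the function names `class_centroids` and `nearest_centroid` and the toy mean-RGB values are assumptions made up for the example.

```python
import math


def class_centroids(features, labels):
    """Average the feature vectors of each class into one centroid per class."""
    sums, counts = {}, {}
    for row, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], row)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def nearest_centroid(query, centroids):
    """Return the label whose centroid has the smallest Euclidean distance to the query."""
    return min(centroids, key=lambda label: math.dist(query, centroids[label]))


# Hypothetical per-image mean (R, G, B) values, two images per class.
features = [[200, 120, 30], [220, 140, 50], [40, 40, 40], [60, 50, 50]]
labels = ["butterfly", "butterfly", "cat", "cat"]

centroids = class_centroids(features, labels)
predicted = nearest_centroid([190, 110, 60], centroids)
print(predicted)
```

Because each class is represented by one point, there is only one candidate per class to compare against, which is why no neighbor count K enters the picture here.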

Image Analyst on 5 May 2015
There is no way that will correctly classify the animals UNLESS all your cats are the "same" color, all the dogs are the "same" color, and all the butterflies are the same color, and there is little other clutter in the background. If you assumed all your cats were black, and all your butterflies were orange and black monarchs, and you presented an orange/ginger tabby cat, your algorithm might say the cat was a butterfly.
Dennis Tran on 5 May 2015
Edited: Dennis Tran on 5 May 2015
Yes, I understand that the color histogram isn't the only feature I should have. I understand that, let's say, a checkerboard will compare exactly to a board that is half black and half red, but I just want to be able to use this feature with KNN as a start. I will be using a training data set of 80 images and a test data set of 20 images for each category. I will hopefully choose the training set to have the edge boundaries.
So far I have read in my images and computed an RGB histogram with 8 bins for each channel, giving me 8 × 8 × 8 = 512 bins. This gives me an N×512 matrix, where N is the number of images read.
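A minimal sketch of how 8 bins per channel can lead to 512 bins, assuming the histogram is a joint one (each pixel's three quantized channel values are combined into a single bin index). This is plain Python for illustration, not the MATLAB code used in the post; `rgb_histogram` and the toy pixel list are hypothetical, and pixel values are assumed to be 8-bit (0 to 255).

```python
def rgb_histogram(pixels, bins_per_channel=8):
    """Return a normalized joint RGB histogram as a flat list.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    The result has bins_per_channel ** 3 entries (512 for 8 bins per channel).
    """
    width = 256 / bins_per_channel  # intensity levels covered by one bin

    def bin_of(value):
        # Quantize one channel value and clamp to the last bin.
        return min(int(value / width), bins_per_channel - 1)

    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Combine the three per-channel bin indices into one flat index.
        index = (bin_of(r) * bins_per_channel + bin_of(g)) * bins_per_channel + bin_of(b)
        hist[index] += 1
        count += 1
    # Normalize so that images of different sizes are comparable.
    return [h / count for h in hist] if count else hist


# Toy "image": two pure-red pixels and two pure-blue pixels.
toy_pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)]
toy_hist = rgb_histogram(toy_pixels)
print(len(toy_hist))
```

Stacking one such row per image gives the N×512 matrix described above.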
Now I am stuck on how to use the KNN classify function in MATLAB.
I know I need to compare the query image to my dataset using the Euclidean distance, sort the results, and use the smallest distance as my answer, but how do I grab the category name for the data point with the smallest distance? I haven't stored the class name with the histogram data.
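One common way to recover the category is to keep a list of labels in the same order as the rows of the feature matrix, so that the index of the nearest row also indexes its label. The sketch below shows that idea with a majority vote over the K nearest rows. It is plain Python for illustration and does not use or describe MATLAB's own classification functions; `knn_classify` and the 3-bin toy histograms (standing in for the 512-bin rows) are assumptions made up for the example.

```python
import math
from collections import Counter


def knn_classify(query, train_features, train_labels, k=1):
    """Classify a query vector by majority vote among its k nearest training rows."""
    # Pair each Euclidean distance with the label stored at the same row index.
    distances = [
        (math.dist(query, row), label)
        for row, label in zip(train_features, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy histograms: one row per training image.
train_features = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.0],   # butterfly
    [0.1, 0.8, 0.1], [0.0, 0.9, 0.1],   # dog
    [0.1, 0.1, 0.8], [0.0, 0.2, 0.8],   # cat
]
# Labels kept in the same order as the rows above.
train_labels = ["butterfly", "butterfly", "dog", "dog", "cat", "cat"]

query = [0.85, 0.15, 0.0]
print(knn_classify(query, train_features, train_labels, k=1))
print(knn_classify(query, train_features, train_labels, k=3))
```

With `k=1` this is the "smallest distance" rule from the post; a larger `k` lets several nearby training images vote.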
