PCA on an Image Set for Face Detection using Neural Networks

Hello, first of all let me introduce the problem:
As a university course project I'm trying to write a face detector using neural networks. The data come from the CBCL at MIT and consist of 2 separate datasets, one for training and one for testing. Each of them is composed of a face dataset and a non-face dataset. Each image is a PGM (i.e. grayscale) image of 19x19 pixels. To create the training set, each image from the training database (approximately 2500 faces and 4500 non-faces) is transformed into a 1x361 array, equalized, masked with a mask that removes 40 background pixels, and then added as a row of a 7000x322 matrix (i.e. the 361 original pixels - the 40 background pixels + 1 column for the class: 1 face, 0 non-face). This matrix can be seen as a dataset with 7000 observations of 321 attributes each. To reduce the dimensionality of the problem we thought of applying PCA and reducing the number of attributes to, say, 100-120, which account for 97+ % of the variance.
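To make the "97+ % of the variance" criterion concrete, here is a minimal sketch of how the number of components could be picked from the cumulative explained variance. It assumes a small synthetic matrix in place of the real image data and uses svd on the centered data as a stand-in for princomp, so it runs without the Statistics Toolbox; the names explained and k are made up for illustration.

```matlab
% Synthetic stand-in for the attribute matrix (assumption: 80 observations, 12 attributes).
data = sin((1:80)' * (1:12));

% Center each column, as princomp does internally.
mmat = data - repmat(mean(data, 1), size(data, 1), 1);

% Singular values of the centered data give the component variances
% (analogous to the third output of princomp).
sv = svd(mmat);
l = sv .^ 2 / (size(data, 1) - 1);

% Cumulative fraction of the total variance explained by the first components.
explained = cumsum(l) / sum(l);

% Smallest number of components reaching the 97 % threshold.
k = find(explained >= 0.97, 1);
```

With the real data, k would be the number of columns kept as network inputs.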
Ok, now the questions:
I) In your opinion, which neural network should I use? And how should I set the parameters (training function, performance function, etc.)? We thought of using an RBF network or an MLP with 1 hidden layer and an output layer of 1 neuron, both with a tansig function.
II) To calculate the PCA I use the command [c, s, l] = princomp(data), which automatically centers the dataset by subtracting from every column its mean value. Now, to train the NN, should I use the first 100 components of the s matrix or the first 100 components of the (data*c) matrix? Of course if I do:
  • mmat = data - repmat(trainmean, 6977, 1); %trainmean is the mean of every column
  • mmat*c;
I re-obtain the s matrix.
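The identity above can be checked numerically with a minimal sketch. It assumes a small synthetic matrix in place of the real training data and uses svd on the centered data as a stand-in for princomp, so it runs without the Statistics Toolbox; k, reduced and maxdiff are hypothetical names introduced for illustration.

```matlab
% Synthetic stand-in for the training matrix (assumption: 50 observations, 8 attributes).
data = sin((1:50)' * (1:8));

% Center each column on the training mean.
trainmean = mean(data, 1);
mmat = data - repmat(trainmean, size(data, 1), 1);

% PCA via economy-size SVD of the centered data (stand-in for princomp).
[U, S, c] = svd(mmat, 0);   % columns of c are the principal directions
s = U * S;                  % scores, analogous to the second output of princomp

% Projecting the centered data onto c reproduces the scores.
maxdiff = max(max(abs(mmat * c - s)));

% Keep only the first k score columns as network inputs (k is a hypothetical choice).
k = 3;
reduced = s(:, 1:k);
```

Since mmat*c and s agree up to rounding, taking the first columns of either gives the same inputs, whereas data*c without centering differs by a constant offset in every column.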
III) When I test the trained NN I must not recalculate the PCA but I have to project the new data onto the PCA training space, right? But how should I do that?
  • Simply doing (testdata*c)?
  • Computing the column means of the test set and subtracting them from the test set? That is: testdata = testdata - repmat(testmean, 400, 1); testdata*c;
  • Subtracting the training set mean from the test set? testdata = testdata - repmat(trainmean, 400, 1); testdata*c;
  • If I have a single row, i.e. a single image to test, what should I do? For now I remove the trainingmean from the image vector.
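For reference, the variant that reuses the training set mean is the usual way to apply a PCA fitted on training data to new samples. A minimal sketch follows, assuming synthetic matrices in place of the real image sets and svd as a stand-in for princomp; k, testscores and xscore are hypothetical names.

```matlab
% Synthetic stand-ins (assumption): 60 training rows, 10 test rows, 6 attributes.
traindata = sin((1:60)' * (1:6));
testdata  = cos((1:10)' * (1:6));

% Fit the PCA on the training set only (svd as a stand-in for princomp).
trainmean = mean(traindata, 1);
[~, ~, c] = svd(traindata - repmat(trainmean, size(traindata, 1), 1), 0);

% Project the test set: subtract the TRAINING mean, then multiply by c.
k = 4;   % hypothetical number of components kept
testscores = (testdata - repmat(trainmean, size(testdata, 1), 1)) * c(:, 1:k);

% A single image (one row) is handled the same way.
x = testdata(1, :);
xscore = (x - trainmean) * c(:, 1:k);
```

The single-row result matches the corresponding row of the batch projection, so one image and a whole test set can share the same code path.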
Thank you for reading this. I hope someone can help.
Regards, Marco.

Answers (0)

Asked on 21 Oct 2012
