What could be the reason my model does not give the accurate results I planned?
Hi everyone. First of all, thank you for your time. This is my first question on the MATLAB platform, so please excuse any mistakes. The necessary files are in the attached zip folder. To view images in PGM format, you can use the GIMP application.
I am planning to design an MLP image processing model without using any toolbox.
I plan to train my model by reading, one by one, the 32x30 images in the CMU face images dataset that I obtained from the internet, and then continue with the testing process.
(I use the imread function provided by MATLAB.)
INPUT is a cell vector in which each element contains an image matrix, so each element represents one image. While processing the samples one by one, I reshape each image into a column vector.
Here is the code for file operations and image reading:
clc;clear;close;
%*****************Reading Images**************
myFolder = ''; %% Images folder path
if ~isfolder(myFolder) %% Checking if the folder doesn't exist
    errorMessage = sprintf('Error: The following folder does not exist:\n%s\nPlease specify a new folder.', myFolder);
    uiwait(warndlg(errorMessage));
    myFolder = uigetdir(); % Ask for a new one.
    if myFolder == 0
         % User clicked Cancel
         return;
    end
end
filePattern = fullfile(myFolder, '*.pgm');
theFiles = dir(filePattern);
% Define the number of image files in the folder
numImages = length(theFiles);
% Initialize a cell array to store the images
INPUT = cell(numImages, 1);
for k = 1 : length(theFiles)
    baseFileName = theFiles(k).name;
    fullFileName = fullfile(theFiles(k).folder, baseFileName);
    INPUT{k} = imread(fullFileName);
    imshow(INPUT{k});  % Display image.
    drawnow; % Force display to update immediately.
end
%***********************************************
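As a minimal sketch of the column-vector step described above, the loop below flattens each image into one column of a single matrix. The images are synthetic stand-ins so the snippet runs without the dataset; the 30-by-32 array shape, the name Xall, and the scaling of intensities to [0,1] are assumptions, not part of the original script.

```matlab
% Synthetic stand-ins for the matrices returned by imread (assumption)
numImages = 4;
INPUT = cell(numImages, 1);
for k = 1:numImages
    INPUT{k} = uint8(randi([0 255], 30, 32));
end

% Flatten each image into one column of a 960-by-numImages matrix
Xall = zeros(numel(INPUT{1}), numImages);
for k = 1:numImages
    % Convert to double first, then scale intensities to [0,1] (assumption)
    Xall(:, k) = double(INPUT{k}(:)) / 255;
end

disp(size(Xall));
```

Column k of Xall then plays the role of X for sample k in the training loop.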
The model is user-interactive. First, I get the following values from the user:
Hidden layers:
Neurons in each hidden layer: (the number of neurons is the same for each hidden layer)
Max iteration:
The other definitions are shown in the code below. I define NumberOfInput as 960, which comes from 32x30, because the weight matrix between the input layer and the first hidden layer has to be sized that way.
Weight matrix values are assigned randomly.
My model should return 0 if the person is not wearing sunglasses and 1 if the person is, with high accuracy. So the output is a scalar, and there is a single output.
I studied MLP models and found that choosing good parameter values is a hard problem in machine learning; it depends on the application and on testing. So I tried ETA with various values: 0.01, 0.02, 0.05, 0.2, 0.5, ...
In this type of model, the inputs to the neurons are called netH and the outputs of these neurons are called H, except at the last connections, where they become netO and O.
The sigma cell array is also sized here so that it can be used after the forward pass, when the backward pass starts.
My model is an example of supervised learning, so it needs desired outputs for the training images, stored in DESIRED as shown below. The images inside Model_Training are ordered open --> sunglasses --> open --> sunglasses ..., so I defined DESIRED in this order.
Here is the code:
%*******************VARIABLES*******************
NumberOfPatterns=numImages;
NumberOfInput=960;
NumberOfOutput=1;
LearningRate_ETA=0.5;
while true
    NofLayers=input("Layer number: "); % Hidden layer number
    Nofneurons=input("Neuron number: "); % Neuron number of each hidden layer
    Max_iteration=input("Max iteration number: "); % Max iteration
    if(Nofneurons<=0 || NofLayers<=0 || Max_iteration<=0)
        fprintf("These values can't be accepted !");
        fprintf("\nPlease enter again");
    else
        break;
    end
end
W = cell(NofLayers+1,1);
H=cell(NofLayers,1);
sigma=cell(NofLayers+1,1);
%***********************************************
% Random values are assigned to Weights
for i=1:NofLayers+1
    if i==1
        W{i}=rand(NumberOfInput,Nofneurons);
    elseif i==NofLayers+1
        W{i}=rand(Nofneurons,NumberOfOutput);
    else
        W{i}=rand(Nofneurons,Nofneurons);
    end
end
%***********************************************
DESIRED=zeros(NumberOfPatterns,1);
%****************Adjusting Desired Results******
%Training images are located in order. Ex: 
%A_open.pgm
%A_sunglasses.pgm
for i=1:NumberOfPatterns
    if(mod(i,2)==1)
        DESIRED(i)=0;
    else
        DESIRED(i)=1;
    end
end
%************************************************
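Because the alternating labels depend on the order in which dir returns the files, one alternative is to derive each label from the file name itself. The sketch below assumes names that follow the A_open.pgm / A_sunglasses.pgm pattern mentioned in the comments; the name list is hypothetical.

```matlab
% Hypothetical file names following the pattern in the comments above
names = {'A_open.pgm'; 'A_sunglasses.pgm'; 'B_open.pgm'; 'B_sunglasses.pgm'};

% 1 when the name contains 'sunglasses', 0 otherwise
DESIRED = double(contains(names, 'sunglasses'));

disp(DESIRED');
```

In the script, names could be taken from the directory listing with {theFiles.name}' so that the labels stay correct even if a file is missing or the sort order changes.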
Now the processing starts. Note that I use the sigmoid function as the activation function.
To keep the post short, I will share the code directly here.
%***********************Processing***************
for a=1:Max_iteration
    totalerr=0;
    for i = 1:NumberOfPatterns
        ImageVector = reshape(INPUT{i}, [], 1);
        X = double(ImageVector);
        for lay=1:NofLayers+1
            if(lay==1) %First connections
                netH=W{lay}'*X;
                H{lay}=sigmoid(netH);%%%
            elseif (lay==NofLayers+1) %Last connections
                netO=W{lay}'*H{lay-1};
                O=sigmoid(netO);
            else % between connections layers
                netH=W{lay}'*H{lay-1}; %
                H{lay}=sigmoid(netH);%
            end
        end
        err=DESIRED(i)-O;
        for j=1:NumberOfOutput
                sigma{NofLayers+1}=err*O(j)*(1-O(j));  %Last sigma value
        end
        for l=1:NofLayers
            for k=1:Nofneurons
                [rowsigma colsigma]=size(sigma{NofLayers-l+2});
                [rowW colsW]=size(W{NofLayers-l+2}(k,:));
                % This condition ensures the dimensions agree for matrix multiplication
                if(colsigma==rowW)
                    sigma{NofLayers-l+1}=sigma{NofLayers-l+2}*W{NofLayers-l+2}(k,:) *H{NofLayers+1-l}(k)*(1-H{NofLayers+1-l}(k));
                else
                    sigma{NofLayers-l+1}=sigma{NofLayers-l+2}*W{NofLayers-l+2}(k,:)'*H{NofLayers+1-l}(k)*(1-H{NofLayers+1-l}(k));
                end
            end
        end
        for z=1:NofLayers+1
            %Weights are updated at this part
            if((NofLayers+2-z)==1)
                W{NofLayers+2-z}=W{NofLayers+2-z}+LearningRate_ETA*X*sigma{NofLayers+2-z};
            else
                W{NofLayers+2-z}=W{NofLayers+2-z}+LearningRate_ETA*H{NofLayers+1-z}*sigma{NofLayers+2-z};
            end
        end
        totalerr=totalerr+0.5*err^2;
    end
    cost(a)=totalerr;
end
plot(cost);
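For comparison, the same per-sample backpropagation can be sketched in vectorized form. This is not the code from the post: it runs on a small synthetic problem, and the bias terms b, the 0.5 * randn initialization, the layer sizes, and the names A and delta are all assumptions chosen for illustration. The main structural difference is that each layer's delta is kept as a full column vector (one entry per neuron) instead of being reassigned inside a per-neuron loop, where only the value from the last neuron survives.

```matlab
rng(0);                                     % reproducible random numbers
sigmoid = @(x) 1 ./ (1 + exp(-x));

nIn = 4; nHid = 5; nLayers = 2; nOut = 1;   % small hypothetical sizes
eta = 0.5; maxIter = 3000;

% Toy inputs in [0,1]; the label is 1 when the first feature exceeds 0.5
X = rand(nIn, 40);
D = double(X(1, :) > 0.5);

% Small zero-mean weights and zero biases for every connection
sizes = [nIn, repmat(nHid, 1, nLayers), nOut];
W = cell(nLayers+1, 1); b = cell(nLayers+1, 1);
for l = 1:nLayers+1
    W{l} = 0.5 * randn(sizes(l), sizes(l+1));
    b{l} = zeros(sizes(l+1), 1);
end

cost = zeros(maxIter, 1);
A = cell(nLayers+2, 1);                     % A{1} is the input, A{end} the output
delta = cell(nLayers+1, 1);
for it = 1:maxIter
    for i = 1:size(X, 2)
        % Forward pass
        A{1} = X(:, i);
        for l = 1:nLayers+1
            A{l+1} = sigmoid(W{l}' * A{l} + b{l});
        end
        err = D(i) - A{end};
        % Backward pass: one column vector of deltas per layer
        delta{nLayers+1} = err .* A{end} .* (1 - A{end});
        for l = nLayers:-1:1
            delta{l} = (W{l+1} * delta{l+1}) .* A{l+1} .* (1 - A{l+1});
        end
        % Weight and bias updates (outer product of input and delta)
        for l = 1:nLayers+1
            W{l} = W{l} + eta * A{l} * delta{l}';
            b{l} = b{l} + eta * delta{l};
        end
        cost(it) = cost(it) + 0.5 * sum(err.^2);
    end
end
fprintf('cost: first epoch %.3f, last epoch %.3f\n', cost(1), cost(end));
```

Here delta{l} has the same size as the output of layer l, so A{l} * delta{l}' always matches the size of W{l} and no dimension check is needed before the update.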
%%*****************Test Case********************
%Getting test image address from user
fileFilter = '*.pgm';
[filename, pathname] = uigetfile(fileFilter, 'Select a PGM file', '');
if isequal(filename, 0)
    disp('Program has stopped');
    return; % No file was selected, so skip the test step
else
    fullFilePath = fullfile(pathname, filename);
end
%**************Test Sample Operations*******
testSample=imread(fullFilePath); %
testSample=reshape(testSample,[],1);
X=double(testSample);
for lay=1:NofLayers+1
    if(lay==1) %First connections
        netH=W{lay}'*X;
        H{lay}=sigmoid(netH);
    elseif (lay==NofLayers+1) %Last connections
        netO=W{lay}'*H{lay-1};
        Out=round(sigmoid(netO));
    else % between connections layers
        netH=W{lay}'*H{lay-1};
        H{lay}=sigmoid(netH);
    end
end
fprintf('Result is: %d\n', Out);
%**********************Helper Functions*********
%Sigmoid Activation Function
function y = sigmoid(x)
    y = 1 ./ (1 + exp(-x));
end
You can run and test it with the files provided in the zip file. As far as I know, this kind of model needs to be tried with a high number of layers and neurons. I tried 4-20, 5-30, 5-35, ... It generally returns 1, and this is the problem I am struggling with.
If you can give any comments or feedback, I would appreciate it. Again, thank you for your time.
Answers (1)
Shivansh on 29 Jun 2024
Hi Omar!
It seems that your model predicts the label "1" most of the time and might be overfitted to it. 
The implementation of the MLP looks fine and should be able to provide better results for this problem. 
There are a few areas in your code where changes might improve the performance of your model. 
The first issue could be the distribution of classes in the training and testing datasets. Try oversampling or undersampling techniques if the classes are unbalanced. 
You can draw the initial weights from a normal distribution with smaller values to prevent possible saturation of the sigmoid function. To do this, modify the weight initialization to use the "randn" function and multiply the result by a constant of about 0.05.
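The saturation effect can be illustrated with a small sketch on a synthetic pixel column. With uniform weights in [0,1] and raw intensities in [0,255], the weighted sum over 960 inputs is so large that the sigmoid returns exactly 1, and the factor H*(1-H) used in backpropagation becomes 0. The second case uses the suggested 0.05 * randn initialization; dividing the pixels by 255 is an additional assumption that is not part of the suggestion above.

```matlab
rng(0);
sigmoid = @(x) 1 ./ (1 + exp(-x));
X = 255 * rand(960, 1);                 % synthetic pixel column in [0,255]

% Initialization as in the question: uniform weights, raw pixel values
Hsat = sigmoid(rand(960, 20)' * X);

% Small zero-mean weights; the pixel scaling is an added assumption
Hok = sigmoid((0.05 * randn(960, 20))' * (X / 255));

fprintf('Units stuck at 1 with rand:       %d of 20\n', nnz(Hsat == 1));
fprintf('Units stuck at 0 or 1 with randn: %d of 20\n', nnz(Hok == 0 | Hok == 1));
```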
The current learning rate of 0.5 might be a little high for your problem. Try reducing the learning rate and analyze the impact on the model. 
I also didn't see any bias terms in the provided code. Including bias terms can impact the model significantly. 
The "sigmoid" function might be fine for the model but you might want to experiment with different activation functions like "relu" or "tanh". 
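If you experiment with other activations, the derivative used in the backward pass has to change along with the activation itself. A minimal sketch of the three functions and their derivatives is given below; the handle names are hypothetical, and the sigmoid and tanh derivatives are written in terms of the activation output, as in the sigma computation of the question.

```matlab
sigmoid  = @(x) 1 ./ (1 + exp(-x));
dsigmoid = @(y) y .* (1 - y);       % derivative in terms of the output y = sigmoid(x)
dtanh    = @(y) 1 - y.^2;           % tanh is built in; y = tanh(x)
relu     = @(x) max(x, 0);
drelu    = @(x) double(x > 0);      % derivative in terms of the input x

x = [-2 0 3];
disp(relu(x));
disp(dtanh(tanh(x)));
disp(dsigmoid(sigmoid(x)));
```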
You can try these changes and analyze the impact on the model to find the issue. 
You can refer to the following documentation for more information on "randn":
I hope it helps in resolving the issue. 
