How can I increase classification accuracy after feature extraction?

I need a higher classification accuracy (CA) than the current value, which varies randomly between 60 and 70% because AEK_C1/C2 and VK_C1/C2 are assigned randomly.
clear
close all
clc
load TrainData.mat
load LabelTrain.mat
%dividing train group as C1 and C2
ndx1=(LabelTrain==1);
TrainDataC1=TrainData(:,:,ndx1);
ndx2=(LabelTrain==2);
TrainDataC2=TrainData(:,:,ndx2);
%Assigning half of C1 to AEK and the other half to VK, RANDOMLY
r1 = randperm(70)';  % random permutation of 1..70 (replaces the custom randomsayi helper)
AEK_C1 = TrainDataC1(:,:,r1(1:35,1));
VK_C1  = TrainDataC1(:,:,r1(36:70,1));
%Assigning half of C2 to AEK and the other half to VK, RANDOMLY
r2 = randperm(70)';
AEK_C2 = TrainDataC2(:,:,r2(1:35,1));
VK_C2  = TrainDataC2(:,:,r2(36:70,1));
AEK=cat(3,AEK_C1,AEK_C2);
AEK_Label= [ones(1,35) 2*ones(1,35)];
VK=cat(3,VK_C1,VK_C2);
VK_Label= [ones(1,35) 2*ones(1,35)];
%clearvars -except AEK VK AEK_Label VK_Label
clear LabelTrain
E = 2;  % use only channel 2 of the three EEG channels
FV_AEK = zeros(70,2);  % preallocate feature vectors: [std kurtosis] per trial
for i = 1:70
    Trial = AEK(:,E,i);
    FV_AEK(i,:) = [std(Trial) kurtosis(Trial)];
end
FV_VK = zeros(70,2);
for i = 1:70
    Trial = VK(:,E,i);
    FV_VK(i,:) = [std(Trial) kurtosis(Trial)];
end
%Validation process: sweep k and keep the accuracy on the VK half
% knnclassify/classperf were removed from recent MATLAB releases;
% fitcknn (Statistics and Machine Learning Toolbox) is the replacement
CA = zeros(1,35);
for k = 1:35
    Mdl = fitcknn(FV_AEK, AEK_Label(:), 'NumNeighbors', k, 'Distance', 'euclidean');
    class = predict(Mdl, FV_VK);             % predicted labels
    CA(k) = mean(class' == VK_Label) * 100;  % compare predicted with actual labels
end
[BestCA, k] = max(CA)  % best accuracy and the k that achieved it
FV_Train=[FV_VK;FV_AEK];
TrainLabel2= [VK_Label AEK_Label];
load TestData
FV_Test = zeros(140,2);
for i = 1:140
    Trial = TestData(:,E,i);
    FV_Test(i,:) = [std(Trial) kurtosis(Trial)];
end
Mdl = fitcknn(FV_Train, TrainLabel2(:), 'NumNeighbors', k, 'Distance', 'euclidean');
TestLabel_MG = predict(Mdl, FV_Test);  % predicted labels for the test set
% plot(FV(1:70,1),FV(1:70,2),'+')
% hold on
% plot(FV(71:140,1),FV(71:140,2),'ro')
This is the base code given to me. It uses feature extraction, and if I add another method like skewness or remove one of them, the output doesn't change. So I thought that after extraction I could use feature selection to get a better result, but either I couldn't apply it properly or it doesn't change anything (probably the former). If feature selection would improve the results, which data should I use in it (e.g. idx = fscmrmr(X,y): what should I pass as X and y)? I tried feature selection because we study it in school, but if there is a way without selection I'm open to that too. I have also attached the data and code so you can see them and get a better idea. Thank you.
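For concreteness, here is a minimal sketch of the fscmrmr call wired to the variables this script already builds (FV_Train and TrainLabel2 come from the code above; nKeep is an illustrative choice, not a rule):

```matlab
% Rank the extracted features by mRMR relevance.
% X = feature matrix (rows = observations), y = class labels.
X = FV_Train;              % 140x2: [std kurtosis] per trial
y = TrainLabel2(:);        % 140x1 class labels (1 or 2)
[idx, scores] = fscmrmr(X, y);   % idx(1) is the top-ranked feature

% Keep only the top-ranked feature(s) for both training and test sets.
nKeep = 1;                 % illustrative: with two features, keep the best one
FV_Train_sel = X(:, idx(1:nKeep));
FV_Test_sel  = FV_Test(:, idx(1:nKeep));
```

With only two extracted features there is very little for mRMR to choose between, which may be why selection changes nothing; extracting more candidate features first (skewness, band power, per-channel statistics) gives the ranking something to work with.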

Answers (1)

To get better classification accuracy you can pick better features to measure, use more training samples, or use a better classification algorithm. There are ways to determine which features are the most important, but that involves techniques such as partial least squares regression and can get fairly tricky.
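As a rough sketch of the first two suggestions: extract several statistics from all three channels rather than one, and estimate accuracy with cross-validated k-NN (the feature list and k = 5 here are illustrative assumptions, not a recommendation):

```matlab
% Extract a richer feature set (3 statistics x 3 channels) per trial,
% then estimate accuracy with 10-fold cross-validated k-NN.
load TrainData.mat
load LabelTrain.mat
n = size(TrainData, 3);
FV = zeros(n, 9);
for i = 1:n
    for ch = 1:3
        x = TrainData(:, ch, i);
        FV(i, (ch-1)*3 + (1:3)) = [std(x) kurtosis(x) skewness(x)];
    end
end
Mdl   = fitcknn(FV, LabelTrain(:), 'NumNeighbors', 5, 'Distance', 'euclidean');
cvMdl = crossval(Mdl, 'KFold', 10);     % 10-fold cross-validation
CA    = (1 - kfoldLoss(cvMdl)) * 100    % estimated accuracy in percent
```

Cross-validation also removes the run-to-run randomness that the single random AEK/VK split produces, so you can compare feature sets fairly.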

7 Comments

We have to use these sets and this given code, so the only option is picking the most relevant features to improve the accuracy. I think NCA and mRMR are better suited for this project, since the input is numerical and the output categorical. But unfortunately I couldn't implement them properly; this is very confusing to me.
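For what it's worth, a minimal NCA sketch on the variables from the posted code (FV_Train, TrainLabel2); the 'sgd' solver and the weight threshold are illustrative assumptions:

```matlab
% NCA learns one weight per feature; near-zero weights mark features
% that do not help discriminate the two classes.
nca  = fscnca(FV_Train, TrainLabel2(:), 'Solver', 'sgd');
w    = nca.FeatureWeights;        % one weight per column of FV_Train
keep = w > 0.02 * max(w);         % illustrative threshold, not a fixed rule
FV_sel = FV_Train(:, keep);       % reduced feature matrix
```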
How many features for each observation do you have? StDev, kurtosis, and what else?
Have you tried using the Classification Learner app on the App tab of the tool ribbon?
Our base TestData and TrainData are 768x3x140 double arrays. After I run the code, TrainData is split into two classes and then into four 768x3x35 arrays, which are used to create FV_Test, FV_Train, FV_AEK, FV_VK, etc. (140x2 and 70x2 matrices). I can also send a screenshot of my workspace if you want. StDev and kurtosis are used in this code, but even if I add skewness the result doesn't change; likewise if I take out, say, kurtosis.
I have used the Classification Learner app, but it shows 60% accuracy, the same as the code.
Someone suggested that after feature selection with mRMR I should apply PCA and classify afterwards. I will try that too, but I don't know much about PCA. Do you think it will help improve accuracy?
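PCA mainly decorrelates and re-orients the features, so with only two features it rarely changes k-NN accuracy, but a minimal sketch on the existing matrices would look like this (the 95% variance cutoff and k = 5 are illustrative):

```matlab
% Project training features onto principal components, apply the same
% centering and rotation to the test set, then classify in PCA space.
[coeff, scoreTrain, ~, ~, explained] = pca(FV_Train);
scoreTest = (FV_Test - mean(FV_Train)) * coeff;  % same transform as training
nPC  = find(cumsum(explained) >= 95, 1);         % keep 95% of the variance
Mdl  = fitcknn(scoreTrain(:,1:nPC), TrainLabel2(:), 'NumNeighbors', 5);
pred = predict(Mdl, scoreTest(:,1:nPC));
```

Note that the test set must be centered with the training mean and rotated with the training coefficients; fitting a separate PCA on the test set would leak information and misalign the spaces.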
What does each of the three dimensions represent?
It's like this 140 times, with 768x3 numerical data: (val(:,:,1) ... val(:,:,2) ... val(:,:,140)).
After feature extraction it looks like this (screenshot of FV_AEK attached).
What does each of the columns represent?
Sorry for the late reply. Our given data is a 3-channel EEG signal.


Asked: 30 Dec 2023

Commented: 31 Dec 2023
