Selecting the same amount of data for all categories based on different labels

1 view (last 30 days)
Hello. I am trying to edit the data in the input array above so that the new_data array has the same amount of bad,good and excellent quality trees for training of an AI. The bad, good and excellent are given by a label where [1; 0 ; 0] is bad, [0; 1 ; 0] is good and [0; 0 ; 1] is excellent. The bad trees have the least amount of data to train the AI so I am trying to reduce the good and and excellent data so they are all the same.
cvs_data =csvread('spruce tree timber quality.csv',1,0)
expert_ratings = cvs_data(:,12)' %inputs
inputs = cvs_data(:,1:11)
correct_labels = zeros(3,length(expert_ratings));
bad = 0;
good=0;
excellent=0;
histogram(expert_ratings)
Skew1 = skewness(expert_ratings)
for k =1 : length(expert_ratings)%length(expert_ratings)
quality = cvs_data(k,12)';
if quality >= 1 && quality <=4
bad = bad+1;
correct_labels(1:3, k) =[1; 0 ; 0];
end
if quality >= 5 && quality <=6
good = good+1;
correct_labels(1:3, k) =[0; 1 ; 0];
end
if quality >= 7 && quality <=10
excellent = excellent+1;
correct_labels(1:3, k) =[0; 0 ; 1];
end
end
[good_inputs,rows_rem] = rmoutliers(inputs,"mean")
transposed_rows = rows_rem'
good_inputs_tp = good_inputs'
correct_labels(:,transposed_rows)=[];
correct_labels
%select the same amount of bad, good and excellent data
correct_labels
good_inputs'
bad
good
excellent
for column = 1:length(bad)
if correct_labels(:,column) == [0 1 0]
new_data(column) = inputs(column)
end
end
new_data
I tried to do this in the last part of the code but it doesn't seem to be working. Any help would be greatly appreciated.

Accepted Answer

David Hill
David Hill on 10 Oct 2022
cvs_data =readmatrix('spruce tree timber quality.csv');
quality=cvs_data(:,12);
bad_data=cvs_data(quality<=4,1:11);
good_data=cvs_data(quality>=5&quality<=6,1:11);
excellent_data=cvs_data(quality>=7,1:11);
m=min([size(bad_data,1),size(good_data,1),size(excellent_data,1)]);
newdata=[bad_data(randperm(m),:);good_data(randperm(m),:);excellent_data(randperm(m),:)];

More Answers (0)

Categories

Find more on Data Import from MATLAB in Help Center and File Exchange

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!