Training data and Training target in Neural Networks

I have a signal stored as a vector (1*25000). I want to split this signal into four parts, x_train, y_train, x_test and y_test (using a 70/30 training/testing split), in MATLAB. Can anyone help me split this signal vector into these four parts?
MAT-Magic on 5 Feb 2020
Edited: MAT-Magic on 7 Feb 2020
Thanks for the answer. I am confused because all I have is a recorded respiratory signal in this 1*25000 array, which, as I understand it, is unlabeled data with no training target. But to train a neural network on training data, I need the corresponding training targets, which I do not have right now.
Anyway, I did it in the following way. Could you please review the code and tell me whether I am on the right track?
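signal = data(1:25000);
signal = signal(:);                      % force a column vector (25000*1)
n = numel(signal);                       % size(signal,1) would be 1 for a 1*25000 row vector
P = 0.70;
idx = randperm(n);                       % random permutation of all sample indices
nTrain = round(P*n);
train_data = signal(idx(1:nTrain));      %% 17500*1 (dimension)
test_data = signal(idx(nTrain+1:end));   %% 7500*1 (dimension)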
%% For training data:
colnr_1 = 2;
rownr_1 = 17500/2;
mat_1 = reshape(train_data, [rownr_1, colnr_1]);
x_train = mat_1(:,1);
y_train = mat_1(:,2);
%% For testing data:
colnr_2 = 2;
rownr_2 = 7500/2;
mat_2 = reshape(test_data, [rownr_2, colnr_2]);
x_test = mat_2(:,1);
y_test = mat_2(:,2);
Waiting for your feedback. Please correct me if I am wrong anywhere. Thanks.
Please go through this below URL, it might be related to my problem.


Accepted Answer

Greg Heath on 9 Feb 2020
You cannot make any intelligent decisions until you have examined a plot of the data!!!
(WRONG!!! Plotting the data first is the ultimate beginning decision!!!)
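As a minimal sketch of that first look at the data: the snippet below plots the whole record and a zoomed-in segment against the sample index. The sine-plus-noise `signal` is only a placeholder so the code runs on its own (it is not real respiratory data), and the 2000-sample zoom is an arbitrary choice; substitute the recorded vector and a segment length that suits it.

```matlab
% Placeholder for the recorded signal; replace with your own 1x25000 vector.
% The sine-plus-noise shape is an assumption made only so the sketch runs.
rng(0);                                   % reproducible placeholder noise
n = 25000;
signal = sin(2*pi*(1:n)/500) + 0.1*randn(1, n);

figure;

% Top panel: the full record against sample index.
subplot(2, 1, 1);
plot(signal);
xlabel('Sample index');
ylabel('Amplitude');
title('Full signal');

% Bottom panel: zoom in on the first 2000 samples (arbitrary segment).
subplot(2, 1, 2);
plot(signal(1:2000));
xlabel('Sample index');
ylabel('Amplitude');
title('First 2000 samples');
```

The sampling rate is not given in the thread, so the x-axis is left in samples rather than seconds.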
Hope this helps.

More Answers (1)

Mahesh Taparia on 7 Feb 2020
You have correctly divided the data using randperm. Since you do not have ground truth, you are taking the last 8750 samples of the training data as the ground truth, as in the following code:
mat_1 = reshape(train_data, [rownr_1, colnr_1]);
x_train = mat_1(:,1);
y_train = mat_1(:,2);
which is incorrect. Select the correct ground truth.
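For illustration only, here is one way the four parts could be formed once real labels exist. This is a sketch under assumptions that are not in the original post: the signal is cut into fixed-length windows, each window has one label, and `winLen` and `labels` are hypothetical names. The random labels are there only so the code runs; they must be replaced by actual annotations. The key point is that one permutation of indices is used for both the inputs and the labels, so each input keeps its own target.

```matlab
% Placeholder data; replace with the recorded signal and real labels.
rng(0);                                   % reproducible placeholder values
signal = randn(1, 25000);                 % stand-in for the 1x25000 recording
winLen = 250;                             % hypothetical window length (samples)
nWin   = floor(numel(signal) / winLen);   % 100 windows for these sizes

% One window per row: X is nWin-by-winLen.
X = reshape(signal(1:nWin*winLen), winLen, nWin).';

% Hypothetical labels, one per window. Real labels must come from
% annotation; random 0/1 values are used here only so the sketch runs.
labels = randi([0 1], nWin, 1);

% 70/30 split: permute the window indices once and reuse them for both
% the inputs and the labels.
P        = 0.70;
idx      = randperm(nWin);
nTrain   = round(P * nWin);
trainIdx = idx(1:nTrain);
testIdx  = idx(nTrain+1:end);

x_train = X(trainIdx, :);
y_train = labels(trainIdx);
x_test  = X(testIdx, :);
y_test  = labels(testIdx);
```

With these sizes, x_train is 70-by-250 and x_test is 30-by-250, and no window appears in both sets.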
Mahesh Taparia on 10 Feb 2020
You mentioned earlier that your dataset is unlabeled. y_train should hold the labels of x_train, so using half of the data (which is signal amplitude) as y_train is not logical.
Supervised learning needs ground truth, so collect the labels. Otherwise, you can try an unsupervised learning approach such as clustering.
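A rough sketch of the clustering route, under several assumptions that are not in the original post: the signal is cut into fixed-length windows, each window is summarized by three simple features (mean, standard deviation, range), and the windows are grouped with `kmeans`, which requires the Statistics and Machine Learning Toolbox. The names `winLen`, `feats` and `k`, the feature choice, and the number of clusters are all illustrative, and the random `signal` is a placeholder for the recording.

```matlab
% Requires the Statistics and Machine Learning Toolbox (for kmeans).
rng(0);                                   % reproducible placeholder and k-means start
signal = randn(1, 25000);                 % stand-in for the 1x25000 recording
winLen = 250;                             % hypothetical window length (samples)
nWin   = floor(numel(signal) / winLen);

% One window per row: W is nWin-by-winLen.
W = reshape(signal(1:nWin*winLen), winLen, nWin).';

% Simple per-window features (an arbitrary illustrative choice).
feats = [mean(W, 2), std(W, 0, 2), max(W, [], 2) - min(W, [], 2)];

% Standardize each feature so no single one dominates the distances.
feats = (feats - mean(feats, 1)) ./ std(feats, 0, 1);

% Group the windows into k clusters; k = 2 is a hypothetical choice.
k = 2;
clusterIdx = kmeans(feats, k, 'Replicates', 5);
```

The cluster numbers in `clusterIdx` carry no meaning by themselves; they would still need to be inspected (for example by plotting windows from each cluster) before being treated as anything like labels.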
