How do I partition data sets for cross validation in MATLAB?

I'm trying to split, or partition, the data into two groups: testing data and training data. Ideally I want to write a function that can randomly divide the data into a partition of variable size, so that I could do specific splits as well as leave-one-out cross validation. I'm not sure how to do this, though.

Accepted Answer

Swetha Polemoni on 16 Sep 2021
Hi
It is my understanding that you want to partition the data randomly. You can use either cvpartition from Statistics and Machine Learning Toolbox or randperm for this purpose.
The following answer might help you to understand how to use 'randperm'
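As a rough illustration of the randperm approach, here is a minimal sketch rather than a definitive implementation. The data matrix X, the test-set size nTest, and all variable names are placeholders chosen for this example. It shuffles the row indices once, takes the first nTest of them as the test set, and then shows how the same shuffled indices could drive a leave-one-out loop.

```matlab
% Placeholder data: 20 observations (rows), 3 features (columns)
X     = rand(20, 3);
nObs  = size(X, 1);
nTest = 5;                          % desired test-set size (can be varied)

% Random hold-out split
idx      = randperm(nObs);          % random permutation of 1..nObs
testIdx  = idx(1:nTest);            % first nTest shuffled indices -> test set
trainIdx = idx(nTest+1:end);        % remaining indices -> training set

Xtest  = X(testIdx, :);
Xtrain = X(trainIdx, :);

% Leave-one-out: each observation serves as the test set exactly once
for k = 1:nObs
    looTest  = idx(k);                    % single held-out observation
    looTrain = idx([1:k-1, k+1:nObs]);    % all other observations
    % fit a model on X(looTrain, :) and evaluate it on X(looTest, :)
end
```

Setting nTest to 1 gives a single leave-one-out split, and any value between 1 and nObs-1 gives a hold-out split of that size.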

More Answers (1)

Enrico on 22 Jan 2026 at 10:13
I am trying to use the cvpartition object, but I'm struggling to understand its structure.
As I understand it, when I perform a cross-validation partition with, say, 10 folds, the observations should be assigned to the folds at random. However, when I create a cvpartition object (e.g. cvpartition(200000, "KFold", 10)), I get a result that looks like:
K-fold cross validation partition
NumObservations: 200000
NumTestSets: 10
TrainSize: 180000 180000 180000 180000 180000 180000 180000 180000 180000 180000
TestSize: 20000 20000 20000 20000 20000 20000 20000 20000 20000 20000
IsCustom: 0
However, I can't understand how these training and test sets are organized. I was expecting some place that tells me which indices are in each training and test fold, yet I can't see where they are.
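For what it's worth, the display above only lists the set sizes. The fold membership is normally retrieved through the training and test functions of the cvpartition object, which return logical masks over the observations. The following is a minimal sketch on a small partition (20 observations, 5 folds, both chosen only for illustration); the variable names are placeholders.

```matlab
% Small illustrative partition: 20 observations, 5 folds
c = cvpartition(20, "KFold", 5);

for i = 1:c.NumTestSets
    trainMask = training(c, i);   % logical 20-by-1, true for training rows of fold i
    testMask  = test(c, i);       % logical 20-by-1, true for test rows of fold i

    trainRows = find(trainMask);  % numeric row indices, if those are preferred
    testRows  = find(testMask);
    % e.g. fit on data(trainRows, :) and evaluate on data(testRows, :)
end
```

Within a fold, every observation is in exactly one of the two masks, so trainMask and testMask are complements of each other.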
If I create two cvpartition objects with the same data and try to compare them, I obtain:
isequal(cv1, cv2)
ans =
logical
0
However, I can't find their differences. The structure that defines them seems the same.
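A likely explanation is that the two objects have the same sizes but different random fold assignments, which the display does not show. The sketch below, under that assumption, compares the test masks of two partitions directly and then fixes the random seed with rng before each call so that the assignments repeat. The seed value 42 and all variable names are arbitrary choices for this example.

```matlab
n = 20;   % small number of observations, for illustration only

% Two partitions created back to back: same sizes, independent random draws
cvA = cvpartition(n, "KFold", 5);
cvB = cvpartition(n, "KFold", 5);
nDiff = nnz(test(cvA, 1) ~= test(cvB, 1));   % rows assigned differently in fold 1
fprintf("Fold 1 differs in %d of %d rows\n", nDiff, n);

% Resetting the random number generator to the same seed before each call
% should reproduce the same fold assignment
rng(42);
cv1 = cvpartition(n, "KFold", 5);
rng(42);
cv2 = cvpartition(n, "KFold", 5);

sameFolds = all(arrayfun(@(i) isequal(test(cv1, i), test(cv2, i)), ...
                         1:cv1.NumTestSets));
disp(sameFolds)
```

Comparing the masks fold by fold, rather than calling isequal on the objects themselves, makes it explicit which part of the partition is being checked.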
