How to split a dataset into training/validation images, assuming I have multiple subfolders ?
1 view (last 30 days)
Show older comments
Hi everyone,
Assume I have three different groups (three animals) for a network I would like to train : Dogs, Cats and Cows.
For each class, I have, say, 10 images. Now the thing is that each of these images also has a subfolder containing multiple patches (cropped out of each image). So the path would look like this:
all.classes / dog.images / dog.image.1 / patch.dog.image.1.1.png
I would like to randomly split the entire dataset into training/validation images, but instead of working at the patch level, I would like to do so at the image level. For instance : all patches of dog.image.1, dog.image.2 and dog.image.3 will be used for validation while the rest (patches of dog.image.4 to dog.image.10) will be used for training. In other words, I do not want to mix all patches of all 10 images in a single pool and randomly draw 70% for training and 30% for validation.
I usually do the following :
imds = imageDatastore ('all.classes', ...
'IncludeSubfolders', true, ...
'LabelSource', 'foldernames');
[imdsTrain, imdsValidation] = splitEachLabel (imds, 0.7, 'randomized')
If possible, how can I modify this code in order to divide my dataset at the image level instead?
Thank you very much!
Edit : The classes have different numbers of images each. n = 10 was used for simplification purposes only.
0 Comments
Accepted Answer
Anmol Dhiman
on 22 Jul 2020
Hi M J,
In my opinion there is no direct way to do so. You can seperate both training and validation manually or programitacally ( link) and apply imageDatastore individually.
Regards,
Anmol Dhiman
0 Comments
More Answers (0)
See Also
Categories
Find more on Image Data Workflows in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!