How to split a dataset into training/validation images, assuming I have multiple subfolders?

Hi everyone,
Assume I have three different classes (three animals) for a network I would like to train: Dogs, Cats and Cows.
For each class, I have, say, 10 images. Now the thing is that each of these images also has a subfolder containing multiple patches (cropped out of each image). So the path would look like this:
all.classes / dog.images / dog.image.1 / patch.dog.image.1.1.png
I would like to randomly split the entire dataset into training/validation images, but instead of working at the patch level, I would like to do so at the image level. For instance: all patches of dog.image.1, dog.image.2 and dog.image.3 will be used for validation, while the rest (patches of dog.image.4 to dog.image.10) will be used for training. In other words, I do not want to pool all patches of all 10 images together and randomly draw 70% for training and 30% for validation.
I usually do the following:
imds = imageDatastore('all.classes', ...
    'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');
[imdsTrain, imdsValidation] = splitEachLabel(imds, 0.7, 'randomized');
If possible, how can I modify this code in order to divide my dataset at the image level instead?
Thank you very much!
Edit: The classes have different numbers of images each. n = 10 was used for simplification purposes only.

Accepted Answer

Anmol Dhiman on 22 Jul 2020
Hi M J,
In my opinion, there is no direct way to do so. You can separate the training and validation images manually or programmatically (link) and then apply imageDatastore to each set individually.
Regards,
Anmol Dhiman
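
For anyone looking for the programmatic route, here is a minimal sketch of one way it could be done. It is not an official workflow. The function name splitByImageFolder is made up for this example, and it assumes the layout from the question: class folder / image folder / patch file. It picks whole image folders at random within each class, so all patches of one image land on the same side of the split, and it also works when classes have different numbers of images.

```matlab
function isTrain = splitByImageFolder(files, trainFraction)
% splitByImageFolder  Hypothetical helper: image-level train/validation split.
%   files         - cell array of full patch paths (e.g. imds.Files)
%   trainFraction - fraction of image folders per class kept for training
%   isTrain       - logical column vector, true where a patch goes to training
%
% Example use (needs the folders on disk):
%   imds = imageDatastore('all.classes', 'IncludeSubfolders', true);
%   isTrain = splitByImageFolder(imds.Files, 0.7);
%   imdsTrain      = imageDatastore(imds.Files(isTrain));
%   imdsValidation = imageDatastore(imds.Files(~isTrain));

    % The parent folder of each patch identifies its source image.
    imageDirs = cellfun(@fileparts, files(:), 'UniformOutput', false);
    % The grandparent folder identifies the class (assumed layout).
    classDirs = cellfun(@fileparts, imageDirs, 'UniformOutput', false);

    isTrain = false(numel(imageDirs), 1);
    classes = unique(classDirs);
    for c = 1:numel(classes)
        inClass = strcmp(classDirs, classes{c});
        imgs = unique(imageDirs(inClass));          % image folders of this class
        order = randperm(numel(imgs));              % random order of image folders
        nTrain = round(trainFraction * numel(imgs));
        trainImgs = imgs(order(1:nTrain));
        % A patch is a training patch if its image folder was drawn for training.
        isTrain(inClass) = ismember(imageDirs(inClass), trainImgs);
    end
end
```

Note that the two datastores in the usage comment are created without labels. As far as I know, 'LabelSource', 'foldernames' labels each file by its immediate parent folder, which here would be the image folder (dog.image.1) rather than the class folder, so the class labels would probably have to be derived from the grandparent folder and assigned to the Labels property by hand.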
