Specify Indices for Training, Validation and Testing
I am trying to evaluate a standard set of regression and classification problems as part of a team at work. My normal workflow is:
- Use the CVPARTITION function to automatically create training and validation dataset partitions.
- Train my model (e.g. via fitrensemble, fitcensemble, etc.)
- Use the trained model on a "test" data set to evaluate the predictive performance of the trained model.
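For concreteness, here is a minimal sketch of that workflow. The synthetic data, the 20% holdout fraction, and the variable names are assumptions made up for illustration, and for brevity the held-out rows stand in for the evaluation step rather than a separate "test" set.

```matlab
% Synthetic regression data stands in for the real problem (assumption).
rng(0);                                   % make the random split reproducible
X = rand(200, 4);
y = X * [1; -2; 0.5; 3] + 0.1 * randn(200, 1);

% Hold out 20% of the rows; cvpartition chooses them at random.
c = cvpartition(size(X, 1), 'HoldOut', 0.2);
trainIdx = training(c);                   % logical mask of training rows
valIdx   = test(c);                       % logical mask of held-out rows

% Train a regression ensemble on the training rows only.
mdl = fitrensemble(X(trainIdx, :), y(trainIdx));

% Evaluate on the held-out rows.
yHat   = predict(mdl, X(valIdx, :));
valMSE = mean((y(valIdx) - yHat).^2);
```

The point to notice is that `cvpartition` decides which rows end up in `trainIdx` and `valIdx`; that is the step I would like to control myself.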
This is hopefully a pretty standard use case and workflow. Here is my problem:
I want to be able to manually specify the data partition indices for the "training" and "validation" split myself. It's like holdout validation, except that instead of letting Matlab randomly sample the indices, I would specify which points to use for training and which to use for validation. Unfortunately, I can't figure out a way to provide these partitions myself without having Matlab control the random sampling.
For perspective: when using a shallow neural network, we can do this by simply using the 'divideind' option and then providing the vector of indices for each split directly.
net.divideFcn = 'divideind';
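A slightly fuller sketch of that neural network setup, for comparison. The index ranges and the hidden layer size are hypothetical placeholders; in practice the indices would come from the split shared with the team.

```matlab
% Hypothetical index vectors (assumption); replace with the shared split.
trainInd = 1:140;
valInd   = 141:170;
testInd  = 171:200;

net = fitnet(10);                         % shallow network, 10 hidden neurons
net.divideFcn = 'divideind';              % divide the data by explicit indices
net.divideParam.trainInd = trainInd;      % samples used for training
net.divideParam.valInd   = valInd;        % samples used for validation
net.divideParam.testInd  = testInd;       % samples held out for testing
```

This is the kind of direct control over the split that I am looking for with the ensemble fitting functions.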
This matters for my problem because I am working with a team of other scientists who use different software packages and approaches. For consistency, and to allow comparison between models, we like to use the same data sets for training, validation and testing. However, I am currently not able to use the same validation splits in Matlab.
Am I missing a function or option in Matlab to do this? Any suggestions are welcome!