Why is my test accuracy higher than validation accuracy?
I used the Classification Learner app in MATLAB. My model has a validation accuracy of 60.6% and a test accuracy of 72.0%. I know that the test set could simply be a "lucky", easier set, but could there also be other reasons for this big difference?
Accepted Answer
John D'Errico
on 20 Sep 2024
Edited: John D'Errico
on 20 Sep 2024
Not really. It most likely means that your sets are just not large enough and that you needed more data. By the law of large numbers, an accuracy measured on a small set can land far from its expected value; only with larger sets of data does the expected behavior prevail.
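To illustrate the point about set size, here is a minimal Python sketch (standard library only). It assumes a model whose true accuracy is fixed and whose predictions are correct independently of one another, then measures how much the observed accuracy scatters across many randomly drawn sets. The true accuracy of 0.65 and the set sizes are hypothetical values chosen for illustration; they are not taken from the question.

```python
import random
import statistics

def simulated_accuracy_spread(true_accuracy, set_size, trials=1000, seed=0):
    """Standard deviation of the measured accuracy over many simulated sets.

    Assumes every prediction is correct independently with probability
    true_accuracy (an idealization, not a property of any real model).
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(trials):
        # Count correct predictions in one simulated set of the given size.
        correct = sum(rng.random() < true_accuracy for _ in range(set_size))
        accuracies.append(correct / set_size)
    return statistics.pstdev(accuracies)

# Hypothetical set sizes: the scatter shrinks as the set grows.
for n in (50, 200, 800):
    print(n, round(simulated_accuracy_spread(0.65, n), 3))
```

Under these assumptions the scatter falls roughly as one over the square root of the set size, so a set 16 times larger gives about a quarter of the spread.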
I might also add that there is a lot of confusion about these terms. I prefer "training" accuracy for the statistics on the set used to train the model, and "validation" for the secondary tests that show how well the model fits other data not used in the training step. That usage need not be standard, but using "training" does make the difference explicit, at least in my eyes.
Typically, the training accuracy would be a little better than the validation accuracy, because no matter what, there will always be some component of overfitting. This is unavoidable. So if the numbers go the other way, then luck and random chance played a part.
As for the difference being a big one, again, that may well be a function of the quantity of data available: the smaller the sets, the larger the gap that chance alone can produce.
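A rough way to judge whether a gap like 60.6% versus 72.0% is within reach of chance is to compare it with its binomial standard error. The sketch below, in Python with the standard library only, treats each accuracy as a proportion measured on an independent set. That is only an approximation, since a cross-validated accuracy does not come from a single independent set. The set sizes are hypothetical because the question does not state them, and the function names are made up for this example.

```python
import math

def accuracy_std_error(accuracy, n):
    """Binomial standard error of an accuracy measured on n samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)

def gap_z_score(acc_a, n_a, acc_b, n_b):
    """Gap between two accuracies in units of its standard error.

    Assumes the two accuracies come from independent sets.
    """
    se_gap = math.hypot(accuracy_std_error(acc_a, n_a),
                        accuracy_std_error(acc_b, n_b))
    return (acc_b - acc_a) / se_gap

# Accuracies from the question; set sizes are hypothetical.
for n_val, n_test in ((100, 50), (1000, 500)):
    z = gap_z_score(0.606, n_val, 0.720, n_test)
    print(f"validation n={n_val}, test n={n_test}: z = {z:.2f}")
```

With the smaller hypothetical sizes the gap is under two standard errors, which chance can plausibly produce. With the larger sizes it is several standard errors, and luck alone becomes a much less convincing explanation.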