lasso glm cvpartition error

V on 22 Aug 2018
Commented: James Gerber on 28 Nov 2023
Hello!
I am trying to run lassoglm with cv and I seem to receive the following error:
Error using lassoglm>processLassoParameters (line 742)
There are too few observations for crossvalidation. At least one training set has fewer than two observations.

Error in lassoglm (line 260)
processLassoParameters(X,Y,pwts,alpha,nLambda,lambdaRatio,lambda,dfmax, ...
I first run
cvp = cvpartition(groups,'k',5)
(I also tried 2 and 3 folds). Then I run
[B, FitInfo] = lassoglm(X,Y,'binomial','CV',cvp);
I have two groups (size(group1) = 54, size(group2) = 78).
Any help troubleshooting this would be very much appreciated.
Many thanks, V

Answers (1)

Amal George M on 31 Aug 2018
Hi V,
You get this message because the function needs a partition whose size matches the number of observations it actually uses, that is, the number left after it strips NaNs, Infs, and observations with zero weight. lassoglm ignores any row that contains at least one NaN or Inf value, so you end up with fewer observations. However, 'cvpartition' has already partitioned the original observations, so the partition no longer matches the number of observations, and that mismatch triggers this error message.
Please try removing all rows that contain missing values (and the corresponding entries of the response variable 'groups') before you pass the data to 'cvpartition'.
For example, you can remove all the rows which contain a NaN value by doing the following:
>> idx2Remove = any(isnan(X), 2);
>> X(idx2Remove, :) = [];
>> groups(idx2Remove) = [];
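A slightly fuller sketch of the same idea follows, covering Inf as well as NaN and keeping X, Y, and groups aligned. The data here are synthetic stand-ins, and it assumes Y is a binary response equal to the group label; neither is confirmed by the original question, so adapt the names to your own variables.

```matlab
% Synthetic stand-in data (hypothetical; replace with your own X, Y, groups).
rng(0);
X = randn(132, 4);
groups = [zeros(54, 1); ones(78, 1)];
Y = groups;                 % assumption: binary response equals the group label
X([3 70], 2) = NaN;         % simulate missing values
X(10, 1) = Inf;             % simulate a non-finite value

% Keep only rows where every predictor and the response are finite.
keep = all(isfinite(X), 2) & isfinite(Y);
X = X(keep, :);
Y = Y(keep);
groups = groups(keep);

% Partition the cleaned data so the partition size matches the rows lassoglm keeps.
cvp = cvpartition(groups, 'KFold', 5);
[B, FitInfo] = lassoglm(X, Y, 'binomial', 'CV', cvp);
```

Comparing cvp.NumObservations with size(X, 1) right before the lassoglm call is a quick way to confirm the two agree.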
  2 Comments
Jason Climer on 24 Jul 2023
This is a terrible error message for this problem. It's clear that the potential for an error was handled but the two causes of the error were lumped together on line 813 of lassoglm. This should be updated.
James Gerber on 28 Nov 2023
There is a similar problem in lasso.m on line 524 in R2023b with a misleading error message. There I got the error "There are too few observations for crossvalidation. At least one training set has fewer than two observations." and the reason was that there were a few zero weights.
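If zero weights are the cause, one possible workaround is to drop the zero-weight observations yourself before building the partition. The sketch below uses synthetic data and hypothetical variable names; it assumes, based on the comment above, that lasso discards zero-weight rows before checking the partition.

```matlab
% Synthetic stand-in data (hypothetical; replace with your own X, y, w).
rng(1);
n = 100;
X = randn(n, 3);
y = X * [1; 0; -2] + 0.1 * randn(n, 1);
w = ones(n, 1);
w([5 17 42]) = 0;           % a few observations carry zero weight

% Drop zero-weight observations before creating the partition.
keep = w > 0;
Xk = X(keep, :);
yk = y(keep);
wk = w(keep);

% Partition only the observations that will actually be used.
cvp = cvpartition(nnz(keep), 'KFold', 5);
[B, FitInfo] = lasso(Xk, yk, 'Weights', wk, 'CV', cvp);
```

Since a zero-weight observation contributes nothing to the weighted fit, removing it should not change the estimated coefficients; it only keeps the partition size consistent with the data.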
