lasso glm cvpartition error

V on 22 Aug 2018
Answered: Amal George M on 31 Aug 2018
I am trying to run lassoglm with cv and I seem to receive the following error:
    Error using lassoglm>processLassoParameters (line 742)
    There are too few observations for crossvalidation. At least one training set has fewer than two observations.
    Error in lassoglm (line 260)
    processLassoParameters(X,Y,pwts,alpha,nLambda,lambdaRatio,lambda,dfmax, ...
I first run
    cvp = cvpartition(groups,'k',5)
(I also tried 2 and 3 folds). Then I run
    [B, FitInfo] = lassoglm(X,Y,'binomial','CV',cvp);
I have two groups (size(group1) = 54, size(group2) = 78). Any help to troubleshoot this would be very much appreciated.
Many thanks, V

Answers (1)

Amal George M on 31 Aug 2018
Hi V,
You encounter this message because lassoglm needs a partition whose number of observations matches the number of observations it actually uses, that is, the count after stripping NaNs, Infs, and zero observation weights. lassoglm ignores any row that contains at least one NaN or Inf value, so you end up with fewer observations than you started with. However, cvpartition has already partitioned the original observations, so the partition and the data no longer agree on the number of observations, which produces this error message.
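To check whether this mismatch applies to your data, you can compare the size of the partition with the number of rows that survive the filtering. The following is only a rough sketch: the data are made-up stand-ins with the same group sizes as in the question (54 and 78), badRows and nUsable are illustrative names, and the check only approximates what lassoglm does internally (it does not look at observation weights).

```matlab
% Hypothetical stand-in data; replace with your own X, Y and groups.
rng(0);
X = randn(132, 10);
X(5, 3)  = NaN;                      % one row with a missing value
X(40, 7) = Inf;                      % one row with an infinite value
Y = [zeros(54, 1); ones(78, 1)];
groups = Y;

% Rows that contain a NaN or Inf predictor, or a NaN response
badRows = any(~isfinite(X), 2) | isnan(Y);
nUsable = sum(~badRows);

% Partition built on the original, unfiltered observations
cvp = cvpartition(groups, 'KFold', 5);

fprintf('Partition covers %d observations; %d remain after filtering.\n', ...
    cvp.NumObservations, nUsable);
```

If the two numbers differ, the partition was built on rows that lassoglm will later discard.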
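Please try removing all the rows with missing values (and the corresponding entries of the response variable and of 'groups') before you pass the data to 'cvpartition'.
For example, you can remove all the rows which contain a NaN value by doing the following:
    >> idx2Remove = any(isnan(X), 2);
    >> X(idx2Remove, :) = [];
    >> Y(idx2Remove) = [];        % only needed if Y is a separate variable from groups
    >> groups(idx2Remove) = [];
Then create the partition from the cleaned 'groups' and pass it to lassoglm.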
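Putting the whole sequence together, a minimal sketch might look like the following. It assumes X is a numeric matrix and that Y and groups are column vectors with one entry per row of X; the random data and the variable name keep are only illustrative. It uses isfinite rather than isnan so that rows containing Inf are dropped as well.

```matlab
% Hypothetical stand-in data; replace with your own X, Y and groups.
rng(0);
X = randn(132, 10);
X([5 40], 3) = NaN;                  % two rows with missing values
Y = [zeros(54, 1); ones(78, 1)];
groups = Y;

% Keep only rows where every predictor is finite and the response is not NaN
keep = all(isfinite(X), 2) & ~isnan(Y);
X = X(keep, :);
Y = Y(keep);
groups = groups(keep);

% Build the partition AFTER cleaning, so it matches the rows lassoglm will use
cvp = cvpartition(groups, 'KFold', 5);

% Cross-validated lasso-regularized logistic regression
[B, FitInfo] = lassoglm(X, Y, 'binomial', 'CV', cvp);

% Coefficients at the lambda with minimum cross-validated deviance
bestCoef = B(:, FitInfo.IndexMinDeviance);
```

The important point is the order of operations: the partition is created from the cleaned groups, so its observation count equals the number of rows in the cleaned X.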
