Removing NaN in Linear Regression Problem. Error in line 66.
Hello guys,
I am trying to conduct a multivariable linear regression problem. The predictors (X) form a table sized 52824x9.
I tried to remove all the NaN values using this piece of code, which is part of the regress function:
% Remove missing values, if any
wasnan = (isnan(y) | any(isnan(X),2)); %line 66
havenans = any(wasnan);
if havenans
    y(wasnan) = []; %line 69
    X(wasnan,:) = [];
    n = length(y);
end
At first, I got an error stating:
Undefined function 'isnan' for input arguments of type 'table'.
Error in regress (line 66)
wasnan = (isnan(y) | any(isnan(X),2));
I searched for solutions and found one saying that the isnan function cannot operate on tables; the suggested fix was to change the line to the following:
wasnan = (isnan(y{:,:}) | any(isnan(X{:,:}),2));
Now I get an error in line 69 saying the following:
Subscripting a table using linear indexing (one subscript) or multidimensional indexing (three or more subscripts) is not
supported. Use a row subscript and a variable subscript.
If anyone knows how to solve this problem, or can suggest another way to access the data with the isnan function, it would be very much appreciated. I have been trying to solve this problem for some days now.
Many thanks,
Natalia
Accepted Answer
dpb on 18 Mar 2020 (edited 20 Mar 2020)
You don't pass the table to regress but the variables to be used in the regression -- then you won't run into the issue inside regress.
And you DEFINITELY DO NOT WANT TO BE MUCKING INSIDE THE SUPPLIED REGRESS FUNCTION!!!!
We don't know the function you're trying to fit nor the variable names in your table, but assuming
Y ~ 1 + A*X1 + B*X2 + ...
for variables X and Y in the table and a linear model plus intercept, then the syntax for regress would be
b=regress(t.Y,[ones(height(t),1) t.X]);
where the table variable is t. Use your table variable name and variable names within the table, of course.
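As a minimal sketch of that call with more than one predictor: the table t and the variable names X1, X2 and Y below are made up for illustration, so substitute your own. Note that regress drops rows containing NaN by itself (that is what the lines 66-69 quoted in the question do), so no manual cleaning is needed once it receives arrays rather than a table.

```matlab
% Hypothetical data: two predictors and a response, with a few NaNs.
rng(0);                                   % reproducible random numbers
n  = 100;
X1 = randn(n,1);
X2 = randn(n,1);
Y  = 2 + 3*X1 - 1.5*X2 + 0.1*randn(n,1);
X1(5) = NaN;  Y(17) = NaN;                % simulate missing values
t  = table(X1, X2, Y);

% Pull the data out of the table as numeric arrays
% (curly braces or dot notation, not parentheses).
Xmat = t{:, {'X1','X2'}};                 % n-by-2 double matrix
y    = t.Y;                               % n-by-1 double vector

% regress expects the response first, then the design matrix.
% The column of ones supplies the intercept term.
b = regress(y, [ones(height(t),1) Xmat]);
```

Here b(1) is the intercept and b(2), b(3) are the coefficients of X1 and X2.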
If you have the Curve Fitting Toolbox in addition to the Statistics Toolbox, I would suggest that its fit function is a little more user-friendly than the core regress function. If you don't, see the Alternative Functionality section of the regress documentation, which suggests using LinearModel instead for similar reasons.
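A rough sketch of the LinearModel route, assuming the Statistics Toolbox function fitlm is available in your release. Unlike regress, fitlm accepts the table directly and adds the intercept itself. The table and the variable names X1, X2 and Y are hypothetical.

```matlab
% Hypothetical table with made-up variable names X1, X2, Y.
rng(1);                                   % reproducible random numbers
n  = 200;
X1 = randn(n,1);
X2 = randn(n,1);
Y  = 1 + 0.5*X1 + 2*X2 + 0.1*randn(n,1);
Y(3) = NaN;                               % a missing response value
t  = table(X1, X2, Y);

% fitlm takes the table itself; the formula names the table variables.
mdl = fitlm(t, 'Y ~ X1 + X2');

coefs = mdl.Coefficients.Estimate;        % [intercept; X1; X2]
nUsed = mdl.NumObservations;              % rows actually used in the fit
```

Rows with missing values are left out of the fit, which nUsed lets you confirm.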
Read the section of the table documentation on accessing data in a table; it explains which forms of addressing return the data as its native type and which return a table. In particular, note that addressing a table with parentheses returns another table containing the addressed rows and columns, which is probably the root cause of your troubles.
x=t(:,1); % returns x as a table: all rows of table t, column 1
while
x=t.X; % presuming X is the first column in table t, returns x as an array
% or
x=t{:,1}; % returns x as an array -- NB: the "curlies" {} instead of ()
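A short sketch of the difference on a made-up two-variable table. It also illustrates the second error in the question: when y is a table, y(wasnan) = [] uses a single subscript, which tables do not support, whereas deleting rows with a row and a variable subscript works.

```matlab
% Hypothetical two-variable table for illustration.
t = table([1; NaN; 3], [10; 20; 30], 'VariableNames', {'X','Y'});

a = t(:,1);      % parentheses: a 3-by-1 table
b = t.X;         % dot notation: a 3-by-1 double
c = t{:,1};      % curly braces: a 3-by-1 double

disp(class(a))   % table
disp(class(b))   % double
disp(class(c))   % double

% isnan works on the numeric array, not on the table itself.
wasnan = isnan(t.X);

% Deleting rows from a table needs a row AND a variable subscript.
% t(wasnan) = [] would raise the linear-indexing error instead.
t(wasnan,:) = [];
```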
More Answers (1)
Cris LaPierre
on 18 Mar 2020
5 Comments
Cris LaPierre
on 18 Mar 2020
Ah, I didn't realize that code snippet was from the regress function. Yes, don't go changing code inside the function. Use this to clean up your table before passing it to regress.
And yes, regress does not support tables as inputs. Use the dot notation to pass in variables.
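The cleanup code referred to above is not reproduced in this thread. As one possible sketch, not necessarily what was originally suggested, rows with missing values can be dropped from the table before the variables are passed to regress. This assumes a release that has rmmissing (R2016b or later); the table and the variable names X1, X2 and Y are hypothetical.

```matlab
% Hypothetical table; variable names are made up for illustration.
t = table([1; 2; NaN; 4; 5; 6], [2; NaN; 1; 3; 0; 4], [3; 5; 7; 9; 11; 13], ...
          'VariableNames', {'X1','X2','Y'});

% Drop every row that has a missing value in any variable.
tClean = rmmissing(t);

% Equivalent logical-index form, closer to the code inside regress:
% extract the numeric data with {} and index the table by rows.
wasnan  = any(isnan(t{:,:}), 2);
tClean2 = t(~wasnan, :);

% Pass arrays, not the table, to regress.
b = regress(tClean.Y, [ones(height(tClean),1) tClean.X1 tClean.X2]);
```

Since regress removes NaN rows on its own, this step mainly matters when the cleaned table is needed for something else as well.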