Clear Filters
Clear Filters

Deleted table rows stuck in memory? Cannot fit a linear model.

3 views (last 30 days)
I have a data table with both continuous and categorical values. I want to run a linear model for this table using 'fitlm'. I have a loop where I pick a different subset of rows and fit a model for it.
However, it appears that I cannot do slicing for categorical variables. Fitlm sees all possible categories of the full table and complains "Warning: Regression design matrix is rank deficient to within machine precision.". The non-existing categories also appear in the model. Even creating a temporary table from numerical matrix does not help!
Here is an example. I don't understand why one categorical factor (condition 3) won't go away.
% data with 3 categories
data = [...
1.9,1;
5.7,2;
0.7,1;
2.2,2;
0,1;
1.9,2;
-0.2,1;
1.6,2;
-0.7,1;
2.3,2;
1,3];
% create table
data_table = array2table(data,'VariableNames',{'Y','Condition'});
% make condition as categorical
data_table.Condition=categorical(data_table.Condition);
% fit linear model (basically a t-test)
model1 = fitlm(data_table,'Y ~ 1 + Condition');
% this works, but condition 3 is basically useless with only 1 sample
% Lets remove the final row and condition 3
data_table = data_table(1:end-1,:);
% repeat with sliced table (only 2 categories remains)
model2 = fitlm(data_table,'Y ~ 1 + Condition');
% We get a warning. Condition 3 is still there with no data.
% Create a new table from a numerical array
mat = table2cell(data_table);
new_data_table = cell2table(mat,'VariableNames',{'Y','Condition'});
new_data_table.Condition=categorical(new_data_table.Condition);
% no category 3 in the new table
model3 = fitlm(new_data_table,'Y ~ 1 + Condition');
% still the same warning even if there never was condition 3 in this table
% ok, lets clear old tables and start from the cell matrix
clear data_table new_data_table data;
new_new_data_table = cell2table(mat,'VariableNames',{'Y','Condition'});
new_new_data_table.Condition=categorical(new_new_data_table.Condition);
% again, no category 3 in the new table
model4 = fitlm(new_new_data_table,'Y ~ 1 + Condition');
% still the same warning, condition 3 remains
ADDITION:
In the latest version of Matlab I could probably use "removecats" to delete non-existing categories. However, this function is not available in r2017b.

Accepted Answer

Hang Yu
Hang Yu on 28 Jun 2018
Hi Janne,
Your approach to use removecats seems correct, however it is available in R2017b if you have the correct MATLAB installed. After you truncated the table, you can do
>> data_table.Condition = removecats(data_table.Condition);
and you don't have to convert the table anymore.
If you still can't find the removecats function for some reason, try to convert the categorical array in to double array by using
>> data_table.Condition = double(data_table.Condition);
followed by categorical again.
  3 Comments
Janne Kauttonen
Janne Kauttonen on 29 Jun 2018
I'll accept this answer now, thank you. However, it would be great if someone could explain this table behavior more in-depth (why and how).
meng lei
meng lei on 3 Mar 2021
The category is stored in the properties of the table and will be inherited from the previous table

Sign in to comment.

More Answers (0)

Products


Release

R2017b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!