Question on CATEGORICAL and Help files

Well, not a question as much as a comment. I'm finding the Help files a bit too brief for this class. Does anyone else agree?
For example:
  • How to preallocate a categorical array? I'm dealing with 13,000,000 records, and moving the data into variables on a 32-bit machine hits the memory limit quickly. Preallocation is critical in this application.
  • How to turn a categorical array back into numbers or characters? Hehe, trying to use the variable editor to paste the categorical data into a column of another (non-categorical) variable froze the machine (or was it a crash?).

 Accepted Answer

I usually preallocate tables or categoricals dynamically by running my loop backward:
T = table;
for ii = 10:-1:1
    % The first pass assigns row 10, so T reaches its final height right away
    T(ii,:) = array2table(rand(1,4));
end
To your second question:
c = categorical({'red';'red';'blue'})
cellstr(c)
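For the numeric direction, here is a minimal sketch, assuming the base-MATLAB categorical class (R2013b or later). The variable names are only illustrative; double returns each element's position in the list given by categories, not any original numeric value.

```matlab
% Convert a categorical array back to text or to numeric category codes.
c = categorical({'red'; 'red'; 'blue'});

names = categories(c);   % category names, sorted: {'blue'; 'red'}
txt   = cellstr(c);      % one text entry per element
codes = double(c);       % index of each element into names

% Round trip: looking up each code in the category list recovers the text.
roundTrip = names(codes);
```

If the array contains undefined elements, double returns NaN for them, so the lookup in the last line would need those rows filtered out first.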

9 Comments

Good to know you can use T=categorical without arguments. Nice, simple trick to loop backwards.
Simple and obvious, thanks.
How does running loop backwards help on categoricals? Don't see that...
@dpb
C = categorical;
for ii = 10:-1:1
C(ii,:) = char(randi(10)+65);
end
If you don't have all of the values up front (say we're reading from 100 files), this allows you to preallocate elegantly. At least it's the most elegant thing I've found :)
>> c=categorical
Error using categorical
Abstract classes cannot be instantiated. Class 'categorical' defines abstract methods
and/or properties.
>>
This must be newer feature.
Yes, you are likely using the categorical class from the Statistics Toolbox in a release before R2013b, which is when the new one (along with tables) was introduced into base MATLAB.
This is 12b, indeed. It brought my old machine almost to its knees so have been reluctant to upgrade further figuring performance would degrade even further...
The machine or the ML desktop?
Both...I'm not at all enamored of the idea of changing the UI, either...I use very little of it other than keyboard, anyway.
In addition to what Sean said about preallocating the memory using a backwards loop,
  1. you can also assign any value to the last element and then run the loop in the usual direction, and
  2. it will be a performance gain to also preallocate the categories if you know them in advance.
So
c(1000,1) = categorical('',{'abc' 'def' 'ghi'});
Presto, a 1000x1 categorical array. You could of course also do this
c = categorical(repmat({''},1000,1),{'abc' 'def' 'ghi'});
but there's no real reason to.
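Putting the two points together, a rough sketch of preallocating with known categories and then filling in the usual loop direction might look like this. It uses the explicit repmat form only because it does not depend on whether c already exists; the fill values are made up for illustration.

```matlab
cats = {'abc' 'def' 'ghi'};   % category names known in advance
n = 1000;

% Preallocate: every element starts out <undefined>, but the categories exist.
c = categorical(repmat({''}, n, 1), cats);
allUndefinedAtStart = all(isundefined(c));

% Fill in the usual direction; assigning a category name does not grow
% the array or add categories.
for ii = 1:n
    c(ii) = cats{mod(ii-1, numel(cats)) + 1};
end

counts = countcats(c);        % number of elements in each category
```

Because the category list is fixed before the loop, each assignment only writes a category code into an array that already has its final size.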


More Answers (1)

a) You can't: nominal and ordinal create the categorical array from an existing array, and no other method is provided.
b) double and the various intXX functions are the numeric conversions; use cellstr or char for character data.
See
doc categorical
for details.
If data are character, you may in the end save memory with such manipulations as
x=nominal(x);
if x is a character variable, but you'll need to have the original x in memory first. If you're running into memory problems loading the data to begin with, about all I can think of is to load it piecemeal and convert each piece to categorical, with a (hopefully sizable) savings in memory that then allows you to load some more.
Or, of course, find a way to process the data other than "all in one swell foop".
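The piecemeal idea could be sketched roughly as follows. This assumes the base-MATLAB categorical class (R2013b or later) rather than nominal, and readChunk is a hypothetical stand-in for whatever actually reads one block of records from disk; the chunk sizes and category names are made up.

```matlab
% Hypothetical reader: returns one chunk of text records (6 per chunk here).
readChunk = @(k) repmat({'red'; 'green'; 'blue'}, 2, 1);
nChunks  = 4;
chunkLen = 6;
cats = {'blue' 'green' 'red'};   % assumed known in advance

% Preallocate the compact categorical result once.
c = categorical(repmat({''}, nChunks*chunkLen, 1), cats);

for k = 1:nChunks
    raw  = readChunk(k);                      % only one chunk of text in memory
    rows = (k-1)*chunkLen + (1:chunkLen);     % where this chunk lands in c
    c(rows) = categorical(raw, cats);         % store the compact form
    clear raw                                 % release the text before the next read
end

counts = countcats(c);
```

Only one chunk of raw text is held at a time, so peak memory is roughly the categorical array plus a single chunk.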
