How to prevent tabulate function returning frequencies of variables that are not present in the vector?
I have a vector v1 with 26 elements, each either 'Live' or 'Non-live'.
When I call tabulate(v1), it returns:
tabulate(v1)
  Value       Count    Percent
  cat1            0      0.00%
  cat2            0      0.00%
  Live            5     19.23%
  Non-live       21     80.77%
This affects crosstab also. How to prevent tabulate and crosstab functions taking non-existing variables in the vector into account?
2 Comments
Stephen23
on 16 Apr 2025 at 4:37
Edited: Stephen23
on 16 Apr 2025 at 4:48
"How to prevent tabulate function returning frequencies of variables that are not present in the vector?"
Most likely they are present in your vector: as defined categories, even though no element takes those values.
The example is easy to replicate by providing TABULATE with a categorical vector containing those four categories:
v1 = categorical(repelem(["Live","Non-live"],[5,21]),["cat1","cat2","Live","Non-live"])
tabulate(v1)
"How to prevent tabulate and crosstab functions taking non-existing variables in the vector into account?"
Most likely those categories do exist in the data. Therefore the simplest solution is to not create those categories in the vector.
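To check whether a vector carries categories that no element uses, one option is to compare categories with countcats. This is a minimal sketch that assumes v1 is built as in the replication example above; the names allCats, counts, unused and v2 are illustrative only.

```matlab
% Rebuild the example vector: four defined categories, only two in use
v1 = categorical(repelem(["Live","Non-live"],[5,21]),["cat1","cat2","Live","Non-live"]);

allCats = categories(v1);        % every defined category, used or not
counts  = countcats(v1);         % number of elements in each category
unused  = allCats(counts == 0)   % categories with no elements

% Creating the vector without an explicit category list
% defines only the values that actually occur
v2 = categorical(repelem(["Live","Non-live"],[5,21]));
categories(v2)
```

If unused comes back non-empty, the extra rows in the tabulate output come from those defined-but-empty categories.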
Answers (1)
Walter Roberson
on 16 Apr 2025 at 4:45
The problem is that v1 was created as a categorical with additional categories beyond the ones populated.
I do not know of an efficient way to strip off the additional categories. One work-around is
v1cats = categories(v1);                       % names of all defined categories
v1stripped = categorical(v1cats(uint32(v1)));  % rebuild from the names actually in use
tabulate(v1stripped)
You can adjust the uint32 according to the maximum number of categories the object holds. In this particular case, with only four categories, you could use uint8.
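As a possible alternative to the work-around above, categorical arrays also have a removecats function, which drops the categories that no element uses. This is a minimal sketch that assumes v1 is built as in the comment above; the name v1trimmed is illustrative only.

```matlab
% Rebuild the example vector: four defined categories, only two in use
v1 = categorical(repelem(["Live","Non-live"],[5,21]),["cat1","cat2","Live","Non-live"]);

v1trimmed = removecats(v1);   % remove categories that have no elements
categories(v1trimmed)         % list the categories that remain
tabulate(v1trimmed)
```

Passing the trimmed vector to crosstab should likewise avoid the empty rows or columns, since the unused categories are no longer defined.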