How to replace anomalous 0x0 double elements in a cell array of "character vectors"?

I am creating a fairly large table in which some rows apparently contain missing data. By default, these missing elements are stored as 0x0 double ([]), even though the variable is expected to hold character vectors. This turns the column into a generic cell array instead of the required cell array of character vectors, which makes it impossible to convert the variable to categorical, change its data type, use the unique function, and so on.
Error message when calling the categorical function with this cell array as the first argument:
Error using categorical (line 358)
Could not find unique values in DATA using the UNIQUE function.
Caused by:
Error using matlab.internal.math.uniqueCellstrHelper
Cell array input must be a cell array of character vectors.
Because I need the data in the other variables of the same row, I don't want to remove these rows entirely. How can I replace these anomalous double values with some text (e.g. 'NA')? I tried standardizeMissing but couldn't come up with a satisfactory solution.
Example data:
>> disp(a)
{'A' }
{'B' }
{0×0 double}
{'B' }
{'C' }

Accepted Answer

KSSV on 8 Jun 2021
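% Example data; {0} stands in for the non-char element ([] is handled the same way)
a = [{'A' }
{'B' }
{0}
{'B' }
{'C' }] ;
idx = cellfun(@ischar,a) ;   % true where the element is a character vector
a(~idx) = {'NA'}             % replace every non-char element with 'NA'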
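The same idea carries over to a table variable, which is where the question started. A minimal sketch, assuming a hypothetical table T with variables ID and Grade (these names are illustrative and do not come from the original post):

```matlab
% Hypothetical table; the names T, ID and Grade are illustrative only.
ID = (1:5)';
Grade = {'A'; 'B'; []; 'B'; 'C'};      % third entry is a 0x0 double
T = table(ID, Grade);

% Logical mask of the cells that already hold character vectors.
isText = cellfun(@ischar, T.Grade);

% Overwrite everything else with the placeholder text 'NA'.
T.Grade(~isText) = {'NA'};

% The variable is now a cell array of character vectors, so the
% conversion that failed in the question goes through.
T.Grade = categorical(T.Grade);
disp(categories(T.Grade))
```

The rows themselves are kept, so the data in the other variables of those rows stays available.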

More Answers (2)

Deepu George Kurian on 8 Jun 2021
I just happened to stumble upon a shorter solution.
a = cellstr(char(a{:}));
While this doesn't replace the empty double (0x0 double) with 'NA' as I asked in the question, it does replace it with an empty character vector (0x0 char). That turns the whole variable back into a cell array of character vectors that I can work with.
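If empty text is an acceptable replacement, the empties can also be overwritten directly, without the round trip through a padded char matrix. This is only a sketch on the example data from the question. It relies on two documented behaviors as I understand them: cellstr strips trailing whitespace when converting a char matrix (which the direct assignment avoids), and categorical treats '' as undefined.

```matlab
a = {'A'; 'B'; []; 'B'; 'C'};

% Replace every empty element (of any type) with an empty character vector.
a(cellfun(@isempty, a)) = {''};

tf = iscellstr(a);                 % true once every element is a char vector
c = categorical(a);                % the '' entry becomes <undefined>
nMissing = nnz(isundefined(c));    % number of undefined (missing) entries
```

Here the missing entry stays recognizable as missing after the conversion, instead of becoming a category of its own as 'NA' would.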

Mike on 15 Mar 2023
This must be a bug.
However, this is the general answer:
cstr = cell(1,100); % 1-by-100 cell array; every cell contains [] (0x0 double)
cstr = cellfun(@char,cstr,'UniformOutput',false); % convert the content of each cell to char with cellfun
  1 Comment
Steven Lord on 15 Mar 2023
No, this behavior is not a bug. If you grow a cell array by assigning to an element that is not already populated, MATLAB assigns to that element of the cell array. But it also needs to assign something to the elements between the previous end of the cell array and the newly assigned element. That something is a cell with a placeholder, and the placeholder MATLAB chooses is []. In this example, by assigning to the third element of A, MATLAB needs to fill in the second element as well.
A = {1}
A = 1×1 cell array
{[1]}
A{3} = 3
A = 1×3 cell array
{[1]} {0×0 double} {[3]}
placeholder = A{2}
placeholder = []
You could argue that MATLAB should look at the previous values in the cell array to determine what to use as a placeholder, but if the cell array is large that could take time. Even worse, since cell arrays are not required to be homogeneous, determining the "right" context-aware placeholder could be complicated. As an example, if I ran the following two lines of code, what would you expect the "right" value of d{4} to be? A char array, a numeric array, a function handle (and if so, to what function?), or something else?
d = {'hello', 42, @sin}
d = 1×3 cell array
{'hello'} {[42]} {@sin}
d{5} = figure
d = 1×5 cell array
{'hello'} {[42]} {@sin} {0×0 double} {1×1 Figure}
In practice, MATLAB uses the same placeholder in this case as it does in the first example.
placeholder2 = d{4}
placeholder2 = []
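Given that behavior, one way to avoid the [] placeholders is to preallocate the cell array with empty character vectors before filling it. This is a sketch with an illustrative size and illustrative values, not something taken from the posts above:

```matlab
n = 5;                      % illustrative size
c = repmat({''}, 1, n);     % every cell starts as a 0x0 char, not []

c{2} = 'hello';             % fill only some of the cells
c{4} = 'world';

tf = iscellstr(c);          % still true: the unfilled cells hold ''
u = unique(c);              % works, because c is a cell array of char vectors
```

This only helps while assignments stay within the preallocated size. Assigning past the end still fills any gap with [].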

Release

R2021a
