Is there a faster way of splitting a cell array into numeric array while preserving NaN?

Question

0 votes

data.mat

Greetings,

I am trying to split a set of data into rows and columns of numeric data that will preserve the position of empty data (as NaN or anything similar).

The input data is a cell array with rows of strings. The columns are delimited by a semicolon (';'). The first 8 columns are filled with garbage data, and there are many trailing columns with no data at all. Some rows contain no data whatsoever. The attached data sample is just 4,000 rows long, but my actual datasets have between 50,000 and 300,000 rows.

I have been using the code below but the str2double step is incredibly slow. Can anyone offer an alternative approach that can cut down on the processing time?

% split each row at the ';' separator
data = cellfun(@(x) split(x,';'),data,'UniformOutput',false);
% get rid of preceding garbage data in columns 1 to 8
data = cellfun(@(x) x(9:end),data,'UniformOutput',false);
% convert data into double. This step is incredibly slow
data = cellfun(@str2double,data,'UniformOutput',false);
% example of next operations I wish to perform on this data
data_a = cellfun(@(x) x(1:2:end),data,'UniformOutput',false);
data_b = cellfun(@(x) x(2:2:end),data,'UniformOutput',false);

Thank you in advance for any help.

3 Comments

Alex Wolf on 22 Aug 2019

This offers a fairly good improvement to my code. Thank you for your suggestion.

Adam Danz on 22 Aug 2019

Thank you for the feedback. I've never tried the function myself.


Accepted Answer

TADA on 22 Aug 2019

Edited: TADA on 22 Aug 2019

2 votes

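Try this:

% flag the rows that end with ';' (their last field is empty)
endsWithSemicolon = cellfun(@(s) endsWith(s, ';'), data);
% parse each row as doubles; empty fields become NaN and '*' is treated as whitespace
x = cellfun(@(s) textscan(s, '%f', 'Delimiter', ';', 'EmptyValue', nan(), 'Whitespace', ' *\n\t\r\b'), data);
% drop the garbage data in columns 1 to 8
x = cellfun(@(a) a(9:end), x, 'UniformOutput', false);
% append a NaN to the rows that ended with ';' to account for the empty last field
x(endsWithSemicolon) = cellfun(@(a) [a; nan], x(endsWithSemicolon), 'UniformOutput', false);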
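
If a single numeric matrix is more convenient than a cell array of vectors, the per-row vectors can be padded with NaN into one array. The lines below are only a minimal sketch of that idea, not part of the answer itself. The sample x is a made-up stand-in for the parsed result, M and n are hypothetical names, and the sketch assumes x is non-empty and that each cell holds a numeric vector.

```matlab
% Hypothetical stand-in for x: one numeric column vector per row,
% with differing lengths and one row with no data.
x = {[1.5; NaN; 3; NaN]; [NaN; 2]; zeros(0, 1)};

n = cellfun(@numel, x);      % number of parsed values in each row
M = nan(numel(x), max(n));   % preallocate with NaN so short rows stay padded
for k = 1:numel(x)
    if n(k) > 0
        % copy the values of row k into the first n(k) columns
        M(k, 1:n(k)) = reshape(x{k}, 1, []);
    end
end

% odd and even columns, mirroring the x(1:2:end) / x(2:2:end) step in the question
data_a = M(:, 1:2:end);
data_b = M(:, 2:2:end);
```

With a matrix, the odd/even split becomes plain column indexing instead of another cellfun pass. The cost is that rows with fewer values are padded out to the longest row, so a padded NaN cannot be told apart from an empty field in the source.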

4 Comments

Adam Danz on 23 Aug 2019

Edited: Adam Danz on 23 Aug 2019

teamwork +1

:D

TADA on 23 Aug 2019

:)

Cheers
