How to make str2double recognize comma delimited numbers?

I'm reading in an excel file and I have a column that I need to convert all the values from string to text.
Currently Im using:
data.col1 = str2double(data.col1);
and this is merging the values in my column.
For example, it goes from '2,3' to 23. Is there a way I can do this, but make it so it recognizes it is 2 separate numbers 2 and 3?
Thanks!

 Accepted Answer

It depends on what the actual numbers are. You can likely simplify this to use only '\d*' if they are only integers.
One approach —
data.col1 = {'2,3';'42';'1.23';'4.56,7.89';'21,3.14';'2.718,99'};
data.col1 = regexp(data.col1, '\d*\.\d*|\d*', 'match')
data = struct with fields:
col1: {6x1 cell}
data.col1 = cellfun(@str2double,data.col1, Unif=0)
data = struct with fields:
col1: {6x1 cell}
data.col1
ans = 6x1 cell array
{[ 2 3]} {[ 42]} {[ 1.2300]} {[4.5600 7.8900]} {[ 21 3.1400]} {[ 2.7180 99]}
EDIT — (19 Mar 2025 at 17:46)
Added rows to ‘data.col1’ to illustrate its results. It would be helpful to know more about whatt it actually is.
.

4 Comments

Is there a way to do this without individually putting in all of the numbers? I have about 500 cells that need this to be done
Yes.
Here are two of perhaps several options.
Try this —
data = readtable('ExampleExcel.xlsx', VariableNamingRule='preserve')
data = 9x2 table
Lot Number RC __________ _________ {'A12345'} {'2,3' } {'A12345'} {'4,5' } {'A12345'} {'5,6' } {'A12345'} {'6,7' } {'A12345'} {'7,8' } {'A12345'} {'8,9' } {'A12345'} {'9,10' } {'A12345'} {'10,11'} {'A12345'} {'11,12'}
data.RC = cellfun(@(x)str2double(regexp(x,'\d*','match')), data.RC, Unif=0) % Use ‘regexp’
data = 9x2 table
Lot Number RC __________ _________ {'A12345'} {[ 2 3]} {'A12345'} {[ 4 5]} {'A12345'} {[ 5 6]} {'A12345'} {[ 6 7]} {'A12345'} {[ 7 8]} {'A12345'} {[ 8 9]} {'A12345'} {[ 9 10]} {'A12345'} {[10 11]} {'A12345'} {[11 12]}
data = readtable('ExampleExcel.xlsx', VariableNamingRule='preserve');
data.RC = cellfun(@(x)str2double(strsplit(x,',')), data.RC, Unif=0) % Use ‘strsplit’
data = 9x2 table
Lot Number RC __________ _________ {'A12345'} {[ 2 3]} {'A12345'} {[ 4 5]} {'A12345'} {[ 5 6]} {'A12345'} {[ 6 7]} {'A12345'} {[ 7 8]} {'A12345'} {[ 8 9]} {'A12345'} {[ 9 10]} {'A12345'} {[10 11]} {'A12345'} {[11 12]}
data.RCmtx = cell2mat(data.RC)
data = 9x3 table
Lot Number RC RCmtx __________ _________ ________ {'A12345'} {[ 2 3]} 2 3 {'A12345'} {[ 4 5]} 4 5 {'A12345'} {[ 5 6]} 5 6 {'A12345'} {[ 6 7]} 6 7 {'A12345'} {[ 7 8]} 7 8 {'A12345'} {[ 8 9]} 8 9 {'A12345'} {[ 9 10]} 9 10 {'A12345'} {[10 11]} 10 11 {'A12345'} {[11 12]} 11 12
These both work.
EDIT — Added ‘data.RCmtx’.
.
col1 = {'123', '123.', '.123', '.', '', 'abc'}
col1 = 1x6 cell array
{'123'} {'123.'} {'.123'} {'.'} {0x0 char} {'abc'}
regexprep(col1, '(\d*\.\d*|\d*)', '|$1|')
ans = 1x6 cell array
{'|123|'} {'|123.|'} {'|.123|'} {'|.|'} {0x0 char} {'abc'}
We can see from this test that '\d*\.\d*|\d*' matches a bare . by itself, but does not (without further options) match the empty string.
To get it right you need to use a more complicated pattern, such as
regexprep(col1, '(\d+(\.\d*)?|\.\d+)', '|$1|')
ans = 1x6 cell array
{'|123|'} {'|123.|'} {'|.123|'} {'.'} {0x0 char} {'abc'}
@Walter Roberson — I’d not considered the isolated decimal point, although it certainly could be a possibility. Thank you yet again.
As it turns out, apparently they’re all integers, so here strsplit may be all that’s necessary.

Sign in to comment.

More Answers (2)

Using SSCANF will likely be more efficient than relying on regular expressions:
T = readtable('ExampleExcel.xlsx')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T = 9x2 table
LotNumber RC __________ _________ {'A12345'} {'2,3' } {'A12345'} {'4,5' } {'A12345'} {'5,6' } {'A12345'} {'6,7' } {'A12345'} {'7,8' } {'A12345'} {'8,9' } {'A12345'} {'9,10' } {'A12345'} {'10,11'} {'A12345'} {'11,12'}
F = @(t)sscanf(t,'%f,',[1,Inf]);
T.RC = cellfun(F, T.RC, 'uni',0)
T = 9x2 table
LotNumber RC __________ _________ {'A12345'} {[ 2 3]} {'A12345'} {[ 4 5]} {'A12345'} {[ 5 6]} {'A12345'} {[ 6 7]} {'A12345'} {[ 7 8]} {'A12345'} {[ 8 9]} {'A12345'} {[ 9 10]} {'A12345'} {[10 11]} {'A12345'} {[11 12]}
Try using str2num instead
str2num('2,3',Evaluation = "restricted")
ans = 1×2
2 3
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
or possibly this
s = '2,3';
str2double(split(s, ','))
ans = 2×1
2 3
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>

1 Comment

my problem is I have to do this with a whole column worth of data! 2,3 was just one example, but I have about 500 more of those to do lol

Sign in to comment.

Asked:

KD
on 19 Mar 2025

Answered:

on 19 Mar 2025

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!