How can I read CSV file data correctly? I tried multiple ways
Hi 
I have a CSV file and I am trying to import it into MATLAB. An example of the file content is the following (this is one row):
689d-40bf-9c61-551c0c1a69bf,true,"timestamp","  2","  8","  2",25.21,17.536593101926,39,62
I tried using 
readtable()
but the data is not split into columns at the commas.
Then, I tried 
csvread()
and I get the error: 
Error using dlmread (line 147)
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 1, field number 1) ==>
Also, I tried 
textscan()
and it did not do anything (no error, but no content extracted either).
Lastly, I tried to import the data manually using the import interface, but I got the same problem as with readtable().
How can I read the data correctly and put it in a matrix or table?
Thank you
Accepted Answer
  Stephen23 on 2 Nov 2020 (Edited: Stephen23 on 2 Nov 2020)
textscan has no problems importing the file data simply and efficiently (sample file is attached):
opt = {'Delimiter',',','CollectOutput',true};
fmt = '%s%s%q%q%q%q%f%f%f%f';
[fid,msg] = fopen('temp0.txt','rt');
assert(fid>=3,msg)
out = textscan(fid,fmt,opt{:});
fclose(fid);
Giving:
>> out{1} % character data
ans = 
    [1x27 char]    'true'    'timestamp'    '  2'    '  8'    '  2'
    [1x27 char]    'true'    'timestamp'    '  3'    '  9'    '  1'
    [1x27 char]    'true'    'timestamp'    '  4'    ' 10'    '  0'
    [1x27 char]    'true'    'timestamp'    '  5'    ' 11'    ' -1'
>> out{2} % numeric data
ans =
   25.2100   17.5366   39.0000   62.0000
   25.2300   17.5366   40.0000   63.0000
   25.2400   17.5366   41.0000   64.0000
   25.2500   17.5366   42.0000   65.0000
>>
Because readtable also supports a format specifier, I see no reason why it shouldn't work as well. I might try later.
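For reference, here is a minimal sketch of what the readtable variant might look like. It was not tested against the original file: it assumes the file has no header row, writes the sample row from the question to a temporary file (the file name is made up for illustration), and reuses the format specifier from the textscan call above.

```matlab
% Write the sample row from the question to a temporary file (hypothetical name).
fname = [tempname '.csv'];
fid = fopen(fname,'wt');
assert(fid>=3,'Could not create the sample file.')
fprintf(fid,'%s\n','689d-40bf-9c61-551c0c1a69bf,true,"timestamp","  2","  8","  2",25.21,17.536593101926,39,62');
fclose(fid);

% Same format specifier as for textscan: %q reads a field and strips its double quotes.
fmt = '%s%s%q%q%q%q%f%f%f%f';
T = readtable(fname,'FileType','text','Delimiter',',', ...
    'Format',fmt,'ReadVariableNames',false);

% The quoted fields are imported as text; convert them if numbers are needed.
T.Var4 = str2double(T.Var4);

delete(fname)   % remove the temporary file
```

Without a header row the variables get the default names Var1 to Var10, so rename them afterwards if you want something more descriptive.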
2 Comments
  Stephen23 on 3 Nov 2020 (Edited: Stephen23 on 3 Nov 2020)
"But I would like to ask, why check assert(fid>=3,msg)?"
To print an informative error message if the file could not be opened.
"I thought fid contains the data from the csv file"
No, it does not.
The command fopen opens a file and returns a kind of handle to the open file. That handle is known as a "file identifier" (this is explained in the fopen documentation). Any functions and operators which need to operate on that file (e.g. reading data, writing data, moving the current position in the file, etc.) are then given that file ID so that they can perform their operations on the open file.
In this case textscan takes the file ID of an open file and imports the file data using the options that we defined.
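A small sketch may make the role of the file ID clearer. The file names here are made up for illustration. fopen returns -1 together with a message when it cannot open the file, and an integer of 3 or more when it succeeds (0, 1 and 2 are reserved for standard input, output and error), which is why the check is fid>=3.

```matlab
% Opening a file that does not exist: fopen returns -1 and an error message.
[badfid,msg] = fopen('this_file_does_not_exist.txt','rt');   % hypothetical name
disp(badfid)
disp(msg)      % the exact wording depends on the operating system

% Create a small temporary file so that there is something to read.
fname = [tempname '.txt'];
fid = fopen(fname,'wt');
assert(fid>=3,'Could not create the temporary file.')
fprintf(fid,'hello\n');
fclose(fid);

% Open it for reading: fid is only a number identifying the open file.
[fid,msg] = fopen(fname,'rt');
assert(fid>=3,msg)
txt = fgetl(fid);    % the reading function uses the ID to fetch the data
fclose(fid);

delete(fname)        % remove the temporary file
```

The data only ends up in txt after fgetl is called with the ID; fid itself never holds any file content.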
More Answers (1)
  Mathieu NOE on 2 Nov 2020
Hello,
It seems MATLAB has an issue with the format of your data (especially with ).
I could not make it work with readtable, whatever options I tried.
I ended up writing a small workaround function using basic operations.
It seems to work, at least on my MATLAB.
Input data: 4 lines, each slightly different, saved as a CSV file:
689d-40bf-9c61-551c0c1a69bf,true,"timestamp","  2","  8","  2",25.21,17.536593101926,39,62
679d-40bf-9c61-551c0c1a69bf,true,"timestamp","  2","  8","  2",25.21,17.536593101926,39,63
669d-40bf-9c61-551c0c1a69bf,true,"timestamp","  2","  8","  2",25.21,17.536593101926,39,64
659d-40bf-9c61-551c0c1a69bf,true,"timestamp","  2","  8","  2",25.21,17.536593101926,39,65
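Function code as follows:
function output_matrix = retrieve_csv(Filename)
% Read a comma-separated file into a cell array of character vectors.
% Note: every comma is treated as a separator, so quoted fields must not contain commas.
[fid,msg] = fopen(Filename,'rt');
assert(fid>=3,msg)   % stop with an informative message if the file cannot be opened
output_matrix = {};  % stays empty if the file contains no lines
tline = fgetl(fid);
k = 0;
while ischar(tline)
    k = k+1;    % line index
    sep = strfind(tline,',');   % strfind replaces findstr, which is not recommended
    ind = [0;sep(:);length(tline)+1];
    for ci = 1:length(ind)-1
        tline_extract = tline(ind(ci)+1:ind(ci+1)-1);
        % remove undesired characters (")
        tline_extract = strrep(tline_extract,'"','');
        output_matrix{k,ci} = tline_extract;
    end
    tline = fgetl(fid);
end
fclose(fid);
Output: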
output_matrix = 
  Columns 1 through 7
    [1x27 char]    'true'    'timestamp'    '  2'    '  8'    '  2'    '25.21'
    [1x27 char]    'true'    'timestamp'    '  2'    '  8'    '  2'    '25.21'
    [1x27 char]    'true'    'timestamp'    '  2'    '  8'    '  2'    '25.21'
    [1x27 char]    'true'    'timestamp'    '  2'    '  8'    '  2'    '25.21'
  Columns 8 through 10
    '17.536593101926'    '39'    '62'
    '17.536593101926'    '39'    '63'
    '17.536593101926'    '39'    '64'
    '17.536593101926'    '39'    '65'
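Since every field in this output is still a character vector, a further conversion step is needed to get a numeric matrix or a table. A possible sketch, assuming the column layout of the sample rows (the variable names id, flag and values are made up for illustration):

```matlab
% Two rows shaped like the output above (all fields are character vectors).
output_matrix = { ...
    '689d-40bf-9c61-551c0c1a69bf','true','timestamp','  2','  8','  2','25.21','17.536593101926','39','62'; ...
    '679d-40bf-9c61-551c0c1a69bf','true','timestamp','  2','  8','  2','25.21','17.536593101926','39','63'};

% Columns 4 to 10 hold numbers stored as text: str2double converts a whole cell array at once.
num = str2double(output_matrix(:,4:10));

% Column 2 holds 'true'/'false' text: compare it to get a logical vector.
flag = strcmp(output_matrix(:,2),'true');

% Optionally collect everything in a table (variable names are assumptions).
T = table(output_matrix(:,1),flag,num,'VariableNames',{'id','flag','values'});
```

str2double returns NaN for any field that cannot be interpreted as a number, so checking any(isnan(num(:))) afterwards is a cheap way to spot malformed rows.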