Importing large data set in multiple formats including scientific notation

I am trying to import a large data set in CSV format. The code needs to read ordinary decimal numbers, scientific notation, and date and time fields. See the file "test.csv" below:
"Time","RG","RP"
"sec","mmHg","mmHg","Date","Time"
.000, 1.64795E+01, 1.22070E+00,04-19-18,22:13:00
.010, 1.72119E+01, 1.22070E+00,04-19-18,22:13:00
.020, 1.79443E+01, 1.22070E+00,04-19-18,22:13:00
.030, 1.87988E+01, 1.22070E+00,04-19-18,22:13:00
.000, 1.64795E+01, 1.22070E+00,04-19-18,22:13:00
.010, 1.72119E+01, 1.22070E+00,04-19-18,22:13:00
I've tried using csvread after stripping the 2 header rows:
m = csvread("test.csv");
and I get an error:
Error using dlmread (line 147)
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 2, field
number 3) ==> :13:00\n
Error in csvread (line 48)
m=dlmread(filename, ',', r, c);
I've tried using textscan (from another posting):
fid = fopen('test.csv','r');
m = textscan(fid,'%f%f','HeaderLines',2);
fclose(fid);
and I got a bogus 1×2 cell array:
m =
1×2 cell array
{[0]} {0×1 double}
Any solution would need to handle 3-4 million rows of data. Thank you!

Accepted Answer

Star Strider on 21 Apr 2018

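You need to change the format string. Each data row has five comma-separated fields, but '%f%f' describes only two of them, and textscan is not told that the delimiter is a comma.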

I would do something like this:

m = textscan(fid,'%f%f%f%s%s','HeaderLines',2, 'Delimiter',',');

If you only want to read the first 2 fields and ignore the rest, insert '*' after the '%' of each format specifier you want to skip:

m = textscan(fid,'%f%f%*f%*s%*s','HeaderLines',2, 'Delimiter',',');
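Here is a minimal end-to-end sketch of both calls, assuming the file layout shown in the question. The file name sample_test.csv and its three data rows are made up for illustration so the snippet runs on its own. Note that fid has to come from fopen before textscan is called, and should be closed afterwards.

```matlab
% Write a small sample file with the same layout as the question's test.csv
% (file name and row count are illustrative assumptions).
fname = 'sample_test.csv';
fid = fopen(fname, 'w');
fprintf(fid, '"Time","RG","RP"\n');
fprintf(fid, '"sec","mmHg","mmHg","Date","Time"\n');
fprintf(fid, '.000, 1.64795E+01, 1.22070E+00,04-19-18,22:13:00\n');
fprintf(fid, '.010, 1.72119E+01, 1.22070E+00,04-19-18,22:13:00\n');
fprintf(fid, '.020, 1.79443E+01, 1.22070E+00,04-19-18,22:13:00\n');
fclose(fid);

% Read all five columns: three numeric, date and time as text.
fid = fopen(fname, 'r');
all5 = textscan(fid, '%f%f%f%s%s', 'HeaderLines', 2, 'Delimiter', ',');
fclose(fid);

% Read only the first two columns; '*' skips the remaining fields.
fid = fopen(fname, 'r');
first2 = textscan(fid, '%f%f%*f%*s%*s', 'HeaderLines', 2, 'Delimiter', ',');
fclose(fid);

t  = all5{1};   % time in seconds (column vector of doubles)
rg = all5{2};   % RG in mmHg, parsed from scientific notation
```

Each cell of the result holds one whole column, so the numeric columns come back as ordinary double vectors and the date and time columns as cell arrays of text.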

There are other options for reading the dates and times. See the formatSpec section of the textscan documentation for details.
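One such option is to read both columns as text and convert them afterwards with datetime. This sketch assumes the date column is always MM-dd-yy and the time column HH:mm:ss, which is inferred from the sample rows. The two cell arrays below are stand-ins for the fourth and fifth outputs of a textscan call that uses '%s%s' for those columns.

```matlab
% Stand-ins for the date and time columns returned by textscan with '%s%s'
% (values copied from the sample rows in the question).
dateTxt = {'04-19-18'; '04-19-18'};
timeTxt = {'22:13:00'; '22:13:00'};

% Join each date with its time and parse everything in one vectorized call.
% The InputFormat is an assumption based on the sample data.
stamp = datetime(strcat(dateTxt, {' '}, timeTxt), ...
                 'InputFormat', 'MM-dd-yy HH:mm:ss');
```

Converting the whole column in one call should matter with millions of rows, since it avoids a per-row loop. textscan can also parse the date column directly with a '%{MM-dd-yy}D' specifier in place of '%s', if your release supports it.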
