readtable can't get variable names from csv if different number of columns

I have datafiles like this:
header,...,header
data,...,data,
data,...,data,
...
Notice the extra trailing comma on each line of data. If I add a trailing comma to the header line by hand, then run readtable(filename), it all works as expected, with all the data under the header names and an extra column of blanks under the name "ExtraVar1".
I can remove that extra column using opts.ExtraColumnsRule='ignore';.
What I haven't figured out is how to use opts.ExtraColumnsRule (or anything else) to read the header lines and use them without manually adding in the trailing comma.
If I try reading the file as is, it gives me the baffling behaviour of reading all the data under the names "Var1, Var2, ..." with an "ExtraVar1" at the end. I can remove the blank "ExtraVar1" column again using opts.ExtraColumnsRule='ignore';.
So, it recognizes that there's no header value for the last column, and has logic to call it ExtraVar, but for some reason this breaks the rest of the header-to-variable-name conversion.
It also does this all silently, no warning or error or indication of why it has not named the variables.
Is this all intended behaviour? What have I missed? Is the only way to read these in correctly for me to preprocess the files to fix the trailing commas?
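On the preprocessing question: one possible fallback that leaves the files untouched is to read the header line separately and build the import options by hand. The following is only a sketch under assumptions. The sample file contents and the names time, speed and temp are made up for illustration, and every column is assumed to be numeric.

```matlab
% Hypothetical sample file: three header names, data rows with a trailing comma.
filename = [tempname '.csv'];
fid = fopen(filename, 'w');
fprintf(fid, 'time,speed,temp\n1,2,3,\n4,5,6,\n');
fclose(fid);

% Read only the header line and split it into variable names.
fid = fopen(filename, 'r');
headerLine = fgetl(fid);
fclose(fid);
names = matlab.lang.makeValidName(strtrim(strsplit(headerLine, ',')));

% Build the import options by hand: one variable per header name.
opts = delimitedTextImportOptions('NumVariables', numel(names));
opts.Delimiter = ',';
opts.VariableNames = names;
opts = setvartype(opts, 'double');   % assumption: all columns are numeric
opts.DataLines = [2 Inf];            % data starts on the line after the header
opts.ExtraColumnsRule = 'ignore';    % drop the empty column from the trailing comma

T = readtable(filename, opts);
delete(filename);
```

Because the names are taken from the header line directly, this does not rely on automatic header detection at all, at the cost of having to state the column types yourself.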

Accepted Answer

Jeremy Hughes on 7 Mar 2018
Try giving detectImportOptions a hint about the number of header lines:
opts = detectImportOptions(filename,'NumHeaderLines',0);
T = readtable(filename,opts)
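
As a minimal sketch of how this hint might be combined with the ExtraColumnsRule setting from the question, the snippet below writes a small sample file and reads it back. The file contents and the names a, b and c are made up for illustration. Which names are detected may depend on the MATLAB release, so it is worth inspecting T.Properties.VariableNames rather than assuming the result.

```matlab
% Hypothetical sample file: three header names, data rows with a trailing comma.
filename = [tempname '.csv'];
fid = fopen(filename, 'w');
fprintf(fid, 'a,b,c\n1,2,3,\n4,5,6,\n');
fclose(fid);

% Hint from the answer: tell the detector there are no header lines to skip,
% so the first line is a candidate for the variable names.
opts = detectImportOptions(filename, 'NumHeaderLines', 0);

% Setting from the question: ignore columns beyond those listed in opts.
opts.ExtraColumnsRule = 'ignore';

T = readtable(filename, opts);
disp(T.Properties.VariableNames)   % inspect which names were picked up
delete(filename);
```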
