readtable is ignoring import options to get variable names
Am I being stupid or is this function not logical?
I want to import a csv file. There are 3 header lines. The actual variable names are on line 3. The units are on line 2. Line 1 is to be ignored.
If I just call opts = detectImportOptions(file) and then set opts.VariableNamesLine = 3 and opts.VariableUnitsLine = 2, it picks up the latter but completely ignores the former, and keeps using the variable names it originally picked up from line 1.
If I call detectImportOptions(file,'NumHeaderLines',1), it then picks up the units line as the names.
If I do it again and tell it to skip 2 lines, it picks the right names. I can then set opts.VariableUnitsLine to 2 and it does go back and pick the units correctly.
So I do get what I want in the end. But the function doesn't seem to work as expected, i.e. create the options and then modify the line options. It seems that whatever it first picks up as the names gets set in stone, and you can't do anything about it (except the somewhat hacky way I just worked out).
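The workaround described above can be sketched as follows. This is only a minimal sketch: the file name and its contents are made up to mimic the three-header layout from the question, and the behavior of detectImportOptions has changed between releases, so the result may differ on your version.

```matlab
% Build a small stand-in file with the same layout (contents are made up).
file = fullfile(tempdir, 'three_header_demo.csv');
fid = fopen(file, 'w');
fprintf(fid, 'Logger export,ignore,this line\n');   % line 1: to be ignored
fprintf(fid, 's,V,degC\n');                         % line 2: units
fprintf(fid, 'Time,Voltage,Temperature\n');         % line 3: variable names
fprintf(fid, '0.0,1.20,21.5\n');
fprintf(fid, '0.1,1.25,21.6\n');
fprintf(fid, '0.2,1.31,21.8\n');
fclose(fid);

% Skip two header lines at detection time so line 3 is taken as the names...
opts = detectImportOptions(file, 'NumHeaderLines', 2);
% ...then point the units line back at line 2.
opts.VariableUnitsLine = 2;

T = readtable(file, opts);
disp(T.Properties.VariableNames)
disp(T.Properties.VariableUnits)
```

Checking T.Properties.VariableNames right after the import is a quick way to see whether your release picked up line 3 or fell back to something else.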
5 Comments
jonas
on 16 Aug 2018
attach the file
Alex Mason
on 16 Aug 2018
Adam Danz
on 16 Aug 2018
You could anonymize the data or create a working example that has the same structure but fake data that produces the same behavior as your current file. Providing the relevant code would also be helpful.
jonas
on 16 Aug 2018
I have not been able to solve it using readtable, but I was able to reproduce the problem easily using the attached textfile. So if anyone else wants to give it a try...
Accepted Answer
More Answers (2)
Jacob Hootman
on 8 Oct 2018
I had the same issue. I stepped through it in the debugger several times, and I believe this is a bug. Here is what I found:
Open TextImportOptions.m and go to line 211, it will read:
% Read Names
if opts.VariableNamesLine > 0 && rvn
    names = readVariableNames(parser);
else
    names = opts.SelectedVariableNames;
end
% Read Metadata
units = readVariableUnits(parser);
descr = readVariableDescriptions(parser);
The problem is that 'rvn' gets its value from a persistent variable, which means unless that parameter is specified on the first function call, it will always be false.
Change the && in the if statement to 'OR' logic, i.e. || (read the 'NOTES' below before doing so). The code will then work as intended. This is what it should look like:
% Read Names
if opts.VariableNamesLine > 0 || rvn
    names = readVariableNames(parser);
else
    names = opts.SelectedVariableNames;
end
% Read Metadata
units = readVariableUnits(parser);
descr = readVariableDescriptions(parser);
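To make the difference between the two conditions concrete, here is a tiny stand-alone check. It does not call any MathWorks internals. The variable names and values are stand-ins chosen for illustration, assuming the names line was set to 3 and rvn is stuck at false as described above.

```matlab
% Stand-in values (assumptions): names line set to 3, rvn stuck at false.
variableNamesLine = 3;
rvn = false;

% Original condition: false, so the code falls back to SelectedVariableNames.
readsNamesOriginal = variableNamesLine > 0 && rvn;
% Patched condition: true, so readVariableNames would be called.
readsNamesPatched = variableNamesLine > 0 || rvn;

fprintf('original (&&): %d\n', readsNamesOriginal);
fprintf('patched  (||): %d\n', readsNamesPatched);
```

Note that with || the names are also read whenever rvn is true, even if VariableNamesLine is 0, so the patch changes more than just this one case.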
Also, I'm not sure why the programmer decided to use an 'if else' statement to decide how to get the variable names, yet only calls a function to get the units and descriptions.
NOTES: (1) Making this change requires administrative access, (2) the m file must be changed with a non-MATLAB editor (e.g. Notepad++), (3) this change will only affect your local machine (i.e. the code will not behave the same on other computers that do not have this change installed), (4) any updates that MATLAB installs may revert this change.
7 Comments
Adam Danz
on 9 Oct 2018
What is TextImportOptions.m? That's not a matlab file. Even detectImportOptions.m doesn't have the variable name opts.VariableNamesLine so I'm not sure what file you're working with.
Jacob Hootman
on 9 Oct 2018
Hmm, that's interesting. What version of matlab are you running? I'm on 9.3.0.713579 (R2017b).
TextImportOptions.m should exist in this folder:
C:\Program Files\MATLAB\R2017b\toolbox\shared\io\+matlab\+io\+text
The parent folders may be different depending on your setup.
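If you want to check whether that file exists in your own installation, something like the following should do. It is a sketch that assumes the folder layout quoted above (relative to matlabroot), which, as far as this thread shows, only applies to R2017b.

```matlab
% Folder layout taken from the path above; it may not exist in other releases.
pkgDir = fullfile(matlabroot, 'toolbox', 'shared', 'io', '+matlab', '+io', '+text');
classFile = fullfile(pkgDir, 'TextImportOptions.m');

fprintf('MATLAB release: %s\n', version('-release'));
if exist(classFile, 'file') == 2
    fprintf('Found: %s\n', classFile);
else
    fprintf('Not found in this release: %s\n', classFile);
end
```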
jonas
on 9 Oct 2018
readtable seems to have had quite a few updates over the last couple of releases, or a major one at some point recently. I always run into trouble when helping my colleagues with imports, as they are missing several key features.
Guillaume
on 9 Oct 2018
detectImportOptions has been improved with every version since it's been introduced so I wouldn't expect the code to be similar from version to version. There's no TextImportOptions.m in R2018b, there's a getTextOpts.m instead which delegates the heavy lift to a built-in function (hence you can't see the actual detection code).
I see now, TextImportOptions is stored in a package directory which isn't allowed in the matlab path which is why it doesn't appear when I search for it using which() or similar methods (even in 2017b). It's a classdef m file. When you google " matlab TextImportOptions " there is nearly no information about this file.
Anyway, how did you end up in this classdef file? What function called it and how did you end up stepping through this file during debugging?
@Guillaume: That explains it. The fact that there are several different versions is unfortunate as it becomes difficult to write complex importopts for beginners on this forum. Many times people just reply with an error message, and therefore I usually opt for something more reliable such as textscan despite readtable usually being the more practical choice for semi-complex imports.
Sorry for interrupting your discussion, I will be on my way now :)
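For reference, a textscan-based import of a file with three header lines could look roughly like this. It is a sketch under assumptions: the file name and contents are made up, every column is assumed to be numeric, and the names on line 3 are assumed to be valid MATLAB identifiers.

```matlab
% Made-up stand-in file: line 1 ignored, line 2 units, line 3 names.
file = fullfile(tempdir, 'textscan_demo.csv');
fid = fopen(file, 'w');
fprintf(fid, 'Logger export,ignore,this line\n');
fprintf(fid, 's,V,degC\n');
fprintf(fid, 'Time,Voltage,Temperature\n');
fprintf(fid, '0.0,1.20,21.5\n');
fprintf(fid, '0.1,1.25,21.6\n');
fclose(fid);

fid = fopen(file, 'r');
fgetl(fid);                                   % line 1: ignored
units = strsplit(fgetl(fid), ',');            % line 2: units
names = strsplit(fgetl(fid), ',');            % line 3: variable names
data = textscan(fid, '%f%f%f', 'Delimiter', ',');   % remaining lines: numeric data
fclose(fid);

% Assemble a table by hand from the parsed columns.
T = table(data{:}, 'VariableNames', names);
T.Properties.VariableUnits = units;
disp(T)
```

The format string is hard-coded for three numeric columns here, so it would need adjusting for a different file; that is the price for not depending on the detection logic.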
Jacob Hootman
on 28 Oct 2018
@Adam Danz I just kept stepping into every function that resulted in an error. I called the readtable function with arguments for both the fileName and the OPTS.
Juan Nicolás Ibáñez
on 23 Sep 2024
Edited: Juan Nicolás Ibáñez
on 23 Sep 2024
The help for the function detectImportOptions() says
% "ReadVariableNames" - Whether or not to expect variable names in
% the file. Defaults to true.
However, for one large database, it did not get the variable names until I explicitly set that option to true on the command line, like this:
opts = detectImportOptions(path_filename,'NumHeaderLines',0,'ReadVariableNames',1)
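A self-contained version of that call, for trying it out, might look like this. The file name and contents are invented for the example; only the detectImportOptions arguments come from the line above.

```matlab
% Made-up two-column file with the variable names on the first line.
path_filename = fullfile(tempdir, 'names_demo.csv');
fid = fopen(path_filename, 'w');
fprintf(fid, 'id,score\n');
fprintf(fid, '1,0.5\n');
fprintf(fid, '2,0.7\n');
fclose(fid);

% State explicitly that the file contains variable names instead of relying on the default.
opts = detectImportOptions(path_filename, 'NumHeaderLines', 0, 'ReadVariableNames', true);
T = readtable(path_filename, opts);
disp(T.Properties.VariableNames)
```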