readtable not understood behaviour
Hi all,
I have the following settings.ini file (it's a basic text file):
;delay of calibration start (in seconds) after scanning the DUT serial number
intCalibrationStartDelay = 5
;directory path at which the log files are saved
strSaveFilePath = ''
;COM port of the power meter
strPowerMeterCOMPort = COM4
I would like to read it as a table, with each line of it becoming an element in my table, and for this, I use the following:
SettingsData = readtable('settings.ini',"FileType","text","Delimiter",'\r\n',"ReadVariableNames",false,"ReadRowNames",false,"TextType",'string');
For some reason, however, the first two lines of the file do not get imported. Any idea why? The result of the instruction above is always:
SettingsData =
6×1 table
Var1
__________________________________________________
""
";directory path at which the log files are saved"
"strSaveFilePath→→→= ''"
""
";COM port of the power meter"
"strPowerMeterCOMPort→→= COM4"
Best regards,
Cristian
Accepted Answer
dpb
on 6 Sep 2019
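You could make a text import options object and use it with readtable, but since the file is just plain text, I'd simply scan it and then convert...
fid=fopen('yourfile.ini');              % open the file for reading
t=textscan(fid,'%s','delimiter','\n');  % read each line as one cell entry
fid=fclose(fid);                        % fclose returns 0 on success
t=table(t{:});                          % wrap the cell column in a one-variable table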
readtable is designed specifically for tabular data files and works really, really hard to import everything to fit the model of a header line followed by tabular (mostly numeric) data. You have to fight against that built-in prejudice to overcome the internal evaluations it makes when trying to work out the file structure. textscan just reads in what it sees.
My $0.02 for personal preference for the particular case.
2 Comments
Guillaume
on 6 Sep 2019
You have to fight against that builtin prejudice
I wouldn't call it prejudice. The file here is simply not tabular data (in the meaning used by readtable). Rather, I would say that you can pervert readtable to successfully decode the file (indeed with text import options).
I agree that textscan would be more efficient here.
However, note that an ini file may have sections (not shown in the example given), which may complicate parsing. An ini file is made of key-value pairs grouped into optional sections. Personally, I'd write a generic parser for ini files and store the result in a containers.Map.
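A minimal sketch of what such a parser might look like, written as a plain script. The sample text, the section names, and the "Section.key" naming scheme are assumptions made for illustration; real ini dialects differ in comment characters, quoting, and duplicate-key handling.

```matlab
% Hypothetical sample text; in practice use: txt = fileread('settings.ini');
txt = sprintf(['[General]\n', ...
               ';delay of calibration start (in seconds)\n', ...
               'intCalibrationStartDelay = 5\n', ...
               '[PowerMeter]\n', ...
               'strPowerMeterCOMPort = COM4\n']);

settings = containers.Map('KeyType', 'char', 'ValueType', 'char');
section  = '';                          % name of the current section, if any
lines    = strtrim(splitlines(txt));    % cell array of trimmed char vectors

for k = 1:numel(lines)
    ln = lines{k};
    if isempty(ln) || ln(1) == ';' || ln(1) == '#'
        continue                        % blank line or comment
    end
    if ln(1) == '[' && ln(end) == ']'
        section = ln(2:end-1);          % section header such as [General]
        continue
    end
    idx = find(ln == '=', 1);           % split at the first '=' only
    if isempty(idx)
        continue                        % not a key-value line; ignored in this sketch
    end
    key   = strtrim(ln(1:idx-1));
    value = strtrim(ln(idx+1:end));
    if ~isempty(section)
        key = [section '.' key];        % assumed convention: prefix key with section
    end
    settings(key) = value;              % values are kept as text
end
```

Values stay as char vectors here, so numeric settings would be converted by the caller, for example with str2double(settings('General.intCalibrationStartDelay')).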
dpb
on 6 Sep 2019
"file here is simply not tabular data (in the meaning used by readtable)."
That's precisely the prejudice with which readtable was designed that I was speaking of... :)
Definition of prejudice
...
2a : preconceived judgment or opinion
readtable has a preconceived idea of what constitutes a table that comes from the design ideas implemented in the table class.
Granted, the word generally has a negative connotation; I chose it here not out of hostility or the like but simply as editorial license...
More Answers (2)
Jeremy Hughes
on 6 Sep 2019
A simpler way to do this would be:
lines = splitlines(fileread(filename))
1 Comment
Walter Roberson
on 6 Sep 2019
Or if, like me, you have...older... habits,
lines = regexp( fileread(filename), '\r?\n', 'split');
Both splitlines and regexp 'split' used this way have an oddity of often returning an empty character vector or string element at the end. This occurs for files whose last line ends in newline, in which case regexp and splitlines both create an entry for the emptiness between the newline and the end of file. This is not wrong, but it can be unexpected.
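The trailing empty element can be seen with a short in-memory example. The sample text is made up, and dropping the last element is just one possible way of handling it, not something either function does for you.

```matlab
txt = sprintf('a = 1\nb = 2\n');          % text whose last line ends in a newline

lines = splitlines(txt);                  % 3 elements: 'a = 1', 'b = 2', and an empty one
parts = regexp(txt, '\r?\n', 'split');    % same content, returned as a row cell array

% Drop only a trailing empty element, keeping any blank lines inside the file
if ~isempty(lines) && isempty(lines{end})
    lines(end) = [];
end
```

Removing all empty elements instead would also discard blank lines in the middle of the file, which may or may not be wanted.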
For text files, none of the major operating systems strongly define whether newline is a line separator or a line terminator. If newline is a line terminator then the newline at the end of the last line would terminate that line and then there would be no following lines so you would not expect any empty string at that point. However, if newline is a line separator then the newline at the end of the last line is separating that line and the empty line up to end of file and it makes (some) sense for there to be an empty string at that point.
The philosophic difference shows up in the treatment of a last line that has no newline and just suddenly ends at end of file: is that a malformed line (if newline is the line terminator then it "should" have appeared there to terminate the line before end of file), or is it a valid line (if newline is a line separator then it is not needed to separate the end of the current line from end of file.)
Historically, Windows and Unix have both weakly defined newline as being a line separator -- that is, that it is considered entirely valid for a line to just suddenly end with end of file following. I say "weakly defined" because so many "standard" programs have gotten it wrong over the years, and even the fundamental C library routine fgets() implicitly requires that you test every time whether you actually received the trailing line terminators or not, which hardly anyone does...
Cristian Berceanu
on 9 Sep 2019
2 Comments
Guillaume
on 9 Sep 2019
Edited: Guillaume
on 9 Sep 2019
It gets imported fine for me (R2019a). However, readtable has all sorts of heuristics to try to determine what is actually data and what is header, so I wouldn't be surprised if it failed to properly import a file that is inherently not tabular.
Possibly, there is something slightly special about your text file that was lost when you pasted the content in your question. It's always best to attach the file to the question so that text encoding, line endings, etc. are not lost.
dpb
on 9 Sep 2019
"am still a bit puzzled as to why the readtable function does not operate as expected "
As Guillaume and I both noted, readtable is designed specifically for tabular data and does its darndest to force whatever it sees into what it thinks a table should look like.
When you give it something that isn't really tabular, "all bets are off!" as to what it may try to make of it.
Here the empty rows at the beginning were such that those first lines were interpreted as non-data lines.
Run detectImportOptions on the file and you can inspect how your particular file was interpreted by the internal heuristics. If you really, really feel compelled to use readtable directly, you can edit the returned import options object to specify the starting row explicitly, and then pass the modified object to readtable to force the desired interpretation.
I think it's still simpler to just scan and then convert, but if you're doing this on a lot of files you can save the options object for reuse.
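A rough sketch of that route follows. The temporary demo file and its contents are made up so the snippet is self-contained, and the comma delimiter is an assumption: it is chosen only because no line contains a comma, so each line stays in a single field. What the heuristics detect for a real file may differ, which is why it is worth inspecting opts before changing it.

```matlab
% Write a small hypothetical ini file so the example can run on its own
fname = [tempname '.ini'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ';delay of calibration start', ...
                     'intCalibrationStartDelay = 5', ...
                     ';COM port of the power meter', ...
                     'strPowerMeterCOMPort = COM4');
fclose(fid);

% Let readtable's heuristics have a first look at the file
opts = detectImportOptions(fname, 'FileType', 'text', 'Delimiter', ',');
disp(opts.DataLines)                  % rows the heuristics decided were data

% Override the guesses: no header line, data starts at the first line
opts.VariableNamesLine = 0;
opts.DataLines = [1 Inf];
opts = setvartype(opts, 'string');    % import every variable as string

T = readtable(fname, opts);
delete(fname);                        % remove the demo file
```

The opts object can be saved and reused for other files with the same layout.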