How to read specific data from text file between 2 lines
Hello,
I have the attached text file for which I would like to accomplish the following: the data is formatted as seen below, and I would like to extract and plot only the numbers from the SP_0 tag through the SP_19 tag.

I have tried looping through the file and using the cell2mat function once the "SETPOINT" tag is found but am not having any luck. Any help would be greatly appreciated!
while ~feof(fid)
    lineData = fgetl(fid); % read a line
    if strfind(lineData,'SETPOINT'), break, end % found the first 'SETPOINT' so quit
end
data=cell2mat(textscan(fid,repmat('%d',1,1),'collectoutput',1));
Answers (1)
The proper way to do this: your file is not a plain text file but an XML file, so read it with an XML parser. Use xmlread and navigate the DOM, or use xml2struct from the File Exchange if navigating the DOM is too complicated. With xml2struct, the code would be something like this:
xmltree = xml2struct('pathtothefile');
setpoints = xmltree.RECIPE.SETPOINTS;
desiredsetpoints = arrayfun(@(n) str2double(setpoints.(sprintf('SP_%d', n)).Text), 0:19);
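If you would rather not depend on a File Exchange download, the xmlread route could look roughly like the sketch below. It assumes the same RECIPE/SETPOINTS/SP_n layout as the xml2struct code above. The sample file it writes is made up so that the snippet runs on its own, and it holds only three setpoints instead of twenty. Replace fname with the path to your real file and idx with 0:19.

```matlab
% Hypothetical sample file, written only so this sketch is self-contained.
xmlText = ['<RECIPE><SETPOINTS>' ...
           '<SP_0>10</SP_0><SP_1>20</SP_1><SP_2>30</SP_2>' ...
           '</SETPOINTS></RECIPE>'];
fname = [tempname '.xml'];
fid = fopen(fname, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

% Parse the file into a DOM document (a Java org.w3c.dom.Document).
doc = xmlread(fname);
delete(fname);

% Restrict the search to the first SETPOINTS element (assumed tag name).
blocks = doc.getElementsByTagName('SETPOINTS');
spBlock = blocks.item(0);
if isempty(spBlock)
    error('No SETPOINTS element found.');
end

idx = 0:2;                    % use 0:19 for the real file
values = nan(size(idx));      % NaN marks a setpoint that is missing
for k = 1:numel(idx)
    nodes = spBlock.getElementsByTagName(sprintf('SP_%d', idx(k)));
    if nodes.getLength() > 0
        % getTextContent returns a Java string; convert before parsing
        values(k) = str2double(char(nodes.item(0).getTextContent()));
    end
end
```

Because each tag is looked up by name, the result does not depend on the order of the tags in the file. plot(idx, values) then gives the plot you asked for.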
The cheap way is to use a regular expression to extract the setpoints. It'll be faster, but it can break in all sorts of interesting ways if something else in the file happens to match the regex.
filecontent = fileread('pathtothefile');
desiredsetpoints = str2double(regexp(filecontent, '(?<=<SP_1?[0-9]>)\d+', 'match'))
The regexp also doesn't check that the setpoints are in the right order. The order of the tags in an XML file is absolutely not guaranteed, so use at your own risk.
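If the ordering worries you, one possible workaround is to capture the tag number together with the value and use that number as the index. This is only a sketch: the sample string is invented, with the tags deliberately out of order, and it assumes that each SP_n tag holds a single number and has no attributes.

```matlab
% Hypothetical file content; normally this would come from fileread.
filecontent = ['<SETPOINTS><SP_1>20</SP_1><SP_0>10</SP_0>' ...
               '<SP_2>30</SP_2></SETPOINTS>'];

% Token 1 is the tag number, token 2 the text between the tags.
% \1 forces the closing tag to carry the same number as the opening tag.
tok = regexp(filecontent, '<SP_(\d+)>([^<]*)</SP_\1>', 'tokens');
if isempty(tok)
    error('No SP_n tags found.');
end
tok = vertcat(tok{:});                 % N-by-2 cell array of char vectors

tagnumber = str2double(tok(:, 1));
tagvalue  = str2double(tok(:, 2));

nsetpoints = 3;                        % 20 for SP_0 through SP_19
keep = tagnumber < nsetpoints;         % ignore tags outside the wanted range
desiredsetpoints = nan(1, nsetpoints); % NaN marks a missing setpoint
desiredsetpoints(tagnumber(keep) + 1) = tagvalue(keep);
```

This still has the other weaknesses of the regex approach. It only removes the dependency on tag order.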
5 Comments
Michael Lopez
on 12 Oct 2018
Guillaume
on 12 Oct 2018
Yes, forgot to look at the text of the tag in the xml2struct version. Fixed now.
Or if I use the regexp method, how can I specify to only read in the values between the recipe tags?
To do it sort of safely, you'd have to do it in two steps: one regexp to extract the content of the recipe tag and another one to parse that content. Regexes are not recommended for parsing XML/HTML content: it's too easy to break them, or they become very complicated if you want them foolproof.
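For what it's worth, the two-step version might look something like this. It is a sketch only. The tag name RECIPE, written in upper case and without attributes, is an assumption, and the sample string is made up. It includes an SP_0 tag outside the recipe to show that this one gets ignored.

```matlab
% Hypothetical file content; normally this would come from fileread.
filecontent = ['<FILE><OTHER><SP_0>999</SP_0></OTHER>' ...
               '<RECIPE><SETPOINTS><SP_0>10</SP_0><SP_1>20</SP_1>' ...
               '</SETPOINTS></RECIPE></FILE>'];

% Step 1: extract what lies between the first pair of RECIPE tags.
% The lazy .*? stops at the first closing tag.
recipe = regexp(filecontent, '<RECIPE>(.*?)</RECIPE>', 'tokens', 'once');
if isempty(recipe)
    error('No RECIPE block found.');
end

% Step 2: parse the setpoints out of that block only.
tok = regexp(recipe{1}, '<SP_1?[0-9]>([^<]*)', 'tokens');
desiredsetpoints = str2double([tok{:}]);   % values in file order
```

It will still fail on nested RECIPE tags, on tags with attributes, and on comments that contain something that looks like a tag.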
Using a parser designed for the format is a lot safer, so I would really recommend you use the first option.
Michael Lopez
on 12 Oct 2018
Michael Lopez
on 13 Oct 2018