Reading the content of a text file with readtable

Hello,
This file was exported by another program, so I cannot turn it into an Excel file. I used the readtable command to read the numeric content under the different variables. It reads the second set of variables, such as BSTEN, BPR, BDENO and BDENG, but not the variables of the first row, such as FOE. How can I get all of the columns?
Thanks in advance.
fid=fopen('FAR.txt','w');
copyfile FAR.RSM FAR.txt
fclose(fid);
fid = fopen('FAR.txt');
opts = detectImportOptions('FAR.txt');
T=readtable('FAR.txt',opts);
T.FOE(3:end,1)
Walter Roberson on 26 Mar 2019
Do you need all of the variables, or just the first block?


Accepted Answer

dpb on 27 Mar 2019
Edited: dpb on 27 Mar 2019
Given the long conversation after the first rudimentary "throwaway" solution, I'll post a more general and robust solution as a separate answer...
function t=readRunFar(file)
% return table containing data from input file
fid=fopen(file,'r');
% read the full file as a cell array, convert to string array, ignore the FORTRAN formfeed
c=textscan(fid,'%s','delimiter','\n','Whitespace','','CommentStyle','1','CollectOutput',0);
fid=fclose(fid);
c=string(c{:});
ix=find(contains(c,"SUMMARY"))-1; % get section starting lines
N=numel(ix); % the number of sections in file
iyr=find(contains(c,"YEARS")); % the start of second column of data
iyr=strfind(c(iyr(1)),"YEARS")-1;
c(ix(2):end)=extractAfter(c(ix(2):end),iyr); % remove the TIME column after first set
iend=find(char(c(end))~=' ',1,'last'); % find last data on last section
d=c(1:ix(2)-1); % begin a new data array with first section
for i=2:N-1 % join subsequent sections to first
  d=strcat(d, c(ix(i):ix(i+1)-1));
end
d=strcat(d, extractBefore(c(ix(N):end),iend+1));
d=d(4:end); % trash the beginning header lines
d=d(~all(char(d)==' '|char(d)=='-' ,2));% and the other extraneous text
data=str2num(char(d(4:end))); % convert to numeric
vnames=split(d(1)); % get the variable names from header
vnames=vnames(~(vnames=="")); % split() leaves empties...
vnames=categorical(vnames,unique(vnames,'stable')); % create ordered categorical variable
cats=categories(vnames); % the category names for logical addressing
t=table(data(:,1),'VariableNames',cats(1)); % initial table entry
for i=2:numel(cats) % and build the output table
  t=[t table(data(:,vnames==cats(i)),'VariableNames',cats(i))];
end
end
The above reads the whole file into memory and rearranges it into a single section by concatenating each subsequent section to the right of the first, snipping the TIME column from those subsequent sections. It then parses the header line and creates a table. I ended up using a categorical variable, whose categories can serve as a logical-addressing lookup to find the associated columns, instead of the alternative binning technique. That actually worked quite nicely! :)
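As a minimal sketch of the section-joining step described above, assuming two tiny made-up sections held as string arrays (the values and the 4-character TIME column width are illustrative only, not taken from the real file):

```matlab
% Hypothetical stand-in sections; each row is one line of text from the file
s1 = ["TIME  FOE"; "1     0.1"; "2     0.2"];   % first section, keeps TIME
s2 = ["TIME  BPR"; "1     10 "; "2     20 "];   % later section, TIME repeated

% Drop the repeated TIME column (assumed here to occupy the first 4 characters)
s2 = extractAfter(s2, 4);

% strcat on string arrays keeps whitespace, so columns stay separated
joined = strcat(s1, s2);

% Line 1 is the header; the remaining lines convert to a numeric matrix
data = str2num(char(joined(2:end)));
```

In the real file the cut position is located from the header text rather than hard-coded, as the function does with strfind.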
This returns each duplicated column into the table with the generic name as an array of M columns; it is left as "Exercise for Student" to parse the units line and the node IDs to go with the variables.
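To illustrate how a repeated header name ends up as one multi-column table variable, here is a small sketch. It uses plain string comparison in place of the categorical variable, and the names and numbers are made up for the example:

```matlab
% Hypothetical header tokens (BPR appears twice) and matching numeric block
vnames = ["TIME" "FOE" "BPR" "BPR" "BDENO"];
data   = [1 0.1 10 11 5; 2 0.2 20 21 6];

names = unique(vnames, 'stable');   % distinct names, in order of first appearance
t = table();
for k = 1:numel(names)
    % the logical mask selects every column that carries this name
    t.(char(names(k))) = data(:, vnames == names(k));
end
```

Here t.BPR is a 2-by-2 array holding both BPR columns, while the other variables are single columns.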
I also attached the m file to be a little more convenient given its size
ali mohebi on 27 Mar 2019
I can't say anything, just thank you! It is worth so much.
dpb on 27 Mar 2019
'Tis OK...
If it does solve the problem, go ahead and ACCEPT an Answer so others won't keep coming back thinking they might just answer but hadn't had time...
I would suggest you study what I did to see how I worked through the issues, as a learning experience.
None of it is all THAT difficult, but it does, granted, take a fair amount of past experience to know which features of MATLAB suit a given purpose. And, of course, it doesn't hurt that I have been writing code for over 50 years now...which is scary to write! :)
It dawned on me that there is an easy way to extract the units data, so I've attached an updated version that incorporates them. They only show up when you query the properties directly or use the summary function, so they aren't as useful as they might be. I don't know why TMW doesn't echo them at the command line when they're not empty strings; one more line of output doesn't seem anything to fret over, and if the user went to the trouble to set them, it seems rude not to show them easily. But that's just me.
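A short sketch of where the units live and how to see them. The two-variable table and the unit strings below are assumptions for illustration, not values parsed from the file:

```matlab
% Hypothetical two-variable table standing in for the parsed result
t = table([1; 3], [3977.7; 3974.5], 'VariableNames', {'TIME', 'FPR'});

% Attach units; these strings are placeholders, not read from the file
t.Properties.VariableUnits = {'DAYS', 'PSIA'};

u = t.Properties.VariableUnits;   % query the property directly
summary(t)                        % summary also lists the units per variable
```

Displaying t itself at the command line does not show the units, which is the limitation mentioned above.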
The coordinate triplets could be retrieved as cell strings for display by the same logic, using the next line in the string array. But they're not so easily associated with the variable, because the repeated columns are stored as arrays under the common names; there's no way to assign them to a specific column.
Alternatively, one could back up a little and generate a distinct variable name for each repeated column, like those echoed by the summary function. It all depends on what is really needed in the end.
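One possible way to generate distinct names for the repeated columns is matlab.lang.makeUniqueStrings, sketched below with made-up header tokens and data. This is an alternative to the attached function, not what it does:

```matlab
% Hypothetical header tokens with a repeated name, plus matching data
vnames = {'TIME', 'FOE', 'BPR', 'BPR', 'BPR'};
data   = [1 0.1 10 11 12; 2 0.2 20 21 22];

% Append suffixes to the duplicates so every name is distinct
uniq = matlab.lang.makeUniqueStrings(vnames);

% One table variable per column, each with its own name
t = array2table(data, 'VariableNames', uniq);
```

With one variable per column, a units string or a coordinate triplet can then be attached to each column individually.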


More Answers (1)

dpb on 25 Mar 2019
function t=readRunFar(file)
% return table containing data from input file
fid=fopen(file,'r');
% read the two sections -- first sucks up the formfeed as data
fmt1=repmat('%f',1,10);
d1=cell2mat(textscan(fid,fmt1,'headerlines',9,'collectoutput',1));
d1=d1(1:end-1,:); % get rid of the line with only the formfeed
fmt2=repmat('%f',1,5);
d2=cell2mat(textscan(fid,fmt2,'headerlines',8,'collectoutput',1));
d2=d2(:,2:end); % and the duplicate time column ditto
frewind(fid) % go get the header lines and units
h1=textscan(fid,'%s',10,'headerlines',4); % first section
h2=textscan(fid,'%s',5,'headerlines',4+size(d1,1)+5); % second
fid=fclose(fid);
t=array2table([d1 d2]);
t.Properties.VariableNames=[h1{:};h2{:}(2:end)];
end
For your file, this returns...
>> readRunFar('far.txt')
ans =
14×14 table
TIME  YEARS     FOE       FGOR    FPR     FOIP        FOPR  FGIT  BFLOGK   BVGAS     BSTEN    BPR     BDENO   BDENG
____  ________  ________  ______  ______  __________  ____  ____  _______  ________  _______  ______  ______  ______
1     0.002738  0.00014   1.3567  3977.7  7.1239e+05  100   0     0.42135  0         0        3837.9  37.995  0
3     0.008214  0.000421  1.334   3974.5  7.1219e+05  100   0     0.53486  0.028387  0.66255  3768.1  38.147  14.706
6.5   0.017796  0.000912  1.3411  3968.7  7.1184e+05  100   0     0.6265   0.028     0.7114   3713.8  38.305  14.496
10    0.027379  0.001403  1.3484  3962.8  7.1149e+05  100   0     0.67827  0.027769  0.74217  3681    38.399  14.369
15    0.041068  0.002104  1.3496  3954.4  7.1099e+05  100   0     0.72239  0.027563  0.77053  3651.6  38.483  14.256
20    0.054757  0.002806  1.3465  3946.2  7.1049e+05  100   0     0.78213  0.027403  0.79331  3628.4  38.548  14.167
30    0.082136  0.004209  1.3398  3930.2  7.0949e+05  100   0     0.85478  0.027056  0.84474  3578.3  38.689  13.973
40    0.10951   0.005611  1.3357  3914.8  7.085e+05   100   0     0.90514  0.026729  0.89594  3529.4  38.822  13.789
50    0.13689   0.007015  1.3316  3900.1  7.075e+05   100   0     0.93321  0.02645   0.94197  3487.8  38.935  13.63
60    0.16427   0.008417  1.327   3886.1  7.065e+05   100   0     0.95788  0.026225  0.98041  3452.9  39.024  13.501
70    0.19165   0.00982   1.3226  3872.6  7.055e+05   100   0     0.95869  0.026027  1.0157   3422.7  39.103  13.387
80    0.21903   0.011223  1.3181  3859.9  7.045e+05   100   0     0.96569  0.025863  1.0455   3396.8  39.167  13.292
90    0.24641   0.012626  1.3141  3848    7.035e+05   100   0     0.98006  0.025717  1.0727   3374.2  39.225  13.207
100   0.27379   0.014029  1.3105  3836.8  7.025e+05   100   0     0.99436  0.025584  1.0982   3353.1  39.277  13.13
>>
ali mohebi on 27 Mar 2019
  1. The variables in the newly attached file are all of the possible variables.
  2. There may be another list containing other variables not mentioned here, but it isn't needed here.
  3. Yes, the time is specified in the input file. It works like this: the software is asked to produce an output file from zero to whatever time you want. The time steps you see in the output file are chosen by the software, and we can't say they will be regular.
dpb on 27 Mar 2019
OK...I've got a few minutes, think I can get you a starting point to finish up with in reasonably short order.
