
Write specific data of the specific lines of a text file into a matrix

Hello, I have a huge text file with the following format:
*********************************
timestep 225645
A 8
B 43
C 4
D 1
*********************************
timestep 225650
A 10
D 12
C 1
*********************************
I want to write the number after each timestep into the first column of a matrix. For each block, I also want to write the value next to B into the second column of that matrix, using 0 for any block in which no B is reported. I hope you can help. Thanks.

Answers (2)

Azzi Abdelmalek on 11 Jun 2015
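fid=fopen('fic.txt');
l=fgetl(fid);
k=1;
while ischar(l) % fgetl returns -1 (not a char) at end of file
    r{k}=l; % collect every line in a cell array
    k=k+1;
    l=fgetl(fid);
end
fclose(fid);
idx=find(~cellfun(@isempty,regexp(r,'(?=timestep).+'))); % lines holding a timestep
a=regexp(r(idx),'\d+','match');
b=str2double([a{:}]); % timestep values
ii=diff([idx numel(r)+1])-1; % number of lines following each timestep line
for k=1:numel(b)
    s=r(idx(k)+1:idx(k)+ii(k)); % lines belonging to block k
    jj=find(~cellfun(@isempty,regexp(s,'(?=B).+'))); % lines containing B
    c=regexp(s(jj),'\d+','match');
    if isempty(c)
        f(k)=0; % no B in this block
    else
        f(k)=str2double(c{1});
    end
end
M=[b' f']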
  1 Comment
Homayoon on 11 Jun 2015
Edited: Homayoon on 11 Jun 2015
Dear Azzi, thank you for your help! I was really stuck on this until you provided the code. However, the code does not seem to work correctly, possibly because my question was ambiguous. The first column of the matrix is generated perfectly, but the second column is always 0. To clarify, I have attached a sample of my input text file. The second column I am actually interested in is the value next to H2O. To distinguish H2O from H2O2, I have to put an extra space after H2O so that it does not match the wrong line. I would appreciate your help once more. Thanks for being so kind! PS: In line 16 of the code you gave me I changed B to HO2, but it did not work. The second column is always zero, no matter what B is.
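A minimal sketch of one way to match a key exactly, so that H2O is not confused with H2O2 or HO2. It assumes each key starts at the beginning of a line and is followed by whitespace and an integer. The sample text and the names txt, blocks, key, and tok are illustrative only and are not taken from the attached file:

```matlab
% Hypothetical sample text standing in for the real file contents
% (in practice something like: txt = fileread('yourfile.txt');)
txt = sprintf(['*****\ntimestep 525305\nH2 24\nH2O 21\nH2O2 1\n' ...
               '*****\ntimestep 525310\nH2 24\nH2O2 1\n*****\n']);

% Split into blocks at each 'timestep' keyword; the first piece precedes
% the first timestep and is discarded.
blocks = regexp(txt, 'timestep', 'split');
blocks = blocks(2:end);

key = 'H2O';                      % species to extract
M = zeros(numel(blocks), 2);      % missing keys stay 0
for k = 1:numel(blocks)
    % The timestep value is the first integer in the block
    M(k,1) = str2double(regexp(blocks{k}, '\d+', 'match', 'once'));
    % Anchor the key at the start of a line and require whitespace after it,
    % so 'H2O' cannot match 'H2O2' or 'HO2'
    tok = regexp(blocks{k}, ['^' key '\s+(\d+)'], 'tokens', 'once', 'lineanchors');
    if ~isempty(tok)
        M(k,2) = str2double(tok{1});
    end
end
disp(M)
```

If a key could contain regular-expression metacharacters, passing it through regexptranslate('escape', key) before building the pattern would be a safer choice.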


Stephen23 on 11 Jun 2015
Edited: Stephen23 on 11 Jun 2015
This code reads the whole file as one string, then performs some string replacements so that textscan can convert all of the values:
str = fileread('attached.txt');
str = regexprep(str,{'(\\par)?\s*\n','[*]{5,}'},{' ','\n'});
fmt = repmat('%s%f',1,9); % 9 == nine lines of 'key value'
C = textscan(str,['timestep',fmt(3:end)], 'HeaderLines',1, 'MultipleDelimsAsOne',true);
N = [C{3:2:end}]; % numeric values
S = [C{2:2:end}]; % string keys
T = C{1}; % numeric timesteps
All of the data is now available in the variables N, S, and T. If you want each column of N to contain just one variable, then each row needs to be sorted according to its keys in S, which can be done using this code:
X = cellfun('isempty',S);
U = unique(S(~X));
for k = 1:numel(T)
S(k,X(k,:)) = setdiff(U,S(k,:)); % insert missing keys
[S(k,:),Y] = sort(S(k,:)); % sort keys
N(k,:) = N(k,Y); % sort values
end
And we can view the output in the command window:
>> S
S =
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
'H' 'H2' 'H2O' 'H2O2' 'HO' 'HO2' 'No_Specs' 'O2'
>> N
N =
NaN 24 21 1 7 1 6 34
NaN 24 21 1 7 1 6 34
1 24 20 1 8 1 7 34
1 24 20 1 8 1 7 34
>> T
T =
525305
525310
525315
525320
Note that the columns are in alphabetical order (after the sort), and the missing values are indicated with NaN.
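The question asked for a two-column matrix with 0 where a key is missing. A minimal sketch of that last step, assuming the sorted S, N, and T shown above (the values are copied here so the snippet runs on its own; the names col and M are illustrative):

```matlab
% Values copied from the command-window output above. With the full
% 4-by-8 cell array S from the sorting loop, use S(1,:) instead.
S = {'H','H2','H2O','H2O2','HO','HO2','No_Specs','O2'};
N = [NaN 24 21 1 7 1 6 34;
     NaN 24 21 1 7 1 6 34;
       1 24 20 1 8 1 7 34;
       1 24 20 1 8 1 7 34];
T = [525305; 525310; 525315; 525320];

N(isnan(N)) = 0;            % use 0 where a key was absent
col = strcmp(S, 'H2O');     % exact comparison, so 'H2O2' is not selected
M = [T, N(:,col)]           % first column: timestep, second column: H2O
```

Because strcmp compares whole strings, no trailing space is needed to tell H2O apart from H2O2.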
