Reading a text file with attached string data

Hello everybody,
I am having trouble reading a text file in which some values run together with no separator. Every value occupies a 5-character field, so I used textscan to read the values:
fid = fopen('text.txt','r');
priformat1='%5d %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d';
% priformat1='%5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s %5s'; % I also tried this
st = fgetl(fid);
if strfind(st,'NODES')
st=fgetl(fid);
F = cell2mat(textscan(fid,priformat1));
F = double(F)
else
error('no NODES found !');
end
%
st = fgetl(fid);
if ~isempty(findstr(st,'ELEMT'))
G = cell2mat(textscan(fid,priformat1));
G = abs(double(G));
else
error('Oops...*.PRI file problem - no ELEMT found !');
end
fclose(fid);
However, the values I get are wrong wherever the numbers run together. For example, in the fourth line of NODES, everything after 9518 (98251017610195...) is not parsed correctly by textscan. Sometimes the data looks like '  10110010': with %5d the output is 10110 and 10, but what I need is 101 and 10010. I tried many options, such as Whitespace, %5c, and textread, but I did not manage to read this text.
Could you help me with this problem? Thank you very much.
-----------the following is the txt file------------------
NODES
NODES
   -3  666 1595 1693 1911 2154 2434 2464 2578 2845 3276 3420 3752 3837
   -3 4324 4329 4624 4924 5194 5498 5715 5737 5832 6048 6083 6261 6579
   -3 6647 6673 6772 6806 7085 7292 7597 7694 7803 8111 8135 8329 8640
   -3 8809 8891 9205 9518 98251017610195102671039810508106541082411062
   -311116112111133511355115981162411702118431193912121121641219512406
   -312446125081251212526127271277012964129691308813135132901370913910
   -4  666 1595 1693 1911 2154 2434 2464 2578 2845 3276 3420 3752 3837
   -4 4324 4329 4624 4924 5194 5498 5715 5737 5832 6048 6083 6261 6579
   -4 6647 6673 6772 6806 7085 7292 7597 7694 7803 8111 8135 8329 8640
ELEMT
    1  462  468  687  943  947 1135 1138 1141 1144 1351 1591 1596 1598
    1 2350 2356 2522 2711 2715 2860 2863 2866 3020 3197 3202 3204 4786
    1 4792 5403 6107 6111 6630 6633 6636 6639 6642 6645 6648 6651 6654
    1 6657 6660 7229 7889 7894 7896 9728 97341009010497105011078510788
    110791107941079710800111341151711522115241218612188121901219212240
    112242122441236812370123721237412376123781238012382123841238612388
    2  462  468  687  943  947 1135 1138 1141 1144 1351 1591 1596 1598
    2 2350 2356 2522 2711 2715 2860 2863 2866 3020 3197 3202 3204 4786
REACT
--------------------the end of txt file-------(P.S. the number of lines is not fixed)
--------------The matrix I need is:------
NODES=[
-3 666 1595 1693 1911 2154 2434 2464 2578 2845 3276 3420 3752 3837
-3 4324 4329 4624 4924 5194 5498 5715 5737 5832 6048 6083 6261 6579
-3 6647 6673 6772 6806 7085 7292 7597 7694 7803 8111 8135 8329 8640
-3 8809 8891 9205 9518 9825 10176 10195 10267 10398 10508 10654 10824 11062
-3 11116 11211 11335 11355 11598 11624 11702 11843 11939 12121 12164 12195 12406
-3 12446 12508 12512 12526 12727 12770 12964 12969 13088 13135 13290 13709 13910
-4 666 1595 1693 1911 2154 2434 2464 2578 2845 3276 3420 3752 3837
-4 4324 4329 4624 4924 5194 5498 5715 5737 5832 6048 6083 6261 6579
-4 6647 6673 6772 6806 7085 7292 7597 7694 7803 8111 8135 8329 8640
]
and
ELEMT=[
1 462 468 687 943 947 1135 1138 1141 1144 1351 1591 1596 1598
1 2350 2356 2522 2711 2715 2860 2863 2866 3020 3197 3202 3204 4786
1 4792 5403 6107 6111 6630 6633 6636 6639 6642 6645 6648 6651 6654
1 6657 6660 7229 7889 7894 7896 9728 9734 10090 10497 10501 10785 10788
1 10791 10794 10797 10800 11134 11517 11522 11524 12186 12188 12190 12192 12240
1 12242 12244 12368 12370 12372 12374 12376 12378 12380 12382 12384 12386 12388
2 462 468 687 943 947 1135 1138 1141 1144 1351 1591 1596 1598
2 2350 2356 2522 2711 2715 2860 2863 2866 3020 3197 3202 3204 4786
]
REACT=[]

 Accepted Answer

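You cannot read a fixed-width file with a numeric specifier such as %5d. The leading white space does not count toward the field width, so on a line where the numbers run together the first -3 is read as -3111: five "positions" counted from the minus sign, as requested by %5d.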
You can read fixed-width columns as %5c instead. An example:
str = '   -311116112111133511355115981162411702118431193912121121641219512406';
out = textscan(str,repmat('%5c',1,14),'Whitespace','');
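The %5c call returns the fields as text, so they still need converting to numbers. As a rough sketch of that step (a swapped-in alternative that uses reshape and str2double rather than textscan, and that assumes every field is exactly 5 characters wide; the names width, fields and vals are made up for illustration):

```matlab
% One fixed-width record: 14 fields of 5 characters each (70 characters)
str = '   -311116112111133511355115981162411702118431193912121121641219512406';
width = 5;                              % assumed field width

% reshape fills column by column, so each column holds 5 consecutive
% characters; transposing gives one field per row (14-by-5 char array)
fields = reshape(str, width, []).';

% str2double ignores the leading blanks in each field
vals = str2double(cellstr(fields)).';   % 1-by-14 double row vector
```

This only works when the line length is an exact multiple of the field width; a line that has lost its padding would need to be padded or trimmed first.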
I suggest the following:
fid = fopen('test.txt','r');
if strfind(fgetl(fid),'NODES')
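% Read in as an X-by-70 char array (14 fields of 5 characters per line)
tmp = textscan(fid,'%70c','Whitespace','');
sz = size(tmp{1});
% Reshape so that each column holds one 5-character field, then prepend
% a row of blanks so that consecutive fields are separated by white space
st = char(' ', reshape(tmp{1}.',5,14*sz(1)));
out = textscan(st,'%5f','CollectOutput',1);
out = reshape(out{1},14,sz(1)).';
else
fclose(fid);
error('no NODES found !');
end
fclose(fid);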
EDIT
% Read in as cellstring to identify the headlines
fid = fopen('test.txt','r');
tmp = textscan(fid,'%s','Delimiter','','Whitespace','');
fclose(fid);
% Find start of NODES and ELEMT
sten = find(strcmpi('nodes',tmp{1}) | ...
strcmpi('elemt',tmp{1}));
% Extract nodes
nodes = char(tmp{1}(sten(1)+1:sten(2)-1));
if ~isempty(nodes)
sz = size(nodes);
% Reshape and add a white space
nodes = char(' ', reshape(nodes.',5,14*sz(1)));
nodes = textscan(nodes,'%5f','CollectOutput',1);
nodes = reshape(nodes{1},14,sz(1)).';
end
% Extract elemt (end-1 drops the trailing REACT line)
elemt = char(tmp{1}(sten(2)+1:end-1));
if ~isempty(elemt)
sz = size(elemt);
% Reshape and add a white space
elemt = char(' ', reshape(elemt.',5,14*sz(1)));
elemt = textscan(elemt,'%5f','CollectOutput',1);
elemt = reshape(elemt{1},14,sz(1)).';
end
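The snippet above relies on exactly one NODES line, one ELEMT line, and a trailing REACT line. If the header layout varies, one possible alternative is to walk the file line by line and start a new group whenever a line consists only of letters. This is a sketch under that assumption, not part of the original answer; the temporary file, the sections struct, and the header pattern are all invented for illustration:

```matlab
% Write a small hypothetical sample file so the sketch is self-contained
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'NODES\n   -3  666\n   -311116\nELEMT\n    1  462\nREACT\n');
fclose(fid);

% Collect the raw lines that follow each header line
sections = struct('name',{},'lines',{});
fid = fopen(fname,'r');
ln = fgetl(fid);
while ischar(ln)                        % fgetl returns -1 at end of file
    if ~isempty(regexp(ln,'^[A-Za-z]+\s*$','once'))
        % Assumed header rule: the line contains letters only
        sections(end+1).name = strtrim(ln);
        sections(end).lines = {};
    elseif ~isempty(sections)
        % Data line: store it untouched so the fixed-width layout survives
        sections(end).lines{end+1,1} = ln;
    end
    ln = fgetl(fid);
end
fclose(fid);
delete(fname);
```

Each sections(k).lines can then be converted with the same fixed-width logic as above, and an empty group such as REACT simply yields an empty cell.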
EDIT 2
fmt = repmat('%5f',1,14);
opt = {'EmptyValue',0,'CollectOutput',1};
% Extract nodes
nodes = tmp{1}(sten(1)+1:sten(2)-1);
if ~isempty(nodes)
nodes = cellfun(@(x) textscan(char(' ',reshape(x.',5,[])), fmt, opt{:}), nodes);
nodes = cat(1,nodes{:});
end
% Extract elemt
elemt = tmp{1}(sten(2)+1:end-1);
if ~isempty(elemt)
elemt = cellfun(@(x) textscan(char(' ',reshape(x.',5,[])), fmt, opt{:}), elemt);
elemt = cat(1,elemt{:});
end
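The zero-padding idea can also be sketched with plain indexing, which may be easier to debug than the cellfun/textscan combination. This is only an illustration under stated assumptions: each line length is a multiple of 5, and the names recs, width, maxFields and out are hypothetical, with recs standing in for the cell array of data lines.

```matlab
% Hypothetical sample records; the last one holds fewer than 14 fields
recs = {'   -3  666 1595 1693'; '   -311116'};
width = 5;                              % assumed field width
maxFields = 14;                         % assumed maximum fields per line

out = zeros(numel(recs), maxFields);    % missing fields stay 0
for k = 1:numel(recs)
    s = recs{k};
    n = min(floor(numel(s)/width), maxFields);   % complete fields on this line
    if n == 0
        continue                        % nothing to convert on this line
    end
    % One 5-character field per row, then convert each field to a number
    fields = cellstr(reshape(s(1:n*width), width, []).');
    out(k,1:n) = str2double(fields).';
end
```

Preallocating with zeros is what supplies the padding, so short last lines and the single-line case are handled by the same loop.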

9 Comments

Thank you for your help over the past few days.
However, with tmp = textscan(fid,'%70c','Whitespace',''); all of the text is read, including ELEMT and its data, so the program cannot run to the end. I have tried adding white space to each line.
But the problem is that I do not know how many lines I have.
What do you think?
OK, for some reason I thought the sections were in separate files. I will come up with a solution as soon as possible.
Hello, it gave me an error message:
??? Error using ==> reshape
To RESHAPE the number of elements must not change.
I think this happens because the number of values per line in the original text file is not always 14. The last line may hold fewer than 14 values, sometimes only 1, 2, or 3.
Could we do the reshape another way, for example with cell2mat? I tried it, but it does not work. Or maybe I should try str2num?
We could fill the missing values with 0, but I do not know which function can do this. Could you give me some ideas?
Many thanks.
priformat1='%5f %5f %5f %5f %5f %5f %5f %5f %5f %5f %5f %5f %5f %5f';
% Read in as cellstring to identify the headlines
tmp = textscan(fid,'%s','Delimiter','','Whitespace','');
fclose(fid);
% Find start of NODES and ELEMT
sten = find(strcmpi('nodes',tmp{1}) | ...
strcmpi('elemt',tmp{1}));
% Extract nodes
nodes = char(tmp{1}(sten(1)+1:sten(2)-1));
if ~isempty(nodes)
sz = size(nodes);
% Reshape and add a white space
nodes = char(' ', reshape(nodes.',5,14*sz(1)));
XXXX = textscan(nodes,priformat1)
nodes = zeros(length(XXXX{1}),14)
for i=1:14
nodes(1:length(XXXX{i}),i)=XXXX{i}
end
% nodes = textscan(nodes,'%5f','CollectOutput',1);
% nodes = reshape(nodes{1},14,sz(1)).';
end
What do you think about it? It works, more or less. I am still thinking about the case where there is only one line with fewer than 14 values. How can I write code that handles both cases:
a single line with 14 or fewer values, and multiple lines in which the last line has 14 or fewer values?
See the second edit. Assumption:
The maximum number of fields per line is 14 (you can change it in fmt).
It works very well. Thank you very much.
Hi, I have posted another question, and the discussion is ongoing. Maybe you will be interested in it too:
http://www.mathworks.com/matlabcentral/answers/14484-reading-a-very-large-text-file-of-an-almost-regular-data-with-empty-value
