Reading text file data with different data formats

Hi,
I am trying to read exported text data arranged in columns. Each column has a different data format. The text file doesn't start with the header; instead, the header line is repeated between the data lines at regular intervals. I am attaching a sample file. I would like to read this data into MATLAB in the same order. I have tried several examples with different functions from the MATLAB forums but haven't succeeded yet. Can someone please help me sort out this data? Thank you.

Accepted Answer

Rik on 19 Jan 2022
I would suggest reading the file as text, extracting the header lines, and then parsing the remaining data with tools like textscan.
(You can get my readfile function from the FEX. If you are using R2017a or later, you can also get it through the Add-On Manager.)
URL='https://www.mathworks.com/matlabcentral/answers/uploaded_files/867210/sample%20test%20data.txt';
try data=cellstr(readlines(URL));
catch, data=readfile(URL);
end
% make use of the fact that this is a solid block of text (every line has the same length)
data=cell2mat(data);
loc=find(data(:,1)=='#');
header=data(loc,:)
header = '# SeqNum ChN Pri Srv MC PowerA PowerB NoiseA NoiseB Time Stamp (s) Dest MAC Addr Src MAC Addr EthTyp Mat Errs Leng '
data(loc,:)=[]
data = 39×137 char array
    '1611574502 184 000 001 10 -73.0 -80.0 -106.0 -104.0 00000000006346.226848 33:33:00:00:00:fb ea:99:0e:0a:53:f7 0x86dd Yes 0086 0097 '
    '1170211136 184 006 001 10 -73.0 -75.0 -103.0 -104.0 00000000006346.590584 ff:ff:ff:ff:ff:ff ea:99:0e:0a:53:f7 0x0800 Yes 0085 0324 '
    '0000000017 184 004 001 11 -99.0 -98.0 -103.0 -107.0 00000000006361.062418 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '1611574502 184 000 001 10 -68.0 -67.0 -103.0 -107.0 00000000006362.239553 33:33:00:00:00:fb ea:99:0e:0a:53:f7 0x86dd Yes 0086 0097 '
    '0000000019 184 004 001 11 -101.0 -98.0 -103.0 -104.0 00000000006363.062618 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '1170211136 184 006 001 10 -67.0 -66.0 -106.0 -104.0 00000000006363.169210 ff:ff:ff:ff:ff:ff ea:99:0e:0a:53:f7 0x0800 Yes 0085 0324 '
    '0000000020 184 004 001 11 -99.0 -96.0 -103.0 -104.0 00000000006364.062714 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000021 184 004 001 11 -99.0 -96.0 -103.0 -104.0 00000000006365.062836 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000022 184 004 001 11 -99.0 -98.0 -106.0 -104.0 00000000006366.062957 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000025 184 004 001 11 -99.0 -97.0 -103.0 -104.0 00000000006369.063322 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000026 184 004 001 11 -100.0 -98.0 -103.0 -104.0 00000000006370.061493 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000027 184 004 001 11 -103.0 -98.0 -106.0 -107.0 00000000006371.060627 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000028 184 004 001 11 -98.0 -96.0 -106.0 -104.0 00000000006372.060709 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000029 184 004 001 11 -98.0 -95.0 -106.0 -104.0 00000000006373.062754 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000030 184 004 001 11 -97.0 -99.0 -106.0 -107.0 00000000006374.061901 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '0000000031 184 004 001 11 -99.0 -101.0 -106.0 -104.0 00000000006375.060098 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '1610612736 184 000 001 10 -66.0 -68.0 -103.0 -104.0 00000000006376.092635 33:33:00:00:00:16 ea:99:0e:0a:53:f7 0x86dd Yes 0090 0100 '
    '1170211136 184 006 001 10 -68.0 -69.0 -106.0 -104.0 00000000006376.167784 ff:ff:ff:ff:ff:ff ea:99:0e:0a:53:f7 0x0800 Yes 0085 0324 '
    '1610612736 184 000 001 10 -68.0 -68.0 -103.0 -104.0 00000000006376.176339 33:33:00:00:00:16 ea:99:0e:0a:53:f7 0x86dd Yes 0090 0100 '
    '1610612736 184 000 001 10 -66.0 -67.0 -103.0 -110.0 00000000006376.292866 33:33:00:00:00:16 ea:99:0e:0a:53:f7 0x86dd Yes 0090 0100 '
    '1610612736 184 000 001 10 -65.0 -68.0 -103.0 -104.0 00000000006376.335958 33:33:ff:86:c6:88 ea:99:0e:0a:53:f7 0x86dd Yes 0066 0076 '
    '0000000033 184 004 001 11 -97.0 -97.0 -103.0 -101.0 00000000006377.063311 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '1610612736 184 000 001 10 -65.0 -65.0 -106.0 -104.0 00000000006377.362680 33:33:00:00:00:16 ea:99:0e:0a:53:f7 0x86dd Yes 0090 0100 '
    '1610612736 184 000 001 10 -64.0 -65.0 -103.0 -107.0 00000000006377.371204 33:33:00:00:00:16 ea:99:0e:0a:53:f7 0x86dd Yes 0090 0100 '
    '1610612736 184 000 001 10 -63.0 -65.0 -106.0 -104.0 00000000006377.456220 33:33:00:00:00:16 ea:99:0e:0a:53:f7 0x86dd Yes 0090 0100 '
    '1610612736 184 000 001 10 -65.0 -64.0 -103.0 -107.0 00000000006377.524713 33:33:00:00:00:16 ea:99:0e:0a:53:f7 0x86dd Yes 0090 0100 '
    '1611381948 184 000 001 10 -64.0 -68.0 -106.0 -104.0 00000000006378.050636 33:33:00:00:00:02 ea:99:0e:0a:53:f7 0x86dd Yes 0042 0052 '
    '0000000034 184 004 001 11 -96.0 -98.0 -106.0 -107.0 00000000006378.062425 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '1611574502 184 000 001 10 -64.0 -69.0 -106.0 -107.0 00000000006378.086883 33:33:00:00:00:fb ea:99:0e:0a:53:f7 0x86dd Yes 0187 0198 '
    '1170211136 184 006 001 10 -65.0 -67.0 -106.0 -104.0 00000000006378.372313 ff:ff:ff:ff:ff:ff ea:99:0e:0a:53:f7 0x0800 Yes 0085 0324 '
    '1611574502 184 000 001 10 -65.0 -66.0 -106.0 -104.0 00000000006379.268748 33:33:00:00:00:fb ea:99:0e:0a:53:f7 0x86dd Yes 0086 0097 '
    '0000000036 184 004 001 11 -97.0 -101.0 -106.0 -107.0 00000000006380.061666 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '1611574502 184 000 001 10 -63.0 -67.0 -103.0 -104.0 00000000006380.171144 33:33:00:00:00:fb ea:99:0e:0a:53:f7 0x86dd Yes 0187 0198 '
    '1611381948 184 000 001 10 -62.0 -66.0 -106.0 -104.0 00000000006382.051341 33:33:00:00:00:02 ea:99:0e:0a:53:f7 0x86dd Yes 0042 0052 '
    '1170211136 184 006 001 10 -63.0 -69.0 -103.0 -104.0 00000000006382.867419 ff:ff:ff:ff:ff:ff ea:99:0e:0a:53:f7 0x0800 Yes 0085 0324 '
    '0000000039 184 004 001 11 -96.0 -100.0 -103.0 -104.0 00000000006383.060977 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
    '1611574502 184 000 001 10 -63.0 -67.0 -103.0 -107.0 00000000006383.269347 33:33:00:00:00:fb ea:99:0e:0a:53:f7 0x86dd Yes 0086 0097 '
    '1611381948 184 000 001 10 -66.0 -64.0 -106.0 -104.0 00000000006386.049936 33:33:00:00:00:02 ea:99:0e:0a:53:f7 0x86dd Yes 0042 0052 '
    '0000000043 184 004 001 11 -99.0 -97.0 -103.0 -107.0 00000000006387.067579 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104 '
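The answer stops at the header-free char array and leaves the textscan step to the reader. The following is only a rough sketch of how that step might look, not necessarily what Rik had in mind. It uses two rows copied from the output above as a stand-in for the full array, and the column types in fmt (MAC addresses, EthTyp and Mat read as text, everything else as double) are assumptions chosen for illustration.

```matlab
% Two sample rows copied from the output above (stand-in for the full array).
rows = { ...
    '1611574502 184 000 001 10 -73.0 -80.0 -106.0 -104.0 00000000006346.226848 33:33:00:00:00:fb ea:99:0e:0a:53:f7 0x86dd Yes 0086 0097', ...
    '0000000017 184 004 001 11 -99.0 -98.0 -103.0 -107.0 00000000006361.062418 ff:ff:ff:ff:ff:ff 04:e5:48:00:10:00 0x0000 Yes 0000 0104'};

% textscan accepts text directly, so join the rows with newlines.
txt = strjoin(rows, sprintf('\n'));

% Assumed layout: 10 numeric columns, 4 text columns, 2 numeric columns.
fmt = [repmat('%f', 1, 10), '%s%s%s%s%f%f'];
C = textscan(txt, fmt);

seqNum = C{1};                                % numeric column vector
srcMac = C{12};                               % cell array of char vectors
ethTyp = hex2dec(strrep(C{13}, '0x', ''));    % hex text to numbers
```

With the real data, rows could be replaced by cellstr(data).' (the transpose gives strjoin a row of lines).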
  2 Comments
antennist on 19 Jan 2022
Hi, thank you for your help. Now at least I can read the complete file as text. However, I am getting the following error:
Error in cell2mat (line 83)
m{n} = cat(1,c{:,n});
Error in test (line 13)
data=cell2mat(data);
Can you please provide a quick fix for this?
Rik on 19 Jan 2022
This can happen if one line is a different length, for example a trailing empty line. If that is the cause you can easily remove it with this:
data(cellfun('prodofsize',data)==0)=[];
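As a small illustration of the comment above, the sketch below uses made-up lines (hypothetical, not from the sample file) to show the empty-line removal. It also shows char as a possible alternative to cell2mat when lines still differ in length, since char pads shorter lines with trailing spaces instead of throwing an error. Using char here is a suggestion of this sketch, not part of the original answer.

```matlab
% Hypothetical lines: one empty line and lines of unequal length.
data = {'# header line'; 'row one data'; ''; 'short'};

% Inspect the line lengths to see which lines break cell2mat.
len = cellfun('prodofsize', data);

% Remove empty lines, as suggested above.
data(len == 0) = [];

% char pads the shorter lines with spaces, so the result is a
% rectangular char array even when the lengths differ.
block = char(data);
loc = find(block(:,1) == '#');   % header rows can still be located
```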


More Answers (1)

Stephen23 on 19 Jan 2022
Edited: Stephen23 on 19 Jan 2022
Move the "header" to the top:
txt = fileread('sample test data.txt');
[hdr,spl] = regexp(txt,'^#[^\n]+','match','split','lineanchors');
fid = fopen('temp.txt','wt');
fprintf(fid,'%s\n',[' ',hdr{1}(2:end)],spl{:});
fclose(fid);
Import all data into a table:
one = regexp(spl{1},'^[^\n]+','match','once');
[idb,ide] = regexp(one,'\s*\S+');
wid = diff([0,ide]);
opt = detectImportOptions('temp.txt', 'FileType','fixedwidth', 'VariableWidths',wid);
format long G
tbl = readtable('temp.txt',opt)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
tbl = 39×16 table
    SeqNum      ChN  Pri  Srv  MC  PowerA  PowerB  NoiseA  NoiseB  TimeStamp_s_  DestMACAddr            SrcMACAddr             EthTyp  Mat      Errs  Leng
    __________  ___  ___  ___  __  ______  ______  ______  ______  ____________  _____________________  _____________________  ______  _______  ____  ____
    1611574502  184  0    1    10  -73     -80     -106    -104    6346.226848   {'33:33:00:00:00:fb'}  {'ea:99:0e:0a:53:f7'}  34525   {'Yes'}  86    97
    1170211136  184  6    1    10  -73     -75     -103    -104    6346.590584   {'ff:ff:ff:ff:ff:ff'}  {'ea:99:0e:0a:53:f7'}  2048    {'Yes'}  85    324
    17          184  4    1    11  -99     -98     -103    -107    6361.062418   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    1611574502  184  0    1    10  -68     -67     -103    -107    6362.239553   {'33:33:00:00:00:fb'}  {'ea:99:0e:0a:53:f7'}  34525   {'Yes'}  86    97
    19          184  4    1    11  -101    -98     -103    -104    6363.062618   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    1170211136  184  6    1    10  -67     -66     -106    -104    6363.16921    {'ff:ff:ff:ff:ff:ff'}  {'ea:99:0e:0a:53:f7'}  2048    {'Yes'}  85    324
    20          184  4    1    11  -99     -96     -103    -104    6364.062714   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    21          184  4    1    11  -99     -96     -103    -104    6365.062836   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    22          184  4    1    11  -99     -98     -106    -104    6366.062957   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    25          184  4    1    11  -99     -97     -103    -104    6369.063322   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    26          184  4    1    11  -100    -98     -103    -104    6370.061493   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    27          184  4    1    11  -103    -98     -106    -107    6371.060627   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    28          184  4    1    11  -98     -96     -106    -104    6372.060709   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    29          184  4    1    11  -98     -95     -106    -104    6373.062754   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    30          184  4    1    11  -97     -99     -106    -107    6374.061901   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
    31          184  4    1    11  -99     -101    -106    -104    6375.060098   {'ff:ff:ff:ff:ff:ff'}  {'04:e5:48:00:10:00'}  0       {'Yes'}  0     104
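The column widths passed to 'VariableWidths' come from the regexp and diff lines in the import code. A small worked example may make that step easier to follow. The string below is hypothetical, not taken from the sample file; it only shows how each match of '\s*\S+' covers a field together with its leading whitespace, so the differences between consecutive match ends give the fixed widths.

```matlab
% Hypothetical fixed-width line with three right-aligned fields.
one = '  12 abc   7';

% Each match is the leading whitespace plus the field text.
[idb, ide] = regexp(one, '\s*\S+');   % start and end index of each match

% Consecutive end positions differ by exactly one column width.
wid = diff([0, ide]);

% Slice the line with those widths to check the result.
fields = mat2cell(one, 1, wid);
```

Here idb is [1 5 9], ide is [4 8 12], and wid is [4 4 4], so each field is four characters wide including its padding. Note that this relies on no field in the first data line containing internal spaces.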
