Problem with importing data from tab delimited .txt file
I have a very long 5-column text file with tab as the delimiter. As an example, say that 'txtfile.txt' has 7 lines:
144141 180738085 two two mc
144141 180738086 of of io
144141 180738087 us us ppio2
144141 180738088 . . .
144141 180738089 " " "
144141 180738090 Hollywood hollywood np1
144141 180738091 Heartbeat heartbeat np1
Using importdata I get a 7X1 cell array like the following:
importdata('txtfile.txt','\t') % same for importdata('txtfile.txt')
output:
{'144141→180738085→two→two→mc' }
{'144141→180738086→of→of→io' }
{'144141→180738087→us→us→ppio2' }
{'144141→180738088→.→.→.' }
{'144141→180738089→"→"→"' }
{'144141→180738090→Hollywood→hollywood→np1'}
{'144141→180738091→Heartbeat→heartbeat→np1'}
So importdata doesn't work. If I use readtable I get a 5X5 table like the following:
readtable('txtfile.txt') % also for readtable('txtfile.txt','Delimiter','tab')
output:
Var1 Var2 Var3 Var4 Var5
__________ __________ _______ ________________________________________________________________________________________________ __________
1.4414e+05 1.8074e+08 {'two'} {'two' } {'mc' }
1.4414e+05 1.8074e+08 {'of' } {'of' } {'io' }
1.4414e+05 1.8074e+08 {'us' } {'us' } {'ppio2' }
1.4414e+05 1.8074e+08 {'.' } {'.' } {'.' }
1.4414e+05 1.8074e+08 {'→' } {'←↵144141→180738090→Hollywood→hollywood→np1←↵144141→180738091→Heartbeat→heartbeat→np1'} {0×0 char}
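So the double quotation mark on line 5 of the text file seems to ruin it: from that row on, the remaining lines end up inside a single cell of Var4 instead of being split into rows and columns.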
Any help would be much appreciated.
5 Comments
Ive J
on 17 Dec 2020
What about this?
tan = array2table(split(splitlines(fileread('tab.txt'))));
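A slightly more defensive sketch of the same idea, in case it helps. It assumes the fields never contain a tab themselves, splits on the tab character only (so the '.' and '"' rows are treated as plain text), and drops the empty line that a trailing newline would otherwise produce. The temporary file name and the variable names are made up for the example; the seven rows are the ones from the question.

```matlab
% Recreate the sample data from the question in a temporary, tab-delimited file.
rows = { ...
    '144141', '180738085', 'two',       'two',       'mc'; ...
    '144141', '180738086', 'of',        'of',        'io'; ...
    '144141', '180738087', 'us',        'us',        'ppio2'; ...
    '144141', '180738088', '.',         '.',         '.'; ...
    '144141', '180738089', '"',         '"',         '"'; ...
    '144141', '180738090', 'Hollywood', 'hollywood', 'np1'; ...
    '144141', '180738091', 'Heartbeat', 'heartbeat', 'np1'};
fname = [tempname '.txt'];              % hypothetical file name
fid = fopen(fname, 'w');
for k = 1:size(rows, 1)
    fprintf(fid, '%s\t%s\t%s\t%s\t%s\n', rows{k, :});
end
fclose(fid);

% Read the whole file, split it into lines, and drop empty lines
% (the trailing newline yields one empty element).
lines = splitlines(string(fileread(fname)));
lines(strlength(lines) == 0) = [];

% Split every line on the tab character only; the result is a 7-by-5 string array.
parts = split(lines, sprintf('\t'));

% Convert the two numeric columns and assemble a table.
ids = str2double(parts(:, 1:2));
T = table(ids(:, 1), ids(:, 2), parts(:, 3), parts(:, 4), parts(:, 5));

delete(fname);
```

Note that split needs every line to have the same number of fields, so a malformed line in the real file would raise an error rather than be silently misread.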
Mathieu NOE
on 17 Dec 2020
Hello,
have you tried textscan?
For an obscure reason, I had to copy-paste the tab from the text file to get a correct output:
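% opt = {'Delimiter','tab','CollectOutput',true}; % KO (textscan expects the delimiter character itself or the escape '\t', not the name 'tab')
opt = {'Delimiter',' ','CollectOutput',true};% OK (the delimiter here is a literal tab character pasted from the file)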
fmt = '%f%f%s%s%s';
[fid,msg] = fopen('data_tab2.txt','rt');
assert(fid>=3,msg)
out = textscan(fid,fmt,opt{:})
fclose(fid);
gives me:
Mathieu NOE
on 17 Dec 2020
>> out{1}
ans =
144141 180738085
144141 180738086
144141 180738087
144141 180738090
144141 180738091
>> out{2}
ans =
5×3 cell array
{'two' } {'two' } {'mc' }
{'of' } {'of' } {'io' }
{'us' } {'us' } {'ppio2'}
{'Hollywood'} {'hollywood'} {'np1' }
{'Heartbeat'} {'heartbeat'} {'np1' }
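For what it's worth, here is a minimal sketch of the textscan route that passes the tab as the escape sequence '\t' instead of a pasted character. With the %s specifier (as opposed to %q) textscan does not give double quotes any special meaning, so the '"' row should come through as ordinary text. The temporary file name is made up for the example, and CollectOutput is left off so that each column lands in its own cell.

```matlab
% Recreate the sample data from the question in a temporary, tab-delimited file.
rows = { ...
    '144141', '180738085', 'two',       'two',       'mc'; ...
    '144141', '180738086', 'of',        'of',        'io'; ...
    '144141', '180738087', 'us',        'us',        'ppio2'; ...
    '144141', '180738088', '.',         '.',         '.'; ...
    '144141', '180738089', '"',         '"',         '"'; ...
    '144141', '180738090', 'Hollywood', 'hollywood', 'np1'; ...
    '144141', '180738091', 'Heartbeat', 'heartbeat', 'np1'};
fname = [tempname '.txt'];              % hypothetical file name
fid = fopen(fname, 'w');
for k = 1:size(rows, 1)
    fprintf(fid, '%s\t%s\t%s\t%s\t%s\n', rows{k, :});
end
fclose(fid);

% Read two numeric columns and three text columns, tab-delimited.
% '\t' is an escape sequence that textscan converts to a tab character.
[fid, msg] = fopen(fname, 'rt');
assert(fid >= 3, msg);
out = textscan(fid, '%f%f%s%s%s', 'Delimiter', '\t');
fclose(fid);

% out{1} and out{2} are numeric column vectors;
% out{3}, out{4}, out{5} are cell arrays of character vectors.
delete(fname);
```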
Mathieu NOE
on 17 Dec 2020
You can get the same result by combining importdata and split. The question still remains why the tab option is not working in importdata.
s = importdata('data_tab.txt','\t');
sp = split(s,' ');
yuval
on 21 Dec 2020
Answers (0)