Can't read formatted data (textread, textscan, others)

For the life of me, I can't figure out how to properly use textread, textscan and other similar formatted text functions. I'd like to read a file like this ('test2.txt'):
header1 header2 header3 header4
abc 1 2 3
def 4 5 6
ghi 7 8 9
And build a matrix = [1 2 3; 4 5 6; 7 8 9] and a cell array containing {abc; def; ghi}. From examples posted here and elsewhere, this should work:
fid = fopen('test2.txt');
data = textscan(fid,'%s %f %f %f','delimiter',' ','headerlines',1)
fclose(fid);
But it doesn't. Output:
data =
{1x1 cell} [0x1 double] [0x1 double] [0x1 double]
and the 1x1 cell contains just ''
I've since tried about a dozen other examples of this function and similar functions and haven't gotten any of them to work!
For example: http://www.mathworks.com/matlabcentral/answers/21810-reading-a-text-file - Using Jan's code and the OP's data which is formatted similarly to mine, I get the same problem as above: a bunch of empty cells/vectors.
Another recent post: http://www.mathworks.com/matlabcentral/answers/24995-simple-file-i-o-problem-help-needed - Same deal. Friedrich's solution doesn't give me the same output as he shows.
Finally, I copied/pasted the example in 'help textscan'. Same problem, except the first cell does contain some gibberish 'ÿþS'. What am I missing here? Thanks for your time.
SOLUTION (from the comments on Walter Roberson's answer): Notepad had defaulted to saving the file in its "Unicode" format, which is UTF-16 Little Endian. When saving a text file in Notepad, changing the "Encoding" option (near the bottom of the Save As dialog) from Unicode to either ANSI or UTF-8 resulted in proper code execution. Thank you!

 Accepted Answer

The gibberish 'ÿþ' is a byte order mark (bytes 255 254), which tells us that your file is encoded as UTF-16 Little Endian.
Please try opening it with
fopen('test2.txt', 'rt')
so that your file is treated as a text file rather than as a binary file.
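To check for a byte order mark directly rather than inferring it from the gibberish, something along these lines might help. It is only a sketch: the file name bom_demo.txt and the variables fname, head and enc are made up for illustration, and the setup lines write a tiny demo file so the snippet runs on its own. It recognizes only the three common BOMs and cannot identify files that have none.

```matlab
% Setup for this sketch only: write a tiny UTF-16LE file that starts with a BOM.
% 'bom_demo.txt' is a made-up name; point fname at your own file instead.
fname = 'bom_demo.txt';
fid = fopen(fname, 'w');
fwrite(fid, [255 254 104 0 105 0], 'uint8');   % BOM followed by 'hi'
fclose(fid);

% Read the first (up to) three bytes as raw numbers.
fid = fopen(fname, 'r');
head = fread(fid, 3, 'uint8=>double')';
fclose(fid);

% Compare against the common byte order marks.
if numel(head) >= 3 && isequal(head(1:3), [239 187 191])
    enc = 'UTF-8';
elseif numel(head) >= 2 && isequal(head(1:2), [255 254])
    enc = 'UTF-16LE';   % note: a UTF-32LE BOM starts with the same two bytes
elseif numel(head) >= 2 && isequal(head(1:2), [254 255])
    enc = 'UTF-16BE';
else
    enc = 'unknown (no BOM found)';
end
disp(enc)
```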

10 Comments

No dice. Same result. Note, I don't get that gibberish with the first test case, just an empty cell. Still get an empty cell.
There should be no need to use ' ' as the delimiter, and it could be that it is interfering with the parsing. By default textscan splits fields on white space (space, backspace and tab) and handles line endings separately.
OK, tried that. Also no change! The call now is:
fid = fopen('test2.txt','rt');
data = textscan(fid,'%s %f %f %f','headerlines',1)
Do you have a small file you could try with? If so, could you code
fid = fopen('test2.txt','r');
reshape( fread(fid, 'char=>uint8'), 1, [])
and show us the output of that ?
Sure.
Columns 1 through 22
255 254 104 0 101 0 97 0 100 0 101 0 114 0 49 0 32 0 104 0 101 0
Columns 23 through 44
97 0 100 0 101 0 114 0 50 0 32 0 104 0 101 0 97 0 100 0 101 0
Columns 45 through 66
114 0 51 0 32 0 104 0 101 0 97 0 100 0 101 0 114 0 52 0 13 0
Columns 67 through 88
10 0 97 0 98 0 99 0 32 0 49 0 32 0 50 0 32 0 51 0 13 0
Columns 89 through 110
10 0 100 0 101 0 102 0 32 0 52 0 32 0 53 0 32 0 54 0 13 0
Columns 111 through 130
10 0 103 0 104 0 105 0 32 0 55 0 32 0 56 0 32 0 57 0
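For anyone reading a dump like this: the leading 255 254 is the byte order mark, and every character after it occupies two bytes, low byte first, which is why a 0 follows each ASCII code. A minimal sketch of decoding the first few bytes by hand (the variable names bytes, body, codes and txt are just for illustration):

```matlab
% First 16 bytes of the dump above: BOM, then 'header1' at two bytes per character.
bytes = uint8([255 254 104 0 101 0 97 0 100 0 101 0 114 0 49 0]);

body  = bytes(3:end);                                         % drop the 2-byte BOM
codes = double(body(1:2:end)) + 256 * double(body(2:2:end));  % low byte + 256 * high byte
txt   = char(codes);

disp(txt)   % prints header1
```

A parser that reads the file one byte per character sees those interleaved zero bytes, so it never finds the plain "abc 1 2 3" pattern that the format string expects.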
Walter, you're definitely onto something. I used dlmwrite() to write a matrix, then read it back with textscan, and it works perfectly. There is no header and no mixed data (all %f), but it's the first time I've gotten a file to read correctly. So this has something to do with Notepad's formatting?
I have recreated the file here, but unfortunately I do not have access to MATLAB tonight to experiment with it.
In the meantime, you might want to see if Notepad has a way to save as plain text, or as UTF-8.
The file is currently in UTF-16 Little Endian for sure.
Walter, solved. The code now works if I save the file as ANSI or UTF-8, but NOT as Unicode (what it was, so that must be little endian?) and not as Unicode big endian. Thank you so much. Those are the only 4 options; which do you consider "plain text"? Thanks again.
"Plain text" is ASCII or ISO-8859-1.
Which MATLAB version are you using? I found a thread indicating a textscan issue in some earlier versions and showing a work-around: http://www.mathworks.com/matlabcentral/answers/16493-textscan-or-import-of-unicode-encoded-textfile
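If re-saving the file is not an option, one possible workaround is to read the 16-bit code units yourself and hand the resulting text to textscan as a character vector. This is a sketch under assumptions, not necessarily the approach from the linked thread: it assumes UTF-16 Little Endian with only basic (non-surrogate) characters, and the file name test2_utf16.txt and the variables str, names and M are made up for illustration. The first few lines only recreate a file like the one in the question so the snippet runs on its own.

```matlab
% Setup for this sketch only: recreate the question's data as UTF-16LE with a BOM.
txt = sprintf('header1 header2 header3 header4\r\nabc 1 2 3\r\ndef 4 5 6\r\nghi 7 8 9');
fid = fopen('test2_utf16.txt', 'w', 'ieee-le');
fwrite(fid, [65279, double(txt)], 'uint16');   % 65279 (0xFEFF) is the BOM
fclose(fid);

% Read the file as little-endian 16-bit units and convert them to characters.
fid = fopen('test2_utf16.txt', 'r', 'ieee-le');
str = fread(fid, Inf, 'uint16=>char')';
fclose(fid);

% Drop the BOM if present, and carriage returns so only \n line endings remain.
if ~isempty(str) && str(1) == char(65279)
    str(1) = [];
end
str(str == char(13)) = [];

% Parse the text exactly as in the question, but from the character vector.
data  = textscan(str, '%s %f %f %f', 'HeaderLines', 1);
names = data{1};     % cell array of the first-column strings
M     = [data{2:4}]; % numeric columns side by side
```

Re-saving as ANSI or UTF-8 remains the simpler fix when you control how the file is written.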

Asked:

M S
on 2 Jan 2012

Edited:

on 4 Sep 2015
