Weird spaces undetected by strfind
I am trying to locate a phrase called "Faceresponse.acc" in a large array. I used this code to get all of the text out of my .txt file:
fid = fopen('HWFC_Car_P New-1-1.txt', 'r');
for i = 1:6600
mystuff{i} = fgetl(fid);
end
fclose('all');
The text in the text file looks like "Faceresponse.acc", but for some reason when I look at some of the strings in 'mystuff', it's spaced out: 'F a c e r e s p o n s e'. So I found this phrase in mystuff{977}, made sure it was class "char", and did this:
b = mystuff{977};
strfind(b, 'F a c e')
It returned nothing. I tried just typing out 'F a c e r e s p o n s e' and assigning it to 'b', instead of using mystuff{977}, and strfind had no trouble locating the spaced out 'F a c e'. I also tried strfind for other things in the string, and it was fine. But it would NOT index the spaces between the letters.
So my question: what's going on here? What are the spaces between the letters if strfind is not identifying them as spaces?
Accepted Answer
More Answers (2)
This sounds like an encoding problem. To find out what encoding your installation of Matlab uses:
feature('DefaultCharacterSet')
To see the encoding of your file, you could inspect it yourself, but the best bet would probably be to ask the author of the program that generated the text file.
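One rough way to inspect the file yourself is to look at its first few bytes for a byte order mark (BOM). This is only a sketch: the sample file below is made up so the code runs on its own, and a file without a BOM will simply come back as 'unknown'.

```matlab
% Write a tiny sample file with a UTF-16LE BOM (made-up stand-in for the real file).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, [uint8([255 254]), unicode2native('Face', 'UTF-16LE')], 'uint8');
fclose(fid);

% Read the first bytes raw, without any character conversion.
fid = fopen(fname, 'r');
head = fread(fid, 4, '*uint8')';
fclose(fid);

% Compare against common BOM signatures.
if numel(head) >= 3 && isequal(head(1:3), uint8([239 187 191]))
    guess = 'UTF-8';
elseif numel(head) >= 2 && isequal(head(1:2), uint8([255 254]))
    guess = 'UTF-16LE';   % could also be UTF-32LE if two zero bytes follow
elseif numel(head) >= 2 && isequal(head(1:2), uint8([254 255]))
    guess = 'UTF-16BE';
else
    guess = 'unknown';    % no BOM found; the encoding must be determined another way
end
disp(guess)
delete(fname);
```

To try it on the real data, replace fname with the name of your own file and skip the part that writes the sample.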
If the encodings are different, then you could try changing the encoding of Matlab to the one of the file, e.g.:
feature('DefaultCharacterSet', 'UTF8')
But as Matt says, it is probably easier to pass additional arguments to fopen().
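A quick way to see what the "spaces" really are is to print the character codes of the string. The sketch below assumes the file is UTF-16 that was read one byte per character, so every other character is a null (code 0) rather than a space (code 32). That is a guess, since the question does not show the actual codes, and the string b here is built by hand to mimic mystuff{977}.

```matlab
% Hand-built stand-in for mystuff{977}: letters separated by null characters.
b = char([70 0 97 0 99 0 101 0]);    % 'F' 0 'a' 0 'c' 0 'e' 0

codes = double(b);                    % 70 0 97 0 99 0 101 0; a real space would be 32
idxSpaced = strfind(b, 'F a c e');    % empty, because char(0) is not ' '
idxNull = find(codes == 0);           % positions of the null characters: 2 4 6 8

cleaned = b(codes ~= 0);              % drop the nulls
idxFace = strfind(cleaned, 'Face');   % 1
```

On the real data, double(mystuff{977}) shows the actual codes, whatever they turn out to be.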
2 Comments
Walter Roberson
on 1 Nov 2012
The article is a fairly reasonable summary. It does make a little mistake right near the end, where it says that char() takes Unicode values; the mistake is that char() only takes Unicode values up to 65535 and does not provide any mechanism for code points above that. There are two Unicode-related routines that can be used to deal with code points above that or to deal with "code pages" or the like.
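The two routines are presumably native2unicode and unicode2native, which convert between raw bytes in a named encoding and MATLAB characters. A small sketch, with byte values made up for illustration:

```matlab
% Bytes of 'Face' encoded as UTF-16LE: each letter is followed by a zero byte.
raw = uint8([70 0 97 0 99 0 101 0]);

txt = native2unicode(raw, 'UTF-16LE');    % decode bytes into characters: 'Face'
back = unicode2native(txt, 'UTF-16LE');   % encode again; should reproduce raw

% A code point above 65535 (U+1F600), given as its four UTF-8 bytes.
emojiBytes = uint8([240 159 152 128]);
emoji = native2unicode(emojiBytes, 'UTF-8');
n = numel(emoji);                          % chars are 16-bit, so this is expected to be 2
roundTrip = unicode2native(emoji, 'UTF-8'); % should reproduce emojiBytes
```

The first half is the case from the question: decoding the bytes with the right encoding gives 'Face' with nothing between the letters.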
José-Luis
on 1 Nov 2012
Good to know. Thank you.
Matt J
on 1 Nov 2012
FOPEN lets you specify different encoding schemes for reading from the file. I'm guessing that you might need a different one from what you're using. Was the file created on the same platform as you're now using to read it?
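For reference, the encoding goes in the fourth argument of fopen. A minimal sketch follows; the sample file and the 'UTF-8' value are placeholders, since the real encoding of the file is not known from the question.

```matlab
% Create a small sample file so the sketch runs on its own (made-up content).
fname = [tempname '.txt'];
enc = 'UTF-8';                         % placeholder: use the file's real encoding
fid = fopen(fname, 'w', 'n', enc);
fprintf(fid, 'Header line\nFaceresponse.acc: 1\n');
fclose(fid);

% Open for reading with the encoding stated explicitly ('n' = native byte order).
fid = fopen(fname, 'r', 'n', enc);
if fid == -1
    error('Could not open %s', fname);
end
[~, ~, ~, encUsed] = fopen(fid);       % encoding MATLAB associated with this file

% Read until end of file instead of a fixed line count.
mystuff = {};
tline = fgetl(fid);
while ischar(tline)                    % fgetl returns -1 at end of file
    mystuff{end+1} = tline; %#ok<AGROW>
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

% Indices of the lines that contain the phrase.
hits = find(~cellfun(@isempty, strfind(mystuff, 'Faceresponse.acc')));
```

If fopen in your release does not accept the encoding name you need, reading the raw bytes with fread and decoding them with native2unicode is an alternative.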