"However what I could think of is that Matlab tries to guess the encoding"
I've had discussions with MathWorks support about this. Unfortunately, the whole process is not properly documented, which I told them can be a problem.
Indeed, if you open a file without specifying a character encoding, MATLAB will try to guess the file encoding the first time you either:
- use any character reading function such as fgetl, fgets, fscanf, etc.
- use fread with a 'char' or '*char' precision
- ask for the encoding with the multi-output version of fopen.
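As a rough illustration of the last trigger, here is a minimal sketch that opens a file without an encoding and then asks fopen what it settled on. The file name and contents are made up for the example, and the reported encoding will depend on your release and locale, so I'm not showing an expected value.

```matlab
% Create a small throwaway text file (hypothetical content for illustration).
fname = [tempname '.txt'];
fid = fopen(fname, 'w', 'n', 'UTF-8');
fprintf(fid, 'hello\n');
fclose(fid);

% Open WITHOUT specifying an encoding.
fid = fopen(fname, 'r');

% Multi-output form of fopen: filename, permission, machine format, encoding.
% According to the description above, this query is one of the calls that
% makes MATLAB run its encoding detection on the file.
[~, permission, machinefmt, encoding] = fopen(fid);
fprintf('Encoding reported: %s\n', encoding);

fclose(fid);
delete(fname);
```

On a tiny file like this the detection is instantaneous; the cost only becomes noticeable when the file is large.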
I haven't been given the full details of the character set detection process, but it does read the whole file, which can indeed be an issue for large files. If any byte sequence in the file is not valid UTF-8, the algorithm uses some heuristics to check whether it's a CJK encoding and, if it still doesn't match, it assumes the local encoding.
To prevent this autodetection from taking place, you have to specify an encoding when you fopen the file. If you don't know what the encoding is for your binary file, I'd suggest using 'US-ASCII'. As we mentioned in the comments, it's unlikely that a binary file uses UTF-8 unless it prefixes the text with a length.
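Something along these lines should do it. This is only a sketch: the file name, the 16-character header and the trailing bytes are invented for the example, so adapt them to your own format.

```matlab
% Build a small binary test file: a 16-character ASCII header followed by
% raw bytes that are not valid UTF-8 (hypothetical layout for illustration).
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, uint8('HEADER0123456789'), 'uint8');
fwrite(fid, uint8([255 254 0 128]), 'uint8');
fclose(fid);

% Open with an explicit encoding so MATLAB has nothing to guess.
% Arguments: filename, permission, machine format ('n' = native), encoding.
fid = fopen(fname, 'r', 'n', 'US-ASCII');

% Read exactly 16 characters as a 1x16 char row vector.
header = fread(fid, [1 16], '*char');

fclose(fid);
delete(fname);
```

Another option is to read the bytes as numbers with a 'uint8' precision and convert them yourself with char(). That precision is not among the triggers listed above, but I haven't had confirmation that it always avoids the detection, so specifying the encoding in fopen is the safer choice.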
Unfortunately, it's not easy to go back to the pre-R2020a behaviour of automatically using the native encoding, whatever it is, as R2020a has lost the ability to easily get the local encoding. On the other hand, relying on the native encoding when reading a binary file is asking for trouble.