native2unicode

Convert numeric bytes to Unicode character representation

Syntax

unicodestr = native2unicode(bytes)
unicodestr = native2unicode(bytes, encoding)

Description

unicodestr = native2unicode(bytes) converts a numeric vector, bytes, from the user default encoding to a Unicode® character representation. native2unicode treats bytes as a vector of 8-bit bytes, and each value must be in the range [0,255]. The output argument unicodestr is a character vector having the same general array shape as bytes.
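As a minimal sketch of the one-argument form, the following converts five byte values using the default encoding. It assumes the default encoding is ASCII-compatible (for example UTF-8 or windows-1252), so that byte values below 128 decode to the same characters; the variable names are illustrative only.

```matlab
% Byte values for the ASCII characters 'Hello'
bytes = uint8([72 101 108 108 111]);

% No encoding is given, so the user default encoding is used.
% Assumption: the default is ASCII-compatible, so these bytes
% decode to the same characters on any such system.
str = native2unicode(bytes);

disp(str)          % display the converted text
disp(class(str))   % the result is a character array
```

Because every byte here maps to exactly one character, str has the same size as bytes.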

unicodestr = native2unicode(bytes, encoding) converts bytes to a Unicode representation, assuming that bytes is in the character encoding scheme specified by encoding. encoding must be either empty ('') or a name or alias for an encoding scheme, such as 'UTF-8', 'latin1', 'US-ASCII', or 'Shift_JIS'. If encoding is unspecified or empty (''), the default encoding scheme is used. encoding can be a character vector or a string scalar.
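The effect of the encoding argument can be illustrated with a short sketch that decodes the same character, é, from two different byte sequences. The variable names are illustrative, and the last call assumes an ASCII-compatible default encoding.

```matlab
% The character 'é' encoded in two different schemes
latin1Bytes = uint8(233);          % single byte in latin1
utf8Bytes   = uint8([195 169]);    % two-byte sequence in UTF-8

fromLatin1 = native2unicode(latin1Bytes, 'latin1');
fromUtf8   = native2unicode(utf8Bytes, 'UTF-8');

% Both results hold the same Unicode code point, U+00E9
sameChar  = isequal(fromLatin1, fromUtf8);
codePoint = double(fromUtf8);

% An empty encoding falls back to the default encoding scheme.
% Assumption: the default is ASCII-compatible.
fromDefault = native2unicode(uint8([72 105]), '');
```

Note that the two UTF-8 bytes decode to a single character, so with multibyte encodings the output can have fewer elements than the input.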

Note

If bytes is a character vector or a string scalar, it is returned unchanged.
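A brief sketch of this pass-through behavior, using an illustrative character vector:

```matlab
% Input that is already text rather than numeric bytes
txt = 'already text';

% Character input is returned as is; no decoding takes place
out = native2unicode(txt, 'UTF-8');

unchanged = isequal(out, txt);
```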

Examples

This example begins with a vector of bytes in an unknown character encoding scheme. The user-written function detect_encoding determines the encoding scheme. If successful, it returns the encoding scheme name or alias as a character vector. If unsuccessful, it throws an error represented by an MException object, ME. The example calls native2unicode to convert the bytes to Unicode representation:

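try
    % detect_encoding is a user-written function, not part of MATLAB
    enc = detect_encoding(bytes);
    % Convert the bytes using the detected encoding
    str = native2unicode(bytes, enc);
    disp(str);
catch ME
    % Detection failed; pass the original error on to the caller
    rethrow(ME);
end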

Note that the computer must be configured to display text in a language represented by the detected encoding scheme for the output of disp(str) to be correct.
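The logic inside a function such as detect_encoding is up to the user. One possible approach, sketched here in script form, inspects only a leading byte order mark. This is a hypothetical, deliberately simplified rule: it recognizes three marks, ignores UTF-32, and the error identifier is made up for illustration.

```matlab
% Example input: UTF-8 byte order mark (EF BB BF) followed by 'Hi'
bytes = uint8([239 187 191 72 105]);
b = bytes(:).';                       % work on a row vector

% Hypothetical detection rule: inspect only the leading byte order mark
if numel(b) >= 3 && isequal(b(1:3), uint8([239 187 191]))
    enc = 'UTF-8';
    payload = b(4:end);
elseif numel(b) >= 2 && isequal(b(1:2), uint8([254 255]))
    enc = 'UTF-16BE';
    payload = b(3:end);
elseif numel(b) >= 2 && isequal(b(1:2), uint8([255 254]))
    enc = 'UTF-16LE';
    payload = b(3:end);
else
    % As in the example above, failure is reported as an error
    error('detect_encoding:unknownEncoding', 'No byte order mark found.');
end

% Decode only the bytes that follow the mark
str = native2unicode(payload, enc);
disp(str);
```

Real data often carries no byte order mark, so a practical detector would need additional heuristics or external metadata.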

Introduced before R2006a