Friday, 21 February 2014

Difference Between UTF-8 and UTF-16

UTF-8 vs UTF-16

In this article, I am going to cover the key points about what UTF is and the differences between UTF-8 and UTF-16.

What is UTF?

UTF stands for Unicode Transformation Format. It is a family of standards for encoding the Unicode character set into binary form. UTF was developed so that users have a standardized means of encoding characters using a minimal amount of space.
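
As a quick illustration, here is a minimal sketch in Java (using only the standard java.nio.charset API; the class name UtfSizeDemo is just an example) showing that the same text can take a different number of bytes depending on which UTF form is chosen:

    import java.nio.charset.StandardCharsets;

    public class UtfSizeDemo {
        public static void main(String[] args) {
            // "Hello" contains only ASCII characters, but its encoded size
            // still depends on which UTF form is used.
            String text = "Hello";

            byte[] utf8  = text.getBytes(StandardCharsets.UTF_8);
            byte[] utf16 = text.getBytes(StandardCharsets.UTF_16);

            System.out.println("UTF-8 bytes : " + utf8.length);  // 5  (1 byte per ASCII character)
            System.out.println("UTF-16 bytes: " + utf16.length); // 12 (2 bytes per character + a 2-byte BOM)
        }
    }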

UTF-8:

- UTF-8 is a variable-width encoding that can represent every character in the Unicode character set.
- UTF-8 was designed for backward compatibility with ASCII.
- UTF-8 is a byte-oriented format and therefore has no problems with byte-oriented networks or files.
- UTF-8 uses a minimum of 1 byte per character; characters outside the ASCII range take 2 to 4 bytes (see the sketch after this list).
- UTF-8 is also better at recovering from errors that corrupt portions of a file or stream, because it can resynchronize and decode from the next uncorrupted byte.
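
To make the variable-width and ASCII-compatibility points concrete, here is a minimal sketch in Java (the class name Utf8WidthDemo is just an example) that encodes characters from different Unicode ranges and prints how many UTF-8 bytes each one needs:

    import java.nio.charset.StandardCharsets;

    public class Utf8WidthDemo {
        public static void main(String[] args) {
            // Characters from different Unicode ranges need 1 to 4 bytes in UTF-8.
            String[] samples = { "A", "é", "€", "😀" };

            for (String s : samples) {
                byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
                System.out.printf("%s -> %d byte(s)%n", s, bytes.length);
            }
            // Expected output:
            // A -> 1 byte(s)   (ASCII, U+0041)
            // é -> 2 byte(s)   (U+00E9)
            // € -> 3 byte(s)   (U+20AC)
            // 😀 -> 4 byte(s)  (U+1F600)
        }
    }

Note that the single UTF-8 byte for "A" is identical to its ASCII byte, which is exactly what makes UTF-8 backward compatible with existing ASCII text.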

UTF-16:

- UTF-16 is a character encoding for Unicode capable of encoding all 1,112,064 valid code points in the Unicode code space, from 0 to 0x10FFFF.
- UTF-16 is not byte-oriented and needs to establish a byte order (usually signalled with a byte order mark, or BOM) to work with byte-oriented networks or files (see the sketch after this list).
- UTF-16 uses a minimum of 2 bytes per character.
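
The byte-order point can be seen by encoding the same character with the big-endian and little-endian forms of UTF-16. This is a minimal sketch in Java (the class name Utf16ByteOrderDemo and the helper printHex are just example names):

    import java.nio.charset.StandardCharsets;

    public class Utf16ByteOrderDemo {
        public static void main(String[] args) {
            String text = "A"; // U+0041

            byte[] bigEndian    = text.getBytes(StandardCharsets.UTF_16BE); // 00 41
            byte[] littleEndian = text.getBytes(StandardCharsets.UTF_16LE); // 41 00
            byte[] withBom      = text.getBytes(StandardCharsets.UTF_16);   // FE FF 00 41 (BOM, then big-endian)

            printHex("UTF-16BE    ", bigEndian);
            printHex("UTF-16LE    ", littleEndian);
            printHex("UTF-16 + BOM", withBom);
        }

        // Prints the bytes of an encoded string as hexadecimal values.
        private static void printHex(String label, byte[] bytes) {
            StringBuilder sb = new StringBuilder(label + ":");
            for (byte b : bytes) {
                sb.append(String.format(" %02X", b));
            }
            System.out.println(sb);
        }
    }

The byte order mark (0xFEFF) at the start of a UTF-16 stream is what tells the receiving side which byte order the sender used.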



Thanks,
Morgan
Software Developer
