Computer systems store data internally in binary: every character is represented as a sequence of 0s and 1s. The process of mapping characters to bit patterns is called encoding. A character encoding scheme matters because it lets the same text be stored, transmitted, and displayed consistently across different devices and platforms.
Types of Encoding
The following encoding schemes were in common use before the Unicode system:
ASCII, used in the United States
ISO 8859-1, used for Western European languages
KOI-8, used for Russian
GB18030 and BIG-5, used for Chinese
Why does Java use Unicode System?
The encoding techniques used before Unicode had several limitations. No single scheme contained enough characters to cover every language, the schemes conflicted with one another (the same byte value could map to different characters in different encodings), and text written on a system configured for one encoding could not be read reliably on a system configured for another.
These problems led to the search for a better character-encoding solution: the Unicode system.
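The conflict between legacy encodings can be seen in a minimal sketch. The class name EncodingConflictDemo and the sample character are illustrative assumptions; the example only uses charsets guaranteed to be present on every Java platform:

```java
import java.nio.charset.StandardCharsets;

public class EncodingConflictDemo {
    public static void main(String[] args) {
        // 'é' is a single byte (0xE9) in ISO-8859-1 ...
        byte[] bytes = "é".getBytes(StandardCharsets.ISO_8859_1);
        // ... but that byte is not valid US-ASCII, so decoding it with the
        // wrong charset loses the character: it becomes the Unicode
        // replacement character U+FFFD instead of 'é'.
        String decoded = new String(bytes, StandardCharsets.US_ASCII);
        System.out.println(decoded.equals("é")); // false
    }
}
```

The same bytes, interpreted under two different legacy encodings, yield two different texts, which is exactly the ambiguity Unicode was created to remove.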
What is Unicode System?
Unicode is a universal international character-encoding standard that assigns a unique number (a code point) to every character, regardless of platform, program, or language. Java adopted Unicode from the start: the primitive char type and the String class represent text as Unicode, which is why a Java program can handle characters from many languages uniformly.
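Java's built-in Unicode support can be sketched in a few lines; the class and variable names here are illustrative assumptions:

```java
public class UnicodeCharDemo {
    public static void main(String[] args) {
        // Every Java char is a Unicode code unit.
        char letterA = 'A';        // U+0041
        char omega = '\u03A9';     // U+03A9, Greek capital omega, via a \u escape
        System.out.println((int) letterA); // 65, the Unicode code point of 'A'
        System.out.println(omega);         // prints the omega character
    }
}
```

The `\uXXXX` escape works anywhere in Java source, so characters from any script can be written even in an ASCII-only source file.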
Program to convert UTF-8 to Unicode
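The listing below is a sketch reconstructed from the description that follows it; the names UnicodeDemo, str1, and newstr come from that description, while the sample string is an assumption:

```java
import java.nio.charset.StandardCharsets;

public class UnicodeDemo {
    public static void main(String[] args) {
        // A Unicode string containing non-ASCII characters
        String str1 = "\u0394 Unicode \u03B4";
        // Encode the string into a UTF-8 byte array
        byte[] utf8Bytes = str1.getBytes(StandardCharsets.UTF_8);
        // Decode the byte array back into a Unicode String
        String newstr = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println(newstr);
    }
}
```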
In the above code, a class UnicodeDemo is created. First, a Unicode String str1 is encoded into a UTF-8 byte array using the getBytes() method. That byte array is then decoded back into a Unicode String, and the value of newstr is printed to the console.
Problem Caused by Unicode
The Unicode standard was originally designed as a 16-bit character encoding, and Java's primitive data type char was meant to be able to hold any character. However, 16 bits can represent only 65,536 distinct values, which turned out to be insufficient for all the characters in use around the world.
So, the Unicode system was extended to cover 1,112,064 characters. Characters whose code points do not fit in 16 bits (those above U+FFFF) are called supplementary characters, and Java represents each of them with a pair of char values known as a surrogate pair.
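A surrogate pair can be observed directly, as in this minimal sketch; the class name SupplementaryDemo and the sample code point U+1F600 (an emoji) are illustrative assumptions:

```java
public class SupplementaryDemo {
    public static void main(String[] args) {
        // U+1F600 lies above U+FFFF, so Java stores it as a
        // surrogate pair of two char values.
        String s = new String(Character.toChars(0x1F600));
        System.out.println(s.length());                      // 2 -> two char code units
        System.out.println(s.codePointCount(0, s.length())); // 1 -> one character
        System.out.println(Integer.toHexString(s.codePointAt(0))); // 1f600
    }
}
```

This is why, for supplementary characters, String.length() counts char units rather than characters; code-point-aware methods such as codePointAt() and codePointCount() must be used instead.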
In this article, we discussed the basic encoding methods that preceded Unicode, the Unicode system in Java, the problem caused by Unicode's original 16-bit design, and a Java program demonstrating the use of the Unicode system.