HTML Charset

HTML charset is also called the HTML character set or HTML encoding. To display an HTML page correctly, a web browser must know which character set (character encoding) the page uses.

HTML Character Encoding

The most common character encodings are described below:

ASCII Character Set

ASCII stands for American Standard Code for Information Interchange. It was the first character encoding standard used on the web. ASCII defines 128 characters that can be used on the internet: digits (0-9), English letters (A-Z and a-z), special characters such as ! $ + - ( ) @ < >, and a set of non-printing control characters.

The main problem with ASCII is its limited range: with only 128 characters, it cannot represent accented letters or non-Latin scripts.

ANSI Character Set

ANSI stands for American National Standards Institute. In this context, "ANSI" is the informal name for Windows-1252, a character set that extends standard ASCII to 256 character codes. It was the default character set for Windows up to Windows 95.

ISO-8859-1 Character Set

ISO-8859-1 was the default character encoding in HTML 2.0. It is also an extension of the ASCII standard, adding international characters. It uses a full byte (8 bits) for each character.

UTF-8 Character Set

UTF-8 is a variable-width character encoding that covers almost all of the characters and symbols in the world. Each character takes one to four bytes, and ASCII characters keep their single-byte values. By contrast, ANSI (Windows-1252), the original Windows character set, supported only 256 different character codes.

ISO-8859-1 was the default character set for HTML 4. It likewise supported only 256 different character codes.

Why is UTF-8 also supported in HTML 4?

Because ANSI and ISO-8859-1 were so limited, HTML 4 also supported UTF-8. The default character encoding for HTML5 is UTF-8.

UTF-8 syntax for HTML4:
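
In HTML 4, the encoding is typically declared with a meta element placed inside the head of the page. The line below is a minimal sketch of that declaration, not a complete document:

```html
<!-- HTML 4 style: the charset is given as part of the Content-Type value -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
```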

UTF-8 syntax for HTML5:
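
In HTML5, the declaration is shortened to a single charset attribute. The sketch below shows it in a small complete page. The title and paragraph text are illustrative only, and the example assumes the file itself is saved as UTF-8:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- HTML5 style: declare the encoding early in the head -->
  <meta charset="UTF-8">
  <title>UTF-8 example</title>
</head>
<body>
  <!-- Sample text mixing Latin, accented, Japanese, and symbol characters -->
  <p>Grüße, こんにちは, €</p>
</body>
</html>
```

Placing the meta element first in the head is a common choice, because browsers look for the declaration near the start of the file.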
