UTF-8 is a way to encode Unicode characters using a variable number of bytes per character (between one and four). This is known as a multi-byte encoding scheme. UTF-8 is the most widely used encoding scheme for HTML pages on the web.
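To make the "variable number of bytes" idea concrete, here is a minimal Python sketch. The sample characters and the helper name `utf8_length` are chosen purely for illustration; they are not part of any library.

```python
# Sample code points chosen for illustration; each needs a different
# number of bytes in UTF-8. Escapes are used so the file itself stays ASCII.
SAMPLES = [
    "A",           # U+0041, Latin capital letter A
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, grinning face emoji
]


def utf8_length(char: str) -> int:
    """Return how many bytes UTF-8 uses to encode the given character."""
    return len(char.encode("utf-8"))


def describe(char: str) -> str:
    """Format the code point, byte count and raw bytes of one character."""
    encoded = char.encode("utf-8")
    return f"U+{ord(char):04X}: {len(encoded)} byte(s), hex {encoded.hex()}"


for ch in SAMPLES:
    print(describe(ch))
```

Plain ASCII characters take a single byte, which is why an ASCII-only file is already valid UTF-8, while accented letters, many symbols and emoji take two, three and four bytes respectively.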
When writing your HTML, or the script that generates it (PHP, Python, etc.), set the encoding in your text editor to UTF-8. You then need to tell the browser that the HTML it receives is encoded as UTF-8. There are two ways of doing this.
Firstly, set the charset in the Content-Type HTTP response header, e.g.
Content-Type: text/html; charset=utf-8
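As a rough sketch of how a script might send that header, the following uses Python's standard `http.server` module. The page content, the handler name `Utf8Handler`, the helper `encode_page`, the `run` function and port 8000 are all assumptions made for illustration; a real application would more likely set the header through its web framework or server configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content, used only for illustration.
PAGE = (
    "<!DOCTYPE html><html><head><title>Caf\u00e9</title></head>"
    "<body>\u20ac5</body></html>"
)

# The header value shown in the text above.
CONTENT_TYPE = "text/html; charset=utf-8"


def encode_page(page: str) -> bytes:
    """Encode the body as UTF-8 so the bytes match the declared charset."""
    return page.encode("utf-8")


class Utf8Handler(BaseHTTPRequestHandler):
    """Serve PAGE on every GET request with an explicit UTF-8 charset."""

    def do_GET(self):
        body = encode_page(PAGE)
        self.send_response(200)
        self.send_header("Content-Type", CONTENT_TYPE)
        # Content-Length counts bytes, not characters.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Start a local test server; the port number is an arbitrary choice."""
    HTTPServer(("localhost", port), Utf8Handler).serve_forever()
```

Calling `run()` starts the server locally. The header only describes the bytes, so the body must actually be encoded as UTF-8 as well, which is what `encode_page` does here.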
Secondly, include the charset within the HTML document itself. The recommended way to do this in an HTML5 document is to use a meta tag early on (it must appear within the first 1024 bytes of the document), like so:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
...
Warning: if these charset values don’t match, the browser will prioritise the charset defined in the HTTP header over any charset defined within the document itself.
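The precedence described in the warning can be sketched in Python as follows. This is a deliberately simplified illustration, not the full detection procedure browsers implement (which also considers things such as byte order marks); the function names and the regular expression are assumptions made for this example.

```python
import re
from email.message import Message

# Simplified pattern for <meta charset="..."> only; real HTML parsing
# handles more forms than this sketch does.
META_CHARSET_RE = re.compile(r"<meta\s+charset=[\"']?([\w-]+)", re.IGNORECASE)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent


def charset_from_meta(html):
    """Return the charset declared in a <meta charset> tag, or None."""
    match = META_CHARSET_RE.search(html)
    return match.group(1).lower() if match else None


def effective_charset(content_type, html):
    """Prefer the HTTP header charset; fall back to the meta tag."""
    return charset_from_header(content_type) or charset_from_meta(html)


# A mismatch: the header wins over the meta tag.
print(effective_charset("text/html; charset=ISO-8859-1", '<meta charset="UTF-8">'))
# No charset in the header: the meta tag is used.
print(effective_charset("text/html", '<meta charset="UTF-8">'))
```

In the first call the page would be decoded as ISO-8859-1 despite its meta tag, which is exactly the kind of mismatch that produces garbled characters. Keeping both declarations set to UTF-8 avoids the problem.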