UTF-8
UTF-8 is a way to encode Unicode characters using a variable number of bytes (one to four) per character. This is known as a multi-byte encoding scheme. UTF-8 is the most widely used encoding scheme for HTML pages on the web.[1]
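As a rough illustration of the variable byte count, the following Python sketch encodes four sample characters and records how many bytes each one needs. The sample characters and the name byte_lengths are chosen here for illustration only.

```python
# Four sample characters, written as escapes so the source file's own
# encoding does not matter: A, e-acute, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to the number of bytes in its UTF-8 encoding.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
```

ASCII characters such as "A" stay at one byte, which is why plain English text is the same size in UTF-8 as in ASCII.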
Using UTF-8
When writing your HTML, or the code that generates it (PHP, Python, etc.), set the encoding in your text editor to UTF-8. You then need to tell the browser that the HTML it receives is encoded as UTF-8. There are two ways of doing this. Firstly, set the Content-Type HTTP response header, e.g.
Content-Type: text/html; charset=utf-8
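How the header gets set depends on your server or framework. As one possible sketch, here is a minimal WSGI application using only Python's standard library. The names PAGE, app and serve, and the port number, are assumptions made for this example rather than anything the wiki prescribes.

```python
from wsgiref.simple_server import make_server

# Hypothetical page content; the escapes are non-ASCII characters.
PAGE = (
    "<!DOCTYPE html>\n"
    "<html>\n<head>\n<meta charset=\"UTF-8\">\n"
    "<title>caf\u00e9</title>\n</head>\n"
    "<body>caf\u00e9 \u2603</body>\n</html>\n"
)

def app(environ, start_response):
    # Encode the text as UTF-8 bytes and declare that same charset
    # in the Content-Type response header.
    body = PAGE.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]

def serve(port=8000):
    # Call serve() to try the page locally; it blocks until interrupted.
    with make_server("localhost", port, app) as server:
        server.serve_forever()
```

The important part is that the bytes sent and the charset declared agree: the body is encoded with the same encoding that the header names.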
Secondly, include the charset within the HTML document itself. The recommended way to do this in an HTML5 document is to use a meta tag early on (within the first 1024 bytes of the document), like so:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
...
Warning: if these charset values don’t match, the browser will prioritise the charset defined in the HTTP header over any charset defined within the document itself.
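One way to catch such a mismatch before a browser does is to compare the two declarations yourself. The Python sketch below is a simplified check under stated assumptions: it only recognises the meta charset form (not the older http-equiv form), and the helper names header_charset, meta_charset and charsets_agree are invented for this example.

```python
from email.message import Message
from html.parser import HTMLParser

def header_charset(content_type):
    # Reuse the standard library's header parsing to pull out the
    # charset parameter; returns it lowercased, or None if absent.
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

class MetaCharsetFinder(HTMLParser):
    # Records the first <meta charset="..."> value it encounters.
    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.charset is None:
            value = dict(attrs).get("charset")
            if value:
                self.charset = value.strip().lower()

def meta_charset(html_text):
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset

def charsets_agree(content_type, html_text):
    # Treat a missing declaration on either side as "no conflict".
    from_header = header_charset(content_type)
    from_meta = meta_charset(html_text)
    return from_header is None or from_meta is None or from_header == from_meta
```

For example, charsets_agree("text/html; charset=iso-8859-1", '<meta charset="UTF-8">') returns False, flagging a page whose header would override its meta tag.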