UTF-8
UTF-8 is a way to encode Unicode characters using a variable number of bytes per character, from one to four. This is known as a multi-byte encoding scheme. UTF-8 is the most widely used encoding scheme for HTML pages on the web.[1]
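To make the "variable number of bytes" idea concrete, here is a small Python sketch. The helper name `utf8_length` and the sample characters are chosen for illustration only.

```python
# Illustrative sketch: how many bytes does one character take in UTF-8?
# `utf8_length` is a hypothetical helper name, not part of any library.

def utf8_length(ch: str) -> int:
    """Return the number of bytes used to encode `ch` in UTF-8."""
    return len(ch.encode("utf-8"))

samples = {
    "a": "Latin small letter a",
    "\u00e9": "e with acute accent",
    "\u20ac": "euro sign",
    "\U0001F600": "grinning face emoji",
}

for ch, description in samples.items():
    encoded = ch.encode("utf-8")
    # Print the description, the byte count, and the bytes in hex.
    print(f"{description}: {utf8_length(ch)} byte(s), {encoded.hex(' ')}")
```

Plain ASCII characters take one byte, which is why existing ASCII text is already valid UTF-8. Accented letters typically take two, most other scripts and symbols three, and emoji four.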
Why
You should use UTF-8 in your text editor or whatever tools you use to create posts so you can more easily type and paste characters like curly quotes or names with accent marks and have them show up properly on your site.
How to
Using UTF-8
When writing your HTML, or the code that generates it (PHP, Python, etc.), set the encoding in your text editor to UTF-8. Then tell the browser that the HTML it receives is encoded as UTF-8. There are two ways of doing this. First, set the charset in the Content-Type HTTP response header, e.g.
Content-Type: text/html; charset=utf-8
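As a rough illustration of the header approach, the following sketch uses Python's standard-library `wsgiref` module. The names `PAGE`, `app`, and `serve`, and the port number, are assumptions made for this example; your own site software will set the header in its own way.

```python
# Minimal sketch, not a production server: send HTML with an explicit
# UTF-8 charset in the Content-Type response header.
from wsgiref.simple_server import make_server

# Hypothetical page content containing non-ASCII characters.
PAGE = (
    "<!DOCTYPE html><html><head><title>Caf\u00e9</title></head>"
    "<body><p>\u201cCurly quotes\u201d and na\u00efve accents.</p></body></html>"
)

def app(environ, start_response):
    # Encode the text as UTF-8 bytes so it matches the declared charset.
    body = PAGE.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

def serve(port=8000):
    # Blocks until interrupted; the port number is an arbitrary choice.
    with make_server("localhost", port, app) as server:
        server.serve_forever()
```

Calling `serve()` and then loading `http://localhost:8000/` should show the header in your browser's developer tools. The important detail is that the bytes actually sent are encoded with the same charset that the header declares.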
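Second, include the charset within the HTML document itself. The recommended way to do this in an HTML5 document is to use a meta tag early in the head (the HTML standard requires it to appear within the first 1024 bytes of the document), like so: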
<!DOCTYPE html> <html> <head> <meta charset="UTF-8"> ...
Warning: if these charset values don’t match, the browser will prioritise the charset defined in the HTTP header over any charset defined within the document itself.
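One way to catch such a mismatch before a browser does is to compare the two declarations yourself. The sketch below uses only Python's standard library. The names `MetaCharsetFinder`, `header_charset`, and `effective_charset` are hypothetical, and the logic is a deliberate simplification of what browsers do: it ignores other signals such as a byte order mark.

```python
# Simplified sketch: compare the charset in a Content-Type header with
# the one declared by <meta charset> in the document.
from email.message import Message
from html.parser import HTMLParser

class MetaCharsetFinder(HTMLParser):
    """Record the first <meta charset="..."> value seen, lowercased."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.charset is None:
            value = dict(attrs).get("charset")
            if value:
                self.charset = value.strip().lower()

def header_charset(content_type):
    """Extract the charset parameter from a Content-Type value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent

def effective_charset(content_type, html_text):
    """Return (charset that takes priority, whether the two disagree)."""
    from_header = header_charset(content_type)
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    from_meta = finder.charset
    mismatch = bool(from_header and from_meta and from_header != from_meta)
    # The header takes priority over the in-document declaration.
    return (from_header or from_meta), mismatch

html_text = '<!DOCTYPE html><html><head><meta charset="UTF-8">'
print(effective_charset("text/html; charset=ISO-8859-1", html_text))
```

For the mismatched input shown, the header value takes priority and the mismatch flag is set, which is the situation the warning above describes.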