language

From IndieWeb

language may refer to human (or natural) languages or computer (often programming) languages.

If you are looking for wiki pages in other languages, see:

Why mark up

Before you mark up your page with language tags, consider why you are doing so. Don't mark up just because you can.

Filtering

When you mark up the h-entry of a post with a lang attribute, you enable users of a reader to filter out languages they don't speak. This makes it possible to follow someone only in the languages you understand.

Pelle Wessman on chat: "on Twitter I often don't follow people that tweet too much in a language I don't understand and I hold back on tweeting in swedish because I know it might likewise annoy others"

Twitter does filter on language in Search, but not on the timeline.
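
As a rough illustration of what such filtering could look like in a reader, the sketch below collects the effective language of each h-entry on a page and keeps only the wanted ones. It is a minimal sketch, not how any particular reader works: the names EntryLanguages, entry_languages and matches are made up for this example, the sample page is invented, and it assumes well-formed HTML in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class EntryLanguages(HTMLParser):
    """Collect the effective language of every h-entry in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.lang_stack = [""]  # inherited language; "" means unknown
        self.entries = []       # one effective language per h-entry, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # lang is inherited: an element without its own value uses its parent's.
        lang = attrs.get("lang") or self.lang_stack[-1]
        if "h-entry" in (attrs.get("class") or "").split():
            self.entries.append(lang)
        if tag not in VOID:
            self.lang_stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.lang_stack) > 1:
            self.lang_stack.pop()


def entry_languages(html):
    """Return the effective language tag of each h-entry found in html."""
    parser = EntryLanguages()
    parser.feed(html)
    parser.close()
    return parser.entries


def matches(lang, wanted):
    """True if a tag such as 'en-GB' falls under a wanted primary language such as 'en'."""
    return lang.split("-")[0].lower() in wanted


# Invented sample page: the first entry inherits "en" from the <html> element.
PAGE = """
<html lang="en">
  <body>
    <article class="h-entry"><p class="e-content">Hello world</p></article>
    <article class="h-entry" lang="sv"><p class="e-content">Hej världen</p></article>
    <article class="h-entry" lang="en-GB"><p class="e-content">Colourful</p></article>
  </body>
</html>
"""

langs = entry_languages(PAGE)
readable = [lang for lang in langs if matches(lang, {"en"})]
print(langs)     # ['en', 'sv', 'en-GB']
print(readable)  # ['en', 'en-GB']
```

Matching on the primary subtag only is a design choice for this sketch: a reader who asks for English probably wants en-GB and en-US posts as well.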

Screen readers / text to speech

When someone uses a screen reader, the marked up language can be used to select the right pronunciation rules.

  • This post by Sebastiaan Andeweg is a Dutch transcription of English and would thus be best marked up as 'nl', to guide screen readers toward the right pronunciation.
  • Martijn van der Ven used to mark up his name with lang="nl" to guide screen readers towards the right pronunciation of his name.

Translations

Translation software can translate certain posts or texts if it knows the language.

  • Most translation software can probably detect the language too?


How to mark up

You can specify the language of an HTML document, or of a part of it, with the lang="??" attribute, where ?? is the language code for your language. For English, this is en, en-GB or en-US.

HTML also allows you to mark the language of the target of a hyperlink using the hreflang attribute.

HTML5 also introduced a translate attribute that lets you specify that a piece of text should not be automatically translated.
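
To make the three attributes concrete, the sketch below embeds a small sample document that uses lang, hreflang and translate, and lists the language hints a consumer could read from it. The sample markup, the example.com URL and the LanguageHints class are assumptions made for illustration. Putting translate="no" on a name is just one plausible use.

```python
from html.parser import HTMLParser

# Invented sample markup showing all three attributes in context.
SNIPPET = """
<html lang="en">
  <body>
    <p>My name is <span lang="nl" translate="no">Martijn van der Ven</span>.</p>
    <p><a href="https://example.com/nl/" hreflang="nl">Read this post in Dutch</a></p>
  </body>
</html>
"""


class LanguageHints(HTMLParser):
    """Record every lang, hreflang and translate attribute (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hints = []  # (tag, attribute, value) tuples in document order

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("lang", "hreflang", "translate"):
                self.hints.append((tag, name, value))


parser = LanguageHints()
parser.feed(SNIPPET)
parser.close()
for hint in parser.hints:
    print(hint)
# ('html', 'lang', 'en')
# ('span', 'lang', 'nl')
# ('span', 'translate', 'no')
# ('a', 'hreflang', 'nl')
```

Note that lang describes the content of the element itself, while hreflang describes the document the link points to.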

  • There are thoughts on how to parse lang in Microformats.

Language detection

Christian Weiske uses language detection on a blog post's title to automatically set the lang attribute of the post's <html lang="??"> element.
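
The general idea can be sketched as follows. This is not Christian Weiske's implementation. It is a deliberately naive stopword count, and the word lists, the detect_language and html_open_tag names, and the fallback to "en" are all assumptions made for illustration. Titles are short, so a real setup would likely want a proper detection library and a manual override.

```python
import html
import re

# Tiny hand-picked stopword lists: an assumption for illustration only.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "in", "is", "a", "for", "with", "on"},
    "nl": {"de", "het", "een", "en", "van", "is", "op", "voor", "met", "niet"},
    "de": {"der", "die", "das", "und", "ist", "ein", "eine", "mit", "für", "nicht"},
}


def detect_language(title, default="en"):
    """Guess a language code by counting stopword hits in the title."""
    words = re.findall(r"\w+", title.lower())
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # With no hits at all there is nothing to go on, so fall back to the default.
    return best if scores[best] > 0 else default


def html_open_tag(title):
    """Build the opening <html> tag with the detected language."""
    lang = html.escape(detect_language(title), quote=True)
    return '<html lang="{}">'.format(lang)


print(html_open_tag("Een nieuwe versie van het weblog"))  # <html lang="nl">
print(html_open_tag("The state of the IndieWeb"))         # <html lang="en">
```

Running the detection at publish time means the result can be stored with the post and corrected by hand if the guess is wrong.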

FAQ

Q: Why detect instead of adding manually?

  • Less tedious, less prone to errors

Q: Why detect yourself, if others can detect too?

  • Because sometimes they don't detect the language themselves, yet still act on the lang attribute.
  • Detect once while publishing vs. detect again and again and again and again

See Also