Different language in title and content of the ABBR HTML tag

Suppose I'm writing an article in HTML. The language of the article is Swedish, so I have <html lang="sv">. Now I want to mark up the abbreviation properly in the following text:

HTML kan användas till mycket.

To this end, I first do

<abbr title="HyperText Markup Language">HTML</abbr> kan användas till mycket.

This alone is not good enough, however, because the title attribute inherits the document language, Swedish (sv), even though its text is English. Besides being a theoretical problem, this will make screen readers pronounce the title in a highly awkward way. To remedy this, I could do

<abbr title="HyperText Markup Language" lang="en">HTML</abbr> kan användas
  till mycket.

This is even worse, though, since now the abbreviation 'HTML' will be read in English instead of Swedish [so from a Swedish point of view, it will sound like "ejtsch-ti-emm-ell" instead of "hå-te-emm-ell"].

Hence, the abbreviation, or the text contents of the abbr node, should be in Swedish, but the title attribute should be in English. What is the preferred (HTML5) way of marking this up? Is it

<abbr title="HyperText Markup Language" lang="en">
  <span lang="sv">HTML</span>
</abbr> kan användas till mycket.

?

Alta answered 30/7, 2013 at 12:3 Comment(4)
If that works the way you want, by all means use it. (I don't have a Swedish text reader, so I can't check if it works correctly!) – Prostrate
Mr Lister: I don't have any screen reader at all to test with... No, it is the other way around. acronym is obsolete; you use abbr for all abbreviations today (I just read the HTML5 spec from top to bottom). – Alta
By the way, there are almost 200,000 questions about HTML here, so I think you're in the right place. – Prostrate
Oops, sorry about the acronym. Brain fart! – Prostrate

Your conclusion is correct: In language markup in HTML, you cannot indicate the content of an element as being in a language other than its attribute values, since the lang attribute sets both of them. And the workaround is the one you have found: use inner markup for the content. There’s no difference here between HTML 4 and HTML5.

However, this is a very theoretical issue.

First, the abbr markup is almost useless in practice. Abbreviations should be explained, when needed, in normal text content, not in attributes. Speech browsers may optionally read title attribute values, but in normal mode, they ignore them – people using speech browsers prefer fast reading and are often accustomed to rather high speech rates, and spelling out abbreviations would disturb this.
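As a minimal sketch of explaining the abbreviation in normal text content instead (reusing the Swedish sentence from the question; the surrounding p element is only an assumption for illustration), the expansion can be written out on first use, with the language tag on the expansion alone:

```html
<!-- The abbreviation inherits lang="sv" from the html element;
     only the spelled-out expansion is marked as English. -->
<p>
  HTML (<span lang="en">HyperText Markup Language</span>) kan användas till mycket.
</p>
```

Because the expansion is element content rather than an attribute value, the abbreviation and its explanation can each carry their own language without nested markup.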

Second, “abbreviations” like “HTML” (which is really a proper name rather than anything else) should seldom be spelled out in speech. You wouldn’t want to hear speech like “The new version of HyperText Markup Language is HyperText Markup Language five, which has many extensions to HyperText Markup Language four.”

Third, language markup is largely write-only. In most situations, it is just ignored. Google does not care. Browsers may use it to decide on the default font, but most pages specify their own fonts, so the defaults don’t matter. Some speech browsers may recognize a few languages from lang attributes, but most of them don’t: they read the content by the rules for the language selected by the user. Those that use language markup may make a distinction between British and US English, so if you still think language markup is relevant, consider using lang="en-GB" in this context. (I’m assuming that most Swedish-speaking people would find Received Pronunciation more understandable and natural than Standard American, but I might be wrong.)
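If language markup is used anyway, the inner-markup workaround from the question combined with a regional subtag might look like the following sketch. Whether a given speech browser honors either lang value has not been tested here.

```html
<!-- lang="en-GB" applies to the title attribute of abbr;
     the inner span switches the visible text back to Swedish. -->
<abbr title="HyperText Markup Language" lang="en-GB"><span lang="sv">HTML</span></abbr>
kan användas till mycket.
```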

Parenthesize answered 30/7, 2013 at 13:10 Comment(4)
Thank you for your answer, and +1. I just want to add one thing: I believe that marking up abbreviations is not only, maybe not even mainly, for the benefit of screen readers. The way I see it, it may be very useful to people not familiar with the terminology (if the page mainly targets people that are familiar with it), especially if there isn't a bijection between the set of abbreviations and spelled-out texts. For instance, does 'CAD' stand for 'Coronary artery disease' or 'Computer-aided design'? If the text targets computer-savvy people or medical pros, you might not spell it out. – Alta
In addition, even if you (basically) know what the abbreviation stands for, you might want to check the precise spelling of the text, and then it is convenient to have it in the title. – Alta
The implementations of the title attribute in browsers are lousy (tiny text that disappears etc.), and authors are more and more using “CSS tooltips” instead (e.g., using an element that is initially hidden but becomes visible when some element is moused over). Such approaches do not have the language markup problem discussed here, since text content does not appear in an attribute but as element content. – Parenthesize
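A rough sketch of such a "CSS tooltip" is shown below. The class names abbr-tip and expansion are hypothetical, and the styling is deliberately minimal. Note that display: none typically hides the expansion from screen readers as well until it is shown.

```html
<style>
  /* Hypothetical class names, for illustration only. */
  .abbr-tip { position: relative; }
  .abbr-tip .expansion {
    display: none;            /* hidden until hover or keyboard focus */
    position: absolute;
    left: 0;
    top: 1.5em;
    border: 1px solid black;
    padding: 2px;
    background: white;
    white-space: nowrap;
  }
  .abbr-tip:hover .expansion,
  .abbr-tip:focus .expansion { display: block; }
</style>

<p>
  <!-- tabindex="0" lets keyboard users reach the tooltip too. -->
  <span class="abbr-tip" tabindex="0">HTML<span class="expansion" lang="en">HyperText Markup Language</span></span>
  kan användas till mycket.
</p>
```

The expansion is ordinary element content here, so it takes its own lang="en" while the abbreviation stays Swedish.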
I agree on the issues with the current implementations. But if you use a hidden div instead, you lose semantics (unless you do something complicated to regain it). A different solution is to still use title and something like abbr[title]:hover:after { content:attr(title); border:1px solid black; padding:2px } (but preferably better), but I'm a big fan of the 'KISS' principle. – Alta
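Spelled out as a complete fragment, the title-based rule from the comment above could look like this sketch. The positioning and background declarations are additions for readability, not part of the original suggestion.

```html
<style>
  abbr[title] { position: relative; }
  /* Show the title attribute as generated content on hover. */
  abbr[title]:hover::after {
    content: attr(title);
    position: absolute;       /* added: keep the box out of the text flow */
    left: 0;
    top: 1.5em;
    border: 1px solid black;
    padding: 2px;
    background: white;        /* added: keep the text legible */
    white-space: nowrap;
  }
</style>

<p>
  <abbr title="HyperText Markup Language">HTML</abbr> kan användas till mycket.
</p>
```

This keeps the abbr semantics, but the generated text still comes from the title attribute, so the language question from the original post still applies, and the browser's native tooltip may appear alongside it.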
