Internationalization: Strings as locales argument
ecmascript at norbertlindenberg.com
Tue Jul 17 12:26:29 PDT 2012
Let's phrase this as clear alternatives for what the spec could say:
(1) Don't say anything about empty strings specifically. With the change discussed earlier, where a string is mapped to an array containing the string, empty strings will result in a RangeError because they're not valid BCP 47 language tags.
Pro: This lets application developers decide how the empty string should be handled, possibly using different choices for HTML/XML contexts and other contexts. We can pick the more popular answer in a later edition of the spec after seeing what application developers do.
Con: Application developers have to deal with this case. Some will forget to do so and be surprised by exceptions.
(2) Treat an empty string like undefined. Constructors will use the default locale; supportedLocalesOf will return an empty list.
Pro: Applications don't get exceptions for this case; we're mapping the HTML/XML notion of "we don't know the locale" to our notion of "we don't know the locale".
Con: The default behavior may not be appropriate in all cases. We deviate from BCP 47 also in cases where no HTML/XML is involved.
(3) Require that implementations interpret an empty string as a request for a (mostly) language independent locale.
Pro: This may be what applications want to happen when using an empty string; it may be appropriate in the case where HTML/XML don't provide a lang attribute.
Con: The spec doesn't require support for a language independent locale, and doesn't say what it would look like, so this doesn't really guarantee anything to application developers.
I'm leaning towards (1).
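
To make the trade-off concrete, here is a rough sketch of what application code might look like under (1). It assumes an implementation that maps a string to a one-element array and rejects invalid tags with a RangeError, as described above. The names probe and localesFromLang are hypothetical helpers made up for this illustration; they are not part of the API or the spec draft.

```javascript
// Hypothetical helper (not part of the API): reports how one of the
// Intl constructors reacts to a given locales argument.
function probe(locales) {
  try {
    return new Intl.NumberFormat(locales).resolvedOptions().locale;
  } catch (e) {
    return e.name; // "RangeError" for "" under alternative (1)
  }
}

// Hypothetical application-level policy for a value scraped from an
// HTML/XML lang attribute. The caller decides what "unknown" means:
//   undefined -> fall back to the default locale, as in alternative (2)
//   "und"     -> request a language-independent locale, as in (3);
//                what the implementation resolves it to is not guaranteed
function localesFromLang(lang, whenUnknown) {
  return lang === "" || lang == null ? whenUnknown : lang;
}

console.log(probe(""));                                  // raw empty string
console.log(probe(localesFromLang("", undefined)));      // app chose the default locale
console.log(probe(localesFromLang("", "und")));          // app chose "und"
console.log(probe(localesFromLang("en-US", undefined))); // non-empty tag passes through
```

The point of the sketch is only that under (1) the decision sits in a one-line helper owned by the application, which can differ between HTML/XML contexts and other contexts.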
On Jul 16, 2012, at 8:41 , Phillips, Addison wrote:
> The empty string isn't a valid BCP 47 language tag, but it is a valid value for xml:lang and HTML @lang. So my main concern would be to make the "pass through" somewhat seamless. That is, I'm more concerned with that becoming an error condition than I am concerned with what happens with that value.
> I agree that the empty value isn't very useful in determining what to do. For what it's worth, I supported a language-independent "root" locale, although there is scant difference between the empty string and using the "und" tag. I tend, personally, to favor the empty string over "und", because the "und" tag makes it look like there is data there (harder to special case). But the problem remains of what behavior to assign to the "no locale available" case and whether that should be normative or implementation defined.
>> -----Original Message-----
>> From: Norbert Lindenberg [mailto:ecmascript at norbertlindenberg.com]
>> Sent: Sunday, July 15, 2012 9:54 PM
>> To: Phillips, Addison
>> Cc: Norbert Lindenberg; es-discuss
>> Subject: Re: Internationalization: Strings as locales argument
>> Empty strings are not valid BCP 47 language tags and would not qualify with or
>> without my proposed change. Without the change, they'd be interpreted as an
>> empty list, so the constructors would eventually fall back to the default locale,
>> while supportedLocalesOf would return an empty array.
>> UTS 35, section 3.2.2, specifies that the Unicode locale identifier "root" is
>> mapped to the BCP 47 language tag "und". I once proposed that our API should
>> require support for a language-independent locale (to the extent that that's
>> possible); that proposal didn't find approval.
>> In XML and HTML, an empty language tag means "no language information
>> available" or "primary language is unknown" [1, 2]. If within such content
>> language sensitive operations are necessary, someone has to decide which
>> language to assume. Should that be the Internationalization API? Maybe an
>> application would find "und" more appropriate in this situation than the default
>> locale?
>> [1] http://www.w3.org/TR/2008/REC-xml-20081126/#sec-lang-tag
>> [2] http://dev.w3.org/html5/spec/global-attributes.html#the-lang-and-
>> On Jul 14, 2012, at 13:49 , Phillips, Addison wrote:
>>>> The result would be that "" is rejected with a RangeError, but
>>>> "en-US" is processed as ["en-US"].
>>> Would there be some means of referencing the "root" locale other than using
>>> the empty string?
>>> Also, one means of assigning a locale would be to scrape one or another
>>> @lang attribute in some HTML or XML content. If that attribute were empty,
>>> would RangeError be an expected outcome? Wouldn't it be better to handle the
>>> empty string gracefully, since it isn't necessarily an error condition?