Summary of i18n ad hoc group meeting - 21st March 2011

Jungshik Shin (신정식, 申政湜) jungshik at google.com
Fri Mar 25 15:08:06 PDT 2011


On Fri, Mar 25, 2011 at 1:34 PM, Nebojša Ćirić <cira at google.com> wrote:

> Looking through the notes from the meeting I also found some problems with
> the collator. We did specify the collatorType: search, but we didn't offer a
> function that would make use of it. Mark and I are thinking about:
>
> /**
>  * string - string to search over.
>  * substring - string to look for in "string"
>  * index - start search from index
>  * @return {Array} [first, last] - first is the index of the match or -1;
>  *     last is the end of the match or undefined.
>  */
> LocaleInfo.Collator.prototype.find(string, substring, index)
>
> I'll start a different thread about collation problems...
>
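
To make the proposed find() contract concrete, here is a minimal sketch. It is not the proposed implementation: the free function find, the compare argument, and looseCompare are hypothetical names, and treating "last" as an exclusive end index is an assumption the notes do not settle. A case-insensitive comparison stands in for a collator of type "search".

```javascript
// Hypothetical sketch: `compare` is any (a, b) => number function where 0
// means "equal for search purposes", e.g. a search collator's compare.
function find(compare, string, substring, index = 0) {
  if (substring.length === 0) return [-1, undefined];
  for (let first = index; first < string.length; first++) {
    // Try every candidate end: under collation a match may differ in
    // length from the string being searched for.
    for (let last = first + 1; last <= string.length; last++) {
      if (compare(string.slice(first, last), substring) === 0) {
        return [first, last]; // `last` is exclusive here (an assumption)
      }
    }
  }
  return [-1, undefined];
}

// Case-insensitive comparison standing in for a "search" collator.
const looseCompare = (a, b) => {
  const x = a.toLowerCase();
  const y = b.toLowerCase();
  return x < y ? -1 : x > y ? 1 : 0;
};

const hit = find(looseCompare, "The New Fishing Bible", "FISHING", 0);
const miss = find(looseCompare, "The New Fishing Bible", "Hunting", 0);
```

The nested scan is quadratic and only illustrates the return shape; a real collator would search on collation elements instead.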
> 1.       “Use Unicode identifier vs. BCP47 in the API” It isn’t clear what
>> you mean by this. I would strongly prefer that we use BCP 47 identifiers. If
>> you mean “allow the Unicode locales extension to BCP 47”, I’m fine, but I
>> don’t see why we would want to use the older LDML identifiers at all.
>>
> I mean the superset of BCP47, or what you say "the Unicode locales
> extension to BCP 47".
>

A bit of clarification: what we have in mind is "BCP 47 with Unicode locale
extension" (ref. http://www.rfc-editor.org/rfc/rfc6067.txt and
http://www.unicode.org/draft/reports/tr35/tr35.html )

I agree that we don't want to support the older LDML identifiers.
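
As an illustration of what the "u" extension adds to a plain BCP 47 tag, here is a simplified sketch that pulls the key/type pairs out of a tag such as "de-DE-u-co-phonebk". The function name is hypothetical, and the code assumes a well-formed tag: it does no validation and skips extension attributes.

```javascript
// Simplified sketch: extracts key/type pairs from the "-u-" extension of a
// BCP 47 tag. No validation; extension attributes are ignored.
function unicodeExtensionKeywords(tag) {
  const subtags = tag.toLowerCase().split("-");
  let start = -1;
  for (let i = 1; i < subtags.length; i++) {
    if (subtags[i] === "x") break; // private use: nothing after it counts
    if (subtags[i] === "u") {
      start = i;
      break;
    }
  }
  const keywords = {};
  if (start === -1) return keywords; // no Unicode locale extension
  let key = null;
  for (let i = start + 1; i < subtags.length; i++) {
    const subtag = subtags[i];
    if (subtag.length === 1) break; // the next singleton ends the extension
    if (subtag.length === 2) {
      key = subtag; // two-character subtags are keys
      keywords[key] = [];
    } else if (key !== null) {
      keywords[key].push(subtag); // longer subtags belong to the key's type
    }
  }
  for (const k of Object.keys(keywords)) {
    keywords[k] = keywords[k].join("-");
  }
  return keywords;
}

const keywords = unicodeExtensionKeywords("de-DE-u-co-phonebk");
// keywords.co holds the requested collation type
```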


>
>
>> 2.       For Collator, I’d like an option “ignoreSpaces”. I know that the
>> default implementation of ICU’s collator ignores spaces, but I have a number
>> of real-world cases in which I want the opposite behavior (in which I tailor
>> my collator using rules).
>>
>> Particularly important to me is the sorting of book titles. If I have two
>> strings: “The New Fishing Bible” and “The Newer Fishing Bible”, I want to
>> sort them in the order given, but most collators ignore spaces (comparing
>> the second ‘e’ in Newer to the ‘F’ in Fishing). So I want the option of
>> making whitespace meaningful for collation.
>>
>> Since most implementations are similar to ICU, though, maybe this would be
>> the one “strange” option (“compareWhitespace”???)
>>
>  I'll let Mark and Jungshik comment on this.
>



>
>
>> 3.       “Start with ICU patterns and see if they work for everybody” I
>> support establishing the pattern language early on. I need to be able to use
>> Message, Choice, and Plural formatters and it’s all too common to need to
>> specify particular patterns in those cases (for substitutions), not to
>> mention direct usage in number and date formatters. The symbols used by
>> ICU/CLDR are common enough and, if incomplete for a given implementer's needs
>> (Microsoft?), could be appropriately extended. Since these symbols are also
>> used by Java, PHP, and some other languages, it would be unsurprising to use
>> them in ECMAScript.
>>
> In general I agree that we should settle on the format quickly (i.e. accept
> ICU patterns). I'll wait for others to pitch in with possible concerns.
>
> The new quasis proposal for string formatting may help with plural/gender
> rules and general formatting.
>
> I don't like how ICU plural/gender lets you embed formatters for
> dates/numbers in the message. It complicates an already complex message. I
> would prefer that we just place the burden on the developer to call the
> proper formatters, like:
>
> "Some {NUM_KITTENS, plural, ... { Just put a {param} here}}",
> dtf.format(new Date(now));
>
> 4.       “Allow short/long dateType to get value from the system or cloud”
>> I’m not sure I understand this one. Was this the part of the conversation in
>> which we allowed the pattern to be garnered either from the system (e.g. the
>> Regional Options control panel on Windows) or from the implementation??
>>
> This is where the implementation fetches user settings from the system (as
> you said Regional Options). I think we should word it as "Implementation may
> fetch user settings from the OS to format short and long date style.".
> I would be happier if we actually accepted the pattern for short/long date
> from the app - in which case app developer could decide to fetch the user
> settings from system or somewhere else...
>  We also need to add guidance about what an implementation should do if the
> system locale doesn't match the current page locale.
>




>
>>
>>
>>
>>
>>
>> *From:* Nebojša Ćirić [mailto:cira at google.com]
>> *Sent:* Monday, March 21, 2011 5:29 PM
>> *To:* es-discuss at mozilla.org
>> *Subject:* Summary of i18n ad hoc group meeting - 21st March 2011
>>
>>
>>
>> Started implementation of collator in Chrome and hit a problem:
>>
>>
>>
>> var coll = locale.collator();
>>
>> array.sort(coll.compare);
>>
>>
>>
>> The compare method gets bound to undefined or the global object at the call
>> site. Erik mentioned that this problem will be solved in Harmony by passing
>> an additional "this" parameter to for-each and the like. We would like to
>> propose extending this syntax to the sort method too.
>>
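
The binding problem described above can be reproduced without a real collator. In this sketch SketchCollator and its ignoreCase option are hypothetical stand-ins; the only point is that compare depends on "this", so passing it to sort detached fails, while bind or a closure works in current engines.

```javascript
// Hypothetical stand-in for the collator discussed above; the class name and
// the `ignoreCase` option are illustrative only.
class SketchCollator {
  constructor(options = {}) {
    this.ignoreCase = Boolean(options.ignoreCase);
  }
  compare(a, b) {
    // Reads `this`, so it fails when called as a detached function.
    const x = this.ignoreCase ? a.toLowerCase() : a;
    const y = this.ignoreCase ? b.toLowerCase() : b;
    return x < y ? -1 : x > y ? 1 : 0;
  }
}

const coll = new SketchCollator({ ignoreCase: true });
const items = ["b", "C", "a"];

// Passing the method directly loses the receiver: class bodies are strict,
// so `this` is undefined inside compare and the property read throws.
let detachedError = null;
try {
  items.slice().sort(coll.compare);
} catch (e) {
  detachedError = e;
}

// Workaround 1: bind the method to its collator.
const sortedWithBind = items.slice().sort(coll.compare.bind(coll));

// Workaround 2: wrap the call in a closure.
const sortedWithClosure = items.slice().sort((a, b) => coll.compare(a, b));
```

Either workaround keeps sort unchanged; the proposal in the notes would instead let sort receive the receiver as an extra argument.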
>>
>>
>> We discussed each part of the API in order to get detailed parameters of
>> each constructor and method.
>>
>>
>>
>> General:
>>
>>    - Add an options property to each class that gives the actual values
>>    used for the user's parameters. For example, if the user asked for the
>>    islamic calendar and we only have islamic-civil, we set the calendar
>>    property to islamic-civil. This allows the developer to iterate until
>>    satisfied with the result.
>>    - Use Unicode identifier vs. BCP47 in the API
>>
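
A rough sketch of the options property from the General list above, using the islamic / islamic-civil example. Everything here is hypothetical: the class name, the supported-calendar list, and the prefix-matching fallback are assumptions, since the notes do not say how the best match is chosen.

```javascript
// Hypothetical sketch: the supported list and the prefix-matching fallback
// are assumptions, not something the notes specify.
const SUPPORTED_CALENDARS = ["gregorian", "islamic-civil", "hebrew"];

function resolveCalendar(requested) {
  if (SUPPORTED_CALENDARS.includes(requested)) return requested;
  // Fall back to a supported variant of the requested calendar, if any.
  const variant = SUPPORTED_CALENDARS.find((c) => c.startsWith(requested + "-"));
  return variant !== undefined ? variant : SUPPORTED_CALENDARS[0];
}

class SketchDateTimeFormat {
  constructor(userOptions = {}) {
    // `options` reports what was actually applied, not what was requested.
    this.options = {
      calendar: resolveCalendar(userOptions.calendar || "gregorian"),
    };
  }
}

const dtf = new SketchDateTimeFormat({ calendar: "islamic" });
const applied = dtf.options.calendar; // the developer can inspect this and retry
```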
>> Collator:
>>
>>    - numeric - specifies numeric sort (9 comes before 12)
>>    - ignoreVariants - ignore all of case, width and kana
>>    - ignoreWidth, ignoreCase and ignoreKana - subvariants we may
>>    implement to fine tune the behavior
>>    - ignoreAccents - ignore accents
>>    - ignoreSymbols - ignore punctuation and symbols
>>    - variant - phonebook, ... - string
>>
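
For a feel of what the numeric and ignore* options do, the sketch below assumes a runtime that provides the standard Intl.Collator. That is not the API proposed in these notes, and mapping ignoreCase/ignoreAccents onto its sensitivity option is only an approximation for illustration.

```javascript
// Assumes a runtime with the standard Intl.Collator; this is not the API
// proposed in the meeting notes, only a way to observe similar options.
const plain = new Intl.Collator("en");
const numeric = new Intl.Collator("en", { numeric: true });

const values = ["12", "9", "100"];
const plainOrder = values.slice().sort(plain.compare); // compares digit by digit
const numericOrder = values.slice().sort(numeric.compare); // compares by value

// sensitivity "base" treats case and accent differences as equal, roughly
// what ignoreCase plus ignoreAccents would do.
const loose = new Intl.Collator("en", { sensitivity: "base" });
const sameIgnoringCaseAndAccents = loose.compare("resume", "Résumé") === 0;
```

With numeric collation "9" sorts before "12", matching the example given for the numeric option.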
>> NumberFormat:
>>
>>    - Allow patterns to specify grouping, currency symbol position and
>>    sign location
>>    - Start with ICU patterns and see if they work for everybody
>>    - Don't support overrides for grouping separator and decimal point for
>>    now
>>    - Specify both currencySymbol and currencyCode as override
>>
>> DateTimeFormatSymbols:
>>
>>    - Added Era and day period methods (AM/PM)
>>    - Moved all methods to DateTimeFormat class
>>    - Remove DTFSymbols class
>>
>> DateTimeFormat:
>>
>>    - Specify calendar names better (move work to Unicode/LDML and point
>>    to their document).
>>    - Allow short/long dateType to get value from the system or cloud.
>>    - .options[skeleton] should contain best match for the given skeleton
>>
>> --
>> Nebojša Ćirić
>>
>
>
>
> --
> Nebojša Ćirić
>