Summary of i18n ad hoc group meeting - 21st March 2011
Jungshik Shin (신정식, 申政湜)
jungshik at google.com
Fri Mar 25 15:08:06 PDT 2011
On Fri, Mar 25, 2011 at 1:34 PM, Nebojša Ćirić <cira at google.com> wrote:
> Looking through the notes from the meeting I also found some problems with
> the collator. We did specify the collatorType: search, but we didn't offer a
> function that would make use of it. Mark and I are thinking about:
>
> /**
> * string - string to search over.
> * substring - string to look for in "string"
> * index - start search from index
> * @return {Array} [first, last] - first is index of the match or -1, last
> is end of the match or undefined.
> */
> LocaleInfo.Collator.prototype.find(string, substring, index)
>
> I'll start a different thread about collation problems...
>
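To make the proposed return shape concrete, here is a rough sketch of what find() might do. It is not the proposed implementation: the fold() helper (lower-casing plus accent stripping) is only a stand-in for real collation-based matching, and treating "last" as an exclusive end index is an assumption. It does show why the end is returned separately, since a match need not have the same length as the substring.

```javascript
// Hypothetical stand-in for collator-based equivalence: ignores case and
// combining accents. A real collator would use locale-specific rules.
function fold(s) {
  return s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

// Sketch of find(string, substring, index) returning [first, last].
// Assumption: "last" is the exclusive end index of the match.
function find(string, substring, index = 0) {
  const target = fold(substring);
  for (let first = index; first < string.length; first++) {
    for (let last = first + 1; last <= string.length; last++) {
      const candidate = fold(string.slice(first, last));
      if (candidate === target) return [first, last];
      // Extending the slice only makes the folded text longer, so stop here.
      if (candidate.length > target.length) break;
    }
  }
  return [-1, undefined];
}

const hit = find("Le caf\u00e9 est ouvert", "CAFE", 0);
const miss = find("abc", "x", 0);
console.log(hit, miss);
```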
> 1. “Use Unicode identifier vs. BCP47 in the API” It isn’t clear what
>> you mean by this. I would strongly prefer that we use BCP 47 identifiers. If
>> you mean “allow the Unicode locales extension to BCP 47”, I’m fine, but I
>> don’t see why we would want to use the older LDML identifiers at all.
>>
> I mean the superset of BCP47, or what you say "the Unicode locales
> extension to BCP 47".
>
A bit of clarification: what we have in mind is "BCP 47 with Unicode locale
extension" (ref. http://www.rfc-editor.org/rfc/rfc6067.txt and
http://www.unicode.org/draft/reports/tr35/tr35.html )
I agree that we don't want to support the older LDML identifiers.
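
As an illustration of what such an identifier carries, the sketch below pulls the key/type pairs out of the "-u-" extension of a tag like "de-DE-u-co-phonebk". The function name is made up for this example, and the parsing is simplified: it ignores attributes and does not validate the rest of the tag.

```javascript
// Hypothetical helper: extracts Unicode locale extension keywords from a
// BCP 47 tag. Simplified sketch: keys are 2-character subtags, types are the
// longer subtags that follow, and any other singleton ends the extension.
function parseUnicodeExtension(tag) {
  const subtags = tag.toLowerCase().split("-");
  const start = subtags.indexOf("u");
  const keywords = {};
  if (start <= 0) return keywords; // no extension, or malformed tag
  let key = null;
  for (let i = start + 1; i < subtags.length; i++) {
    const subtag = subtags[i];
    if (subtag.length === 1) break; // next singleton starts another extension
    if (subtag.length === 2) {
      key = subtag;
      keywords[key] = "";
    } else if (key !== null) {
      keywords[key] = keywords[key] ? keywords[key] + "-" + subtag : subtag;
    }
    // Subtags before the first key are attributes; this sketch skips them.
  }
  return keywords;
}

const german = parseUnicodeExtension("de-DE-u-co-phonebk");
const thai = parseUnicodeExtension("th-TH-u-nu-thai-ca-buddhist");
console.log(german, thai);
```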
>
>
>> 2. For Collator, I’d like an option “ignoreSpaces”. I know that the
>> default implementation of ICU’s collator ignores spaces, but I have a number
>> of real-world cases in which I want the opposite behavior (in which I tailor
>> my collator using rules).
>>
>> Particularly important to me is the sorting of book titles. If I have two
>> strings: “The New Fishing Bible” and “The Newer Fishing Bible”, I want to
>> sort them in the order given, but most collators ignore spaces (comparing
>> the second ‘e’ in Newer to the ‘F’ in Fishing). So I want the option of
>> making whitespace meaningful for collation.
>>
>> Since most implementations are similar to ICU, though, maybe this would be
>> the one “strange” option (“compareWhitespace”???)
>>
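A small sketch of the difference being described, using the two titles above. The comparison here is a plain lower-cased code-unit comparison standing in for a real collator, and the "ignoreSpaces" option name is simply the one floated in this thread, not an existing API.

```javascript
// Stand-in comparator: lower-cases both strings and optionally strips
// whitespace before comparing. Not locale-aware; for illustration only.
function compareTitles(a, b, options = {}) {
  const prepare = (s) => {
    const lowered = s.toLowerCase();
    return options.ignoreSpaces ? lowered.replace(/\s+/g, "") : lowered;
  };
  const x = prepare(a);
  const y = prepare(b);
  return x < y ? -1 : x > y ? 1 : 0;
}

const titles = ["The Newer Fishing Bible", "The New Fishing Bible"];

// Spaces ignored: the second 'e' of "Newer" is compared with 'F' of "Fishing".
const spacesIgnored = titles
  .slice()
  .sort((a, b) => compareTitles(a, b, { ignoreSpaces: true }));

// Spaces significant: the space after "New" sorts before the 'e' of "Newer".
const spacesCompared = titles
  .slice()
  .sort((a, b) => compareTitles(a, b, { ignoreSpaces: false }));

console.log(spacesIgnored, spacesCompared);
```

With spaces ignored, "The Newer Fishing Bible" sorts first; with spaces compared, "The New Fishing Bible" does, which is the order asked for above.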
> I'll let Mark and Jungshik comment on this.
>
>
>
>> 3. “Start with ICU patterns and see if they work for everybody” I
>> support establishing the pattern language early on. I need to be able to use
>> Message, Choice, and Plural formatters and it’s all too common to need to
>> specify particular patterns in those cases (for substitutions), not to
>> mention direct usage in number and date formatters. The symbols used by
>> ICU/CLDR are common enough and, if incomplete for a given implementer's needs
>> (Microsoft?), could be appropriately extended. Since these symbols are also
>> used by Java, PHP, and some other languages, it would be unsurprising to use
>> them in ECMAScript.
>>
> In general I agree that we should settle on the format quickly (i.e. accept
> ICU patterns). I'll wait for others to pitch in with possible concerns.
>
> The new quasis proposal for string formatting may help with plural/gender
> rules and general formatting.
>
> I don't like how ICU plural/gender lets you embed formatters for
> dates/numbers in the message. It complicates an already complex message. I
> would prefer that we just place the burden on the developer to call the
> proper formatters, like:
>
> "Some {NUM_KITTENS, plural, ... { Just put a {param} here}}",
> dtf.format(new Date(now));
>
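One way to read that suggestion, as a sketch only: the message formatter does nothing but pick a plural form and substitute already-formatted strings. Every name here (formatMessage, pluralCategory, formatDate) is hypothetical, the plural rule is English-only, and formatDate merely stands in for a call like dtf.format().

```javascript
// Hypothetical date formatter standing in for dtf.format(date).
function formatDate(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`;
}

// English-only plural selection; real rules are locale-dependent.
function pluralCategory(n) {
  return n === 1 ? "one" : "other";
}

// Replaces {name} placeholders with caller-supplied, pre-formatted values.
// Unknown placeholders are left untouched.
function formatMessage(pattern, params) {
  return pattern.replace(/\{(\w+)\}/g, (match, name) =>
    name in params ? String(params[name]) : match
  );
}

const forms = {
  one: "{count} kitten was adopted on {date}.",
  other: "{count} kittens were adopted on {date}.",
};

const count = 3;
// The developer calls the date formatter; the message only sees a string.
const message = formatMessage(forms[pluralCategory(count)], {
  count,
  date: formatDate(new Date(2011, 2, 21)),
});
console.log(message);
```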
> 4. “Allow short/long dateType to get value from the system or cloud”
>> I’m not sure I understand this one. Was this the part of the conversation in
>> which we allowed the pattern to be garnered either from the system (e.g. the
>> Regional Options control panel on Windows) or from the implementation??
>>
> This is where the implementation fetches user settings from the system (as
> you said Regional Options). I think we should word it as "Implementation may
> fetch user settings from the OS to format short and long date style.".
> I would be happier if we actually accepted the pattern for short/long date
> from the app - in which case app developer could decide to fetch the user
> settings from system or somewhere else...
> We also need to give guidance about what the implementation would do if
> the system locale doesn't match the current page locale.
>
>
>>
>>
>>
>>
>>
>> *From:* Nebojša Ćirić [mailto:cira at google.com]
>> *Sent:* Monday, March 21, 2011 5:29 PM
>> *To:* es-discuss at mozilla.org
>> *Subject:* Summary of i18n ad hoc group meeting - 21st March 2011
>>
>>
>>
>> Started implementation of collator in Chrome and hit a problem:
>>
>>
>>
>> var coll = locale.collator();
>>
>> array.sort(coll.compare);
>>
>>
>>
>> The compare method gets bound to undefined or the global object at the call
>> site. Erik mentioned that this problem will be solved in Harmony by passing
>> an additional "this" parameter to for-each and the like. We would like to
>> propose extending this syntax to the sort method too.
>>
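For readers who have not hit this, the sketch below reproduces the problem with a made-up collator class (SketchCollator is hypothetical and only compares code units) and shows two workarounds that need no language change: binding the method, or wrapping it in a function.

```javascript
// Hypothetical collator whose compare() depends on instance state via `this`.
class SketchCollator {
  constructor(options) {
    this.ignoreCase = Boolean(options && options.ignoreCase);
  }
  compare(a, b) {
    const x = this.ignoreCase ? a.toLowerCase() : a;
    const y = this.ignoreCase ? b.toLowerCase() : b;
    return x < y ? -1 : x > y ? 1 : 0;
  }
}

const coll = new SketchCollator({ ignoreCase: true });
const words = ["banana", "Apple", "cherry"];

// Passing the bare method loses the receiver: sort() calls it with
// `this` undefined, so reading this.ignoreCase throws.
let unboundFailed = false;
try {
  words.slice().sort(coll.compare);
} catch (e) {
  unboundFailed = e instanceof TypeError;
}

// Workaround 1: bind the method to its collator.
const sortedBound = words.slice().sort(coll.compare.bind(coll));

// Workaround 2: wrap the call so the receiver is explicit.
const sortedWrapped = words.slice().sort((a, b) => coll.compare(a, b));

console.log(unboundFailed, sortedBound, sortedWrapped);
```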
>>
>>
>> We discussed each part of the API in order to get detailed parameters of
>> each constructor and method.
>>
>>
>>
>> General:
>>
>> - Add an options property to each class that gives the actual values used
>> for the user's parameters. For example, if the user asked for the islamic
>> calendar and we only have islamic-civil, we set the calendar property to
>> islamic-civil. This allows the developer to iterate until satisfied with
>> the result.
>> - Use Unicode identifier vs. BCP47 in the API
>>
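A sketch of how the options property could behave for the calendar example above. The class name, the supported-calendar list, and the fallback rule (pick a supported variant of the requested calendar, else gregory) are all assumptions made for illustration; the notes do not specify them.

```javascript
// Assumed list of calendars this hypothetical implementation supports.
const SUPPORTED_CALENDARS = ["gregory", "islamic-civil", "buddhist"];

// Assumed fallback rule: exact match, then a supported variant such as
// "islamic-civil" for "islamic", then the default calendar.
function resolveCalendar(requested) {
  if (SUPPORTED_CALENDARS.includes(requested)) return requested;
  const variant = SUPPORTED_CALENDARS.find((c) => c.startsWith(requested + "-"));
  return variant || "gregory";
}

// Hypothetical formatter that exposes what it actually resolved.
class SketchDateTimeFormat {
  constructor(requested = {}) {
    this.options = {
      calendar: resolveCalendar(requested.calendar || "gregory"),
    };
  }
}

const dtf = new SketchDateTimeFormat({ calendar: "islamic" });
// The developer can inspect the resolved value and retry with other settings.
console.log(dtf.options.calendar);
```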
>> Collator:
>>
>> - numeric - specifies numeric sort (9 comes before 12)
>> - ignoreVariants - ignore all of case, width and kana
>> - ignoreWidth, ignoreCase and ignoreKana - subvariants we may
>> implement to fine tune the behavior
>> - ignoreAccents - ignore accents
>> - ignoreSymbols - ignore punctuation and symbols
>> - variant - phonebook, ... - string
>>
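To illustrate the "numeric" option from the list above (9 before 12), here is a minimal sketch of a numeric-aware comparison. It is not how a collator implements it: it splits strings into digit and non-digit runs, compares digit runs by value and everything else by code unit, and the function name is invented for this example.

```javascript
// Compares two strings so that embedded numbers order by value.
// Sketch only: non-digit runs use plain code-unit comparison, and very large
// numbers would lose precision when converted with Number().
function numericCompare(a, b) {
  const runs = /(\d+|\D+)/g;
  const as = a.match(runs) || [];
  const bs = b.match(runs) || [];
  const shared = Math.min(as.length, bs.length);
  for (let i = 0; i < shared; i++) {
    const x = as[i];
    const y = bs[i];
    if (/^\d/.test(x) && /^\d/.test(y)) {
      const diff = Number(x) - Number(y);
      if (diff !== 0) return diff < 0 ? -1 : 1;
    } else if (x !== y) {
      return x < y ? -1 : 1;
    }
  }
  // All shared runs are equal: the string with fewer runs sorts first.
  return Math.sign(as.length - bs.length);
}

const plainOrder = ["item12", "item9", "item1"].slice().sort();
const numericOrder = ["item12", "item9", "item1"].slice().sort(numericCompare);
console.log(plainOrder, numericOrder);
```

The default sort puts "item12" before "item9"; the numeric comparison gives item1, item9, item12.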
>> NumberFormat:
>>
>> - Allow patterns to specify grouping, currency symbol position and
>> sign location
>> - Start with ICU patterns and see if they work for everybody
>> - Don't support overrides for grouping separator and decimal point for
>> now
>> - Specify both currencySymbol and currencyCode as override
>>
>> DateTimeFormatSymbols:
>>
>> - Added Era and day period methods (AM/PM)
>> - Moved all methods to DateTimeFormat class
>> - Remove DTFSymbols class
>>
>> DateTimeFormat:
>>
>> - Specify calendar names better (move work to Unicode/LDML and point
>> to their document).
>> - Allow short/long dateType to get value from the system or cloud.
>> - .options[skeleton] should contain best match for the given skeleton
>>
>> --
>> Nebojša Ćirić
>>
>
>
>
> --
> Nebojša Ćirić
>