Summary of i18n ad hoc group meeting - 21st March 2011

Nebojša Ćirić cira at google.com
Fri Mar 25 13:34:58 PDT 2011


Looking through the notes from the meeting I also found some problems with
the collator. We did specify the collatorType: search, but we didn't offer a
function that would make use of it. Mark and I are thinking about:

/**
 * string - string to search over.
 * substring - string to look for in "string"
 * index - start search from index
 * @return {Array} [first, last] - first is index of the match or -1, last
 *     is end of the match or undefined.
 */
LocaleInfo.Collator.prototype.find(string, substring, index)
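
To make the intended semantics concrete, here is a rough sketch of how such a find could behave. It is only an illustration under assumptions, not the proposed implementation: collatorFind and caseInsensitiveStub are made-up names, and the stub stands in for a real search collator. It tries every start and end position, so it is quadratic; a real implementation would use the collator's own search support.

```javascript
// Hypothetical helper, not part of any proposal. `collator` is any object
// with a compare(a, b) method that returns 0 for collation-equal strings.
function collatorFind(collator, string, substring, index = 0) {
  for (let first = Math.max(0, index); first <= string.length; first++) {
    // A match may be longer or shorter than `substring` in code units,
    // so every candidate end position is tried, shortest first.
    for (let last = first; last <= string.length; last++) {
      if (collator.compare(string.slice(first, last), substring) === 0) {
        return [first, last];
      }
    }
  }
  // No match: first is -1, last is undefined.
  return [-1, undefined];
}

// Stand-in for a search collator: treats strings as equal ignoring case.
const caseInsensitiveStub = {
  compare(a, b) {
    const x = a.toLowerCase();
    const y = b.toLowerCase();
    return x < y ? -1 : x > y ? 1 : 0;
  },
};

const found = collatorFind(caseInsensitiveStub, "Hello World", "WORLD", 0);
```

Returning both ends matters because a collation-equal match need not have the same length as the substring that was searched for.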

I'll start a different thread about collation problems...

> 1.       “Use Unicode identifier vs. BCP47 in the API” It isn’t clear what
> you mean by this. I would strongly prefer that we use BCP 47 identifiers. If
> you mean “allow the Unicode locales extension to BCP 47”, I’m fine, but I
> don’t see why we would want to use the older LDML identifiers at all.
>
I mean the superset of BCP47, or what you call "the Unicode locales extension
to BCP 47".


> 2.       For Collator, I’d like an option “ignoreSpaces”. I know that the
> default implementation of ICU’s collator ignores spaces, but I have a number
> of real-world cases in which I want the opposite behavior (in which I tailor
> my collator using rules).
>
> Particularly important to me is the sorting of book titles. If I have two
> strings: “The New Fishing Bible” and “The Newer Fishing Bible”, I want to
> sort them in the order given, but most collators ignore spaces (comparing
> the second ‘e’ in Newer to the ‘F’ in Fishing). So I want the option of
> making whitespace meaningful for collation.
>
> Since most implementations are similar to ICU, though, maybe this would be
> the one “strange” option (“compareWhitespace”???)
>
I'll let Mark and Jungshik comment on this.


> 3.       “Start with ICU patterns and see if they work for everybody” I
> support establishing the pattern language early on. I need to be able to use
> Message, Choice, and Plural formatters and it’s all too common to need to
> specify particular patterns in those cases (for substitutions), not to
> mention direct usage in number and date formatters. The symbols used by
> ICU/CLDR are common enough and, if incomplete for a given implementers needs
> (Microsoft?), could be appropriately extended. Since these symbols are also
> used by Java, PHP, and some other languages, it would be unsurprising to use
> them in ECMAScript.
>
In general I agree that we should settle on the format quickly (i.e. accept
ICU patterns). I'll wait for others to pitch in with possible concerns.

The new quasis proposal for string formatting may help with plural/gender
rules and general formatting.

I don't like how ICU plural/gender lets you embed formatters for
date/numbers in the message. It complicates an already complex message. I
would prefer that we place the burden on the developer to call the proper
formatters, like:

"Some {NUM_KITTENS, plural, ... { Just put a {param} here}}", dtf.format(new
Date(now));
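
A minimal sketch of that split, assuming a plain placeholder substitution rather than ICU MessageFormat. The fillMessage helper and the hard-coded date string are made up for illustration; the string stands in for the result of dtf.format(new Date(now)).

```javascript
// Hypothetical helper: replaces {name} placeholders with values that the
// caller has already formatted. It does no date or number formatting itself.
function fillMessage(pattern, params) {
  return pattern.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(params, name)
      ? String(params[name])
      : match // unknown placeholders are left untouched
  );
}

// The developer calls the formatter first...
const formattedDate = "25 Mar 2011"; // stands in for dtf.format(new Date(now))
// ...and the message only ever sees the finished string.
const message = fillMessage("Last fed on {param}.", { param: formattedDate });
```

The message syntax then needs to know nothing about date or number patterns.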

> 4.       “Allow short/long dateType to get value from the system or
> cloud” I’m not sure I understand this one. Was this the part of the
> conversation in
> which we allowed the pattern to be garnered either from the system (e.g. the
> Regional Options control panel on Windows) or from the implementation??
>
This is where the implementation fetches user settings from the system (as
you said, Regional Options). I think we should word it as "Implementation may
fetch user settings from the OS to format short and long date style."
I would be happier if we actually accepted the pattern for short/long date
from the app, in which case the app developer could decide to fetch the user
settings from the system or somewhere else...
We also need to add guidance about what the implementation should do if the
system locale doesn't match the current page locale.

>
> *From:* Nebojša Ćirić [mailto:cira at google.com]
> *Sent:* Monday, March 21, 2011 5:29 PM
> *To:* es-discuss at mozilla.org
> *Subject:* Summary of i18n ad hoc group meeting - 21st March 2011
>
>
>
> Started implementation of collator in Chrome and hit a problem:
>
>
>
> var coll = locale.collator();
>
> array.sort(coll.compare);
>
>
>
> Compare method gets bound to the undefined or global object at the call
> site. Erik mentioned that this problem will be solved in Harmony by passing
> additional "this" parameter to for-each and likes. We would like to propose
> extending this syntax to the sort method too.
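
Until sort accepts a "this" argument, the binding problem quoted above can be worked around at the call site. The sketch below is only an illustration: StubCollator is a made-up stand-in for a collator whose compare() reads its settings from "this", not the Chrome implementation.

```javascript
// Stand-in for a collator whose compare() depends on `this`.
class StubCollator {
  constructor(ignoreCase) {
    this.ignoreCase = ignoreCase;
  }
  compare(a, b) {
    const x = this.ignoreCase ? a.toLowerCase() : a;
    const y = this.ignoreCase ? b.toLowerCase() : b;
    return x < y ? -1 : x > y ? 1 : 0;
  }
}

const stubColl = new StubCollator(true);
const words = ["banana", "Cherry", "apple"];

// words.sort(stubColl.compare) fails here: the method is called with
// `this` undefined, so reading this.ignoreCase throws a TypeError.

// Option 1: bind the method to its collator.
const viaBind = [...words].sort(stubColl.compare.bind(stubColl));

// Option 2: wrap the call so `this` is supplied by the method call itself.
const viaWrapper = [...words].sort((a, b) => stubColl.compare(a, b));
```

Both options cost the developer an extra step at every call site, which is the motivation for the proposal above.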
>
>
>
> We discussed each part of the API in order to get detailed parameters of
> each constructor and method.
>
>
>
> General:
>
>    - Add options property to each class that would give you actual value
>    for the user parameters. For example, if user asked for islamic calendar,
>    and we only have islamic-civil, we set calendar property to islamic-civil.
>    Allows developer to iterate until satisfied with the result.
>    - Use Unicode identifier vs. BCP47 in the API
>
> Collator:
>
>    - numeric - specifies numeric sort (9 comes before 12)
>    - ignoreVariants - ignore all of case, width and kana
>    - ignoreWidth, ignoreCase and ignoreKana - subvariants we may implement
>    to fine tune the behavior
>    - ignoreAccents - ignore accents
>    - ignoreSymbols - ignore punctuation and symbols
>    - variant - phonebook, ... - string
>
> NumberFormat:
>
>    - Allow patterns to specify grouping, currency symbol position and sign
>    location
>    - Start with ICU patterns and see if they work for everybody
>    - Don't support overrides for grouping separator and decimal point for
>    now
>    - Specify both currencySymbol and currencyCode as override
>
> DateTimeFormatSymbols:
>
>    - Added Era and day period methods (AM/PM)
>    - Moved all methods to DateTimeFormat class
>    - Remove DTFSymbols class
>
> DateTimeFormat:
>
>    - Specify calendar names better (move work to Unicode/LDML and point to
>    their document).
>    - Allow short/long dateType to get value from the system or cloud.
>    - .options[skeleton] should contain best match for the given skeleton
>
> --
> Nebojša Ćirić
>



-- 
Nebojša Ćirić