Globalization API Feedback - moar!

Mark Davis ☕ mark at macchiato.com
Mon Nov 28 16:01:51 PST 2011


Good point: user's customizations/preferences may also change results.

Mark
*— Il meglio è l’inimico del bene —*
[https://plus.google.com/114199149796022210033]



On Mon, Nov 28, 2011 at 22:10, Shawn Steele <Shawn.Steele at microsoft.com> wrote:

> Re: “Changes” I’d like to also include “Changes between users”. Some
> users may prefer variants that aren’t normal for their language for
> whatever reason. So a system that is nominally based on ICU or another
> consistent behavior could still show variations if users are allowed to
> select their own preferences.
>
> -Shawn
>
> From: es-discuss-bounces at mozilla.org
> [mailto:es-discuss-bounces at mozilla.org] On Behalf Of Mark Davis ☕
> Sent: Monday, November 28, 2011 1:13 PM
> To: Nebojša Ćirić
> Cc: es-discuss at mozilla.org
> Subject: Re: Globalization API Feedback - moar!
>
> Some feedback on the API. This is a bit of stream-of-consciousness
> response, but figured it would be better to get it out than to delay &
> clean it up.
>
> The internationalization issues that people may not be used to are:
>
>    - *Big data requirements.* A collation sequence for Chinese, for
>    example, is quite large.
>    - *Changes over time.* There are improvements all the time. In CLDR,
>    for example, the data typically grows by 30-50% each year.
>    These can be additions of data for the less well supported languages (say,
>    Uzbek), or fixes in data.
>    - *Changes between platforms.* The collation for German on an iPhone
>    (which uses ICU) may differ from the one for German on Windows 8, yet both can
>    be completely satisfactory for German users. That may be because the
>    characters that differ (say punctuation) have no fixed user expectations
>    among Germans, or it may be that there are well-known acceptable
>    alternatives (phonebook vs dictionary sorting).
>    - *Variants.* Implementations typically support a main language (such
>    as French or Uzbek), with deltas for some set of variants (Canadian French,
>    Belgian French, ... Cyrillic Uzbek, Arabic Uzbek, ...). But the exact set
>    depends on the implementation (and version). Especially in the case of
>    variants, one service (e.g. collation) might have no difference between a
>    variant and the principal language, while another service (date
>    formatting) might have a significant difference.
>    - *Best Fit.* For the majority of implementations, it is far better to
>    return a "best match" than the wrong language. So if the request is for
>    Canadian French collation, and the best available is Belgian French, then it is
>    best to return that (rather than some system default, like Japanese).
>    However, the caller may need to know exactly what the fallback was, in case
>    some actions do need to be taken.
>    - *Initialization overhead.* For many of the i18n services, but
>    especially collation, there is a need for individual comparisons to be as
>    fast as possible. The actual mechanics of how to do this across languages
>    are far more complicated than most people realize, so typically you build a
>    service object that allows you to do the fastest job for the given set of
>    options. When no more operations will be done with the service object, it
>    can be tossed. That way the caller can determine the appropriate time to
>    jettison the object. Think of it, if you will, like how a file system
>    works. Typically you do something like
>
>        file = open(name);
>        while (true) {handle(file.readByte());}
>        file.close();
>
>    An alternative would be not having open/close calls, and depending
>    on the OS caching which files are open or not.
>
>        while (true) {handle(readByte(name));}
>
>    But that would be very cumbersome to support in practice.
>
>     Locale.isLocaleSupported(code)
>
> For these reasons, isLocaleSupported doesn't really work right. There can
> be different levels of support for different services, and even for the
> same service I can get back a result which isn't precisely what I asked
> for, but is sufficient for my web application. For example, if I were to
> create a given service (a collator for example) for a given locale (say
> German for Austria), with a given set of parameters (such as phonebook
> order), I might not get an exact match for what I requested: I could get a
> collator for German for Germany with phonebook order, or a collator for
> German for Austria with dictionary order. That's why the current API
> returns not only the service, but also the set of parameters for the best
> fit match.
>
> It would have to be something like
>
>     resultingOptions = Locale.getCollationSupport(options)
>
> or
>
>     resultingOptions = Locale.getSupport("collation", options)
>
> That would have to go through the logic for figuring out the best match
> for the options that you would use when creating the service, but then just
> not create the actual service.
>
>     locale.compare("foo", "bar");
>
> This can be done, but means that for performance, internally there will be
> a service object for collation that needs to be cached and managed. Or
> maybe performance isn't a concern in an ECMAScript context.
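
To make the caching point concrete, here is a minimal sketch, assuming a hypothetical buildCollator step that stands in for the expensive setup. The ignoreCase option, the cache key, and the plain string comparison are placeholders for illustration; a real collator would use locale data:

```javascript
// Hypothetical stand-in for an expensive collator build step.
let collatorsBuilt = 0;
function buildCollator(code, options) {
  collatorsBuilt += 1;
  const norm = options.ignoreCase ? (s) => s.toLowerCase() : (s) => s;
  // Plain code-unit comparison; not locale-aware.
  return (a, b) => {
    const x = norm(a);
    const y = norm(b);
    return x < y ? -1 : x > y ? 1 : 0;
  };
}

function Locale(code) {
  this.code = code;
  // One cached service object per distinct options value.
  this.collatorCache = new Map();
}

Locale.prototype.compare = function (a, b, options = {}) {
  // Serialized options as cache key (sensitive to property order).
  const key = JSON.stringify(options);
  let collator = this.collatorCache.get(key);
  if (!collator) {
    collator = buildCollator(this.code, options);
    this.collatorCache.set(key, collator);
  }
  return collator(a, b);
};

const locale = new Locale("en-us");
locale.compare("foo", "bar");
locale.compare("bar", "foo");
console.log(collatorsBuilt); // the second call reuses the cached collator
```

Note that the cache here never evicts anything, so the caller has no way to say when a collator is no longer needed. That is the management burden that moves into the implementation with this style of API.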
>
> Mark
>
> *— Il meglio è l’inimico del bene —*
>
> [https://plus.google.com/114199149796022210033]
>
> On Mon, Nov 28, 2011 at 17:15, Nebojša Ćirić <cira at google.com> wrote:
>
> It's my fault - I read "discussed with Norbert" as if you and Norbert
> agreed on this approach.
>
> We are essentially choosing between two approaches:
>
> 1. Create an object, query its properties, use object methods (original
> approach)
>
> 2. Create a Locale object, call various methods like isSupportedXXX, and if
> the service is supported use formatYYY or compare
>
> They are very similar in what the user has to do (query capabilities, call
> formatters) to get the final result, so I don't think we would lose any
> functionality by picking either of them.
>
> With 1. we might end up having lots of isSupportedDate,
> isSupportedNumber... after a couple of iterations of adding new features
> (like segmentation, calendars, spell check...), but if we follow the same
> naming style I don't see a problem with that.
>
> As for the namespace issue, I don't see much difference between Locale and
> Globalization. We discussed Modules on the other thread and came up with:
>
>     Object.system.load('@g11n', callback()) {}
>
> or synchronous call
>
>     var global = Object.system.load('@g11n') {
>       return __Globalization__;
>     }
>
> This would become module global import '@g11n' in the future. This
> approach eliminates the need to find a proper name for the namespace (and
> possibly for Locale()?).
>
> Thank you for helping out. Your proposal is exactly what we need at this
> point - to help us refine our work and make it palatable to TC39 members :).
>
> On November 24, 2011 at 16:47, Nicholas C. Zakas
> <standards at nczconsulting.com> wrote:
>
> Again, my apologies - I didn't mean to imply that Norbert agreed with any
> of this, just that a few ideas have been more solidified in my mind after
> speaking with him.
>
> As a web developer who has built large-scale web sites that have been
> internationalized to dozens of countries, my main purpose in contributing
> to this discussion is to provide feedback on what I would have liked to see
> in such an API to make it useful to me.
>
> The current proposal doesn't feel very JavaScript-like, and so I've been
> trying to offer alternatives that make it more JavaScript-like and,
> therefore, more likely to be used by more developers. I'm a bit concerned
> that design decisions seem to have been guided by considering the most
> complex use cases instead of the most common.
>
> It is my opinion (and I can only speak for myself) that a single object to
> encompass this functionality would represent a better API for JavaScript
> than adding a namespace, which hasn't been done to this point, and several
> new types, all of which just do one thing. That's a very Java-like approach,
> and I think JavaScript deserves better.
>
> As I told Norbert, I'm very happy to lend my experience and insights to
> this process. I realize I may end up bringing things up that you all have
> discussed before - but considering that you did have a single Locale object
> at one point, I'd like to claim "great minds" think alike and continue
> discussing it. :)
>
> Happy Thanksgiving.
>
> -Nicholas
>
> On 11/24/2011 2:44 PM, Norbert Lindenberg wrote:
>
> I didn't agree with this approach, and Nicholas didn't claim that I did :-)
>
> I'm very glad though that Nicholas is taking the time to provide feedback,
> come up with his own ideas, and discuss them with us. In the end, the
> Globalization API can only be successful if people like him are comfortable
> using the API in their projects, and explain it to others so that they're
> comfortable doing so. Right now, the feedback from him, Rick, several TC39
> members, and others indicates significant discomfort, so we have some work
> to do. Some of that work may be changes to the API, but some may also be
> better explanation of how to use the API, directly from applications or in
> higher-level libraries.
>
> More after Thanksgiving.
>
> Norbert
>
>
> On Nov 23, 2011, at 15:15, Nebojša Ćirić wrote:
>
> On November 23, 2011 at 14:32, Nicholas C. Zakas
> <standards at nczconsulting.com> wrote:
> On 11/23/2011 12:57 PM, Nebojša Ćirić wrote:
>
> A similar approach was proposed (with locale as a top object, others under
> it) and I have nothing against it, but there are some issues with your
> approach:
>
> (code == localeID)
>
> Sorry for being unclear - I didn't intend for this to be a complete
> alternate proposal, just a starting point. There are definitely still
> issues that would have to be resolved.
>
> I just feel we are going in circles sometimes :). I am surprised Norbert
> agreed with this approach - I think he was against a top-level Locale object.
>
> 1. An implementation may support NumberFormat for localeID x, but not
> support DateFormat for x (it would need to do a fallback to a less specific
> one, or the default). That's why we have a supportedLocalesOf method on each
> object.
>
> So what you're saying is that there needs to be some way to feature detect
> support for number and date formats separately. That could be handled in
> any number of ways. One that pops to mind would be
> isDateFormatSupported()/isNumberFormatSupported() as an instance method.
>
> That would probably work. We could add more methods in the future - say
> one that tells you the closest locale to the current one that does support
> the service in question.
>
> 2. How do you convey the status of option/locale resolution to the developer?
> Which options were resolved and to what value (say I ask for the 'islamic'
> calendar, but we only have 'gregory' for a given locale)? In our current
> proposal we expose a resolvedOptions accessor on, e.g., a DateTimeFormat
> object instance that has a 'calendar' property, so a developer can decide
> what to do.
>
> Thanks, I was having trouble understanding what resolvedOptions was used
> for. Could the use case be handled by having a similar object on a Locale
> instance? It seems like you could include options for available calendars
> and anything else that developers could query against, such as:
>
>
>     var locale = new Locale();
>     if (locale.supportedOptions.islamicCalendar){
>         //foo
>     }
>
> You could also go a more traditional direction (at least in terms of DOM
> objects), by doing something like:
>
>     Locale.CALENDAR_ISLAMIC = 1;
>     Locale.CALENDAR_GREGORIAN = 2;
>
>
>     var locale = new Locale();
>     locale.isSupported(Locale.CALENDAR_ISLAMIC);
>
> I think feature detection is an easily solved problem if everything else
> is in place.
>
> Sometimes options can influence each other. For example:
>
> 1. Ask for th locale (Thai)
> 2. There are two calendars available - buddhist and gregory
> 3. There are two numbering systems available - thai and latin
>
> but only buddhist + thai and gregory + latin combinations are supported.
>
> If you ask locale.isSupported('calendar', 'buddhist') you'll get true. If
> you ask locale.isSupported('numberingSystem', 'latin') you'll get true
> again. If you try to format a date using that combination (buddhist + latin)
> you'll get something you didn't expect.
>
> I would propose a slightly different isSupported method:
>
> locale.returnSupported(serviceName, options), where serviceName is one of
> 'dateFormat', 'numberFormat', 'collator', the options object contains the
> requested settings (calendar, numbering system, collation options...) and
> the method returns an object with the supported features for a given service.
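
As a sketch of how that could behave for the Thai example above: the data table, the free-function form that takes the locale code as its first argument, and the "most matching options wins, ties go to the first entry" rule are all assumptions for illustration only.

```javascript
// Hypothetical data: the option combinations an implementation supports
// per service and locale. Not taken from any real data set.
const SUPPORTED = {
  dateFormat: {
    th: [
      { calendar: "buddhist", numberingSystem: "thai" },
      { calendar: "gregory", numberingSystem: "latin" },
    ],
  },
};

// Sketch of returnSupported. Picks the supported combination that matches
// the most requested options; ties go to the earlier entry.
function returnSupported(code, serviceName, options) {
  const combos = (SUPPORTED[serviceName] || {})[code] || [];
  let best = null;
  let bestScore = -1;
  for (const combo of combos) {
    let score = 0;
    for (const key of Object.keys(options)) {
      if (combo[key] === options[key]) score += 1;
    }
    if (score > bestScore) {
      best = { ...combo }; // copy so callers cannot mutate the table
      bestScore = score;
    }
  }
  return best; // null when the service or locale is unknown
}

const resolved = returnSupported("th", "dateFormat", {
  calendar: "buddhist",
  numberingSystem: "latin",
});
console.log(resolved);
```

Asking for buddhist + latin returns a whole supported combination rather than two independent "true" answers, so the caller can see that the numbering system was not honored before formatting anything.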
>
> 3. This approach would require internal caching of
> collator/dateformatter/numberformatter objects.
>
> That's an implementation detail. I'm more interested in defining a usable
> and relatively intuitive API before worrying about optimization.
>
> I agree, it's an implementation detail, but you will need to pass the format
> parameter to each call of the .format() method. I don't think that's a big
> problem. Also I would move the format and option parameters to the last
> position (can be optional), so that the user can specify only the value and
> rely on defaults.
>
> On November 23, 2011 at 12:09, Nicholas C. Zakas
> <standards at nczconsulting.com> wrote:
> After meeting with Norbert to discuss the use cases and design decision
> rationale, I've come to a different understanding of the goals of the
> globalization API. Some things I learned:
>
> 1. Augmenting native types with some default locale support may be
> dangerous. Consider the case where a single web page displays two modules
> with different locales. Which one wins? Therefore, "default" locale
> behavior for native types is impractical.
> 2. Locale information is most frequently used for formatting numbers and
> dates as well as comparing strings. The locale information doesn't permeate
> the entire execution context.
> 3. Developers are likely to want to define locale information once and
> then reuse that multiple times through a script.
>
> Given this, I'd like to propose an alternate approach to the one currently
> taken in the API and also different from my initial email. It goes like
> this:
>
> Have a single, top-level type called Locale defined as:
>
>    function Locale(code){
>
>        //whatever has to happen to process the code
>
>        this.code = code;
>    }
>
>    /*
>     * Determine if a locale is supported.
>     * @param code The code to check.
>     * @return True if supported, false if not.
>     */
>    Locale.isLocaleSupported = function(code){
>        ...
>    };
>
>    /*
>     * Replaces supportedLocalesOf
>     * @param code The code to check.
>     * @return Array of supported locales.
>     */
>    Locale.getSupportedLocales = function(code){
>        ...
>    };
>
>    /*
>     * Replaces Globalization.Collator
>     * @param a The first item.
>     * @param b The second item.
>     * @param options (Optional) The options to use when comparing.
>     * @return -1 if a comes before b, 0 if they're equal, 1 otherwise
>     */
>    Locale.prototype.compare = function(a, b, options){
>        ...
>    };
>
>    /*
>     * Replaces Globalization.NumberFormat
>     * @param format A pattern format string for outputting the number.
>     * @param value The value to format.
>     * @return The number formatted as a string.
>     */
>    Locale.prototype.formatNumber = function(format, value){
>        ...
>    };
>
>    /*
>     * Replaces Globalization.DateFormat
>     * @param format A pattern format string for outputting the date.
>     * @param value The date to format.
>     * @return The date formatted as a string.
>     */
>    Locale.prototype.formatDate = function(format, value){
>        ...
>    };
>
> You would then be able to create a single Locale instance and have that be
> used in your script. If the constructor is used without an argument, then
> default locale information is used:
>
>    var locale = new Locale();
>
> If you provide a code, then that is used:
>
>    var locale = new Locale("en-us");
>
> If you provide multiple codes, then the first supported one is used:
>
>    var locale = new Locale(["en-us", "en-gb"]);
>
> Then, you can use that locale information for the other operations you
> want to do:
>
>    locale.formatDate("DMYs-short", new Date());
>    locale.formatNumber("##.##", 55);
>    locale.compare("foo", "bar");
>
> By the way, not saying this is the format pattern string that should be
> used, it's just for discussion.
>
> I like having a single object to deal with instead of multiple for
> everything the API is trying to do. It seems a lot more intuitive than
> needing to manage a LocaleList that is passed into new instances of
> NumberFormat and DateFormat all the time (that's a bunch of housekeeping
> for developers).
>
> Thoughts?
>
> -Nicholas
>
>
>
>
> On 11/21/2011 11:12 AM, Nicholas C. Zakas wrote:
> As promised, more verbose feedback for the Globalization API. My general
> feeling is that the API is overly verbose for what it's doing. I'll state
> my bias up front: I'm not a fan of introducing a bunch of new types to
> handle formatting. I'd much rather have additional methods that perform
> formatting on existing objects. My feedback is mostly about eliminating the
> new constructors - which has an added bonus of eliminating the
> Globalization namespace because there would be only one constructor left:
> Collator.
>
> 1. LocaleList
>
> I'm not sure why this type is necessary. I don't believe that locale
> resolution is an expensive operation, and even if it is, I'd expect the
> implementation to cache the results of such resolution for later use. I'd
> just leave this as an internal construct and instruct developers to use
> arrays all the time.
>
> 2. supportedLocalesOf
>
> I find this method name strange - I've read it several times and am still
> not sure I fully understand what it does. Perhaps "getSupportedLocales()"
> is a better name for this method? (I always prefer methods begin with
> verbs.)
>
> 3. NumberFormat
>
> Number formatting seems simple enough that it could just be added as a
> series of methods on Number.prototype. The three types of formatting
> (currency, decimal, percent) could each have their own method. Currency
> formatting has relatively few options to specify, so its method can be:
>
>    /*
>     * Formats the number as if it were currency
>     * @param code Currency code, e.g., "EUR"
>     * @param type (Optional) The way to format the currency code, "code",
>     *     "symbol" (default),
>     * @param locales - (Optional) Array of locales to use.
>     */
>    Number.prototype.toCurrencyString = function(code, type, locales) {
>        ...
>    };
>
>    var num = 500;
>    console.log(num.toCurrencyString("EUR", "code"));    //"EUR 500.00"
>
>
> Decimal and percent formatting options are slightly different in that they
> include significant digits options. For that, I prefer to use a formatting
> string rather than the multitude of optional properties as currently
> defined (see http://www.exampledepot.com/egs/java.text/FormatNum.html).
> The formatting string indicates must-have digits as 0 and optional digits
> as #, allowing you to very succinctly specify how you want your number to
> be output. For example:
>
>    /*
>     * Formats the number as a decimal string.
>     * @param format Format string indicating max/min significant digits
>     * @param locales (Optional) Array of locales to use.
>     */
>    Number.prototype.toDecimalString = function(format, locales){
>        ...
>    };
>
>    /*
>     * Formats the number as a percent string.
>     * @param format Format string indicating max/min significant digits
>     * @param locales (Optional) Array of locales to use.
>     */
>    Number.prototype.toPercentString = function(format, locales){
>        ...
>    };
>
>    var num = 1234.567;
>    console.log(num.toDecimalString("000##.##"));    //"01234.57"
>
> 4. DateTimeFormat
>
> As with NumberFormat, it seems like this could more succinctly be
> implemented as a method on Date.prototype. At its easiest:
>
>    /*
>     * Format a date
>     * @param options The already-defined options for DateTimeFormat
>     * @param locales (Optional) Array of locales to use.
>     */
>    Date.prototype.toFormatString = function(options, locales){
>        ...
>    };
>
> In an ideal world, I'd like to see options overloaded so it can be an
> options object as specified now or a formatting string. I understand that
> there was a sentiment against formatting strings due to their limitations
> and edge case errors. However, I'd like to point out that any
> internationalized web application is highly likely to already be using
> formatting strings for dates, since this is pretty much how every other
> language handles date formatting. That means supporting format strings in
> JavaScript would allow application developers to reuse the settings they
> already have. As it stands now, you'd need to create two different ways of
> formatting dates for a web app: one for your server-side language and one
> for your client-side language (until the day everything is running on
> Node.js, of course). I'd prefer my client-side code to reuse settings and
> configuration that the server-side code uses, otherwise I end up with two
> very different pieces of code doing the exact same thing, and there be
> dragons.
>
> -Nicholas
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
>
> --
> Nebojša Ćirić