Collation API not complete for search

Mike Samuel mikesamuel at gmail.com
Fri Mar 25 14:15:00 PDT 2011


2011/3/25 Nebojša Ćirić <cira at google.com>:
> find method wouldn't return boolean but an array of two values:

Sorry if I wasn't clear.  The !! at the beginning of the call to find
is important: it coerces whatever find returns to a boolean.
The undefined value you mention below as a possible no-match result is
falsey because !!undefined === false.
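
As a minimal sketch of that coercion (the variable names are only for
illustration and are not part of any proposed API), here is how !! treats
the candidate return values discussed in this thread. Note that an array
such as [-1, undefined] is truthy, so the boolean test only distinguishes
"not found" if find returns undefined (or another falsey value) in that case.

```javascript
// Candidate results discussed in this thread (illustrative values only).
const noMatchUndefined = undefined;     // "just undefined" for not found
const noMatchArray = [-1, undefined];   // array form of the not-found state
const match = [2, 5];                   // a [first, last] match

// !! coerces any value to a boolean.
const coerced = {
  undefinedResult: !!noMatchUndefined,  // false
  arrayNoMatch: !!noMatchArray,         // true: every array is truthy
  match: !!match,                       // true
};

console.log(coerced);
```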

> myCollator.find('gaard', 'ard', 2) -> [2, 5]  // 4 or 5 as a bound
> myCollator.find('ard', 'ard', 0) -> [0, 3]  // 2 or 3 as a bound
> I guess [2, 5] !== [0, 3]

True, but also [2, 5] !== [2, 5], since === compares arrays by identity,
not by contents.
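
A short sketch of why comparing the raw results says nothing about whether
both calls matched. The helper sameBounds is hypothetical, introduced here
only to show what an element-wise comparison would look like.

```javascript
const a = [2, 5];
const b = [2, 5];

// Two distinct array objects are never === to each other,
// whatever they contain.
const sameObject = a === b;      // false
const sameReference = a === a;   // true

// Hypothetical helper: compare two [first, last] results element by element.
function sameBounds(x, y) {
  return x.length === y.length && x.every((value, i) => value === y[i]);
}

console.log(sameObject, sameReference, sameBounds(a, b));
```

Coercing both sides with !! sidesteps the question, because it only asks
whether each call found something.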

> We could return [-1, undefined] for not found state, or just undefined.

> I agree that returning a boolean makes for easier tests in loops.
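
To make the loop concern concrete, here is a rough sketch under loudly
stated assumptions. It is a toy model, not ICU and not the proposed
LocaleInfo.Collator: toElements, find and matches are all hypothetical, and
the only "collation rule" is that "aa" is a single element equal to "å".
It shows how the relation quoted below can fail, and how an iterator keeps
the index bookkeeping out of user code.

```javascript
// Toy model only: split a string into collation elements,
// treating "aa" as one element equal to "\u00e5" (å).
function toElements(str) {
  const elements = [];
  let i = 0;
  while (i < str.length) {
    if (str.startsWith('aa', i)) {
      elements.push({ key: '\u00e5', start: i, end: i + 2 });
      i += 2;
    } else {
      elements.push({ key: str[i], start: i, end: i + 1 });
      i += 1;
    }
  }
  return elements;
}

// Hypothetical find: returns [first, last], or undefined when nothing matches.
// A match must begin on an element boundary at or after index.
function find(string, substring, index) {
  const hay = toElements(string);
  const needle = toElements(substring).map((e) => e.key);
  if (needle.length === 0) return undefined; // toy choice for empty input
  for (let h = 0; h + needle.length <= hay.length; h++) {
    if (hay[h].start < index) continue;
    if (needle.every((key, n) => hay[h + n].key === key)) {
      return [hay[h].start, hay[h + needle.length - 1].end];
    }
  }
  return undefined;
}

// Hypothetical iterator form: the caller never handles the index.
function* matches(string, substring) {
  let index = 0;
  let result;
  while ((result = find(string, substring, index)) !== undefined) {
    yield result;
    index = result[1]; // continue after the end of the previous match
  }
}

// Index 2 of 'gaard' falls inside the "aa" element, so nothing is found,
// while the same search over the sliced string succeeds.
console.log(find('gaard', 'ard', 2));               // undefined
console.log(find('gaard'.substring(2), 'ard', 0));  // [ 0, 3 ]
console.log([...matches('gaard og g\u00e5rd', 'g\u00e5rd')]);
```

In this toy model a loop that slices the string and restarts at 0 would
report a match that a search over the whole string rejects, which is the
kind of subtly wrong loop the iterator form avoids.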


> On March 25, 2011, at 14:00, Mike Samuel <mikesamuel at gmail.com> wrote:
>>
>> 2011/3/25 Nebojša Ćirić <cira at google.com>:
>> > Looking through the notes from the meeting I also found some problems
>> > with the collator. We did specify the collatorType: search, but we
>> > didn't offer a function that would make use of it. Mark and I are
>> > thinking about:
>> > /**
>> >  * string - string to search over.
>> >  * substring - string to look for in "string"
>> >  * index - start search from index
>> >  * @return {Array} [first, last] - first is index of the match or -1,
>> >  * last is end of the match or undefined.
>> >  */
>> > LocaleInfo.Collator.prototype.find(string, substring, index)
>> > We could also opt for iterator solution where we keep the state.
>>
>> Assuming find returns a falsey value when nothing is found, is it the
>> case that for all (string, index) pairs,
>>
>> !!myCollator.find(string, substring, index) ===
>> !!myCollator.find(string.substring(index), substring, 0)
>>
>> This would be false if the substring 'ard' should be found in 'gard',
>> but not 'gaard' because then
>>
>>     !!myCollator.find('gaard', 'ard', 2) !== !!myCollator.find('ard', 'ard', 0)
>>
>>
>> If that relation does not hold, then exposing find as an iterator
>> might help prevent a profusion of subtly wrong loops.
>>
>>
>> > The reason we need to return both the begin and end of the found
>> > string is: look for gaard and we find gård, which may be equivalent
>> > in Danish, but the substring lengths don't match (5 vs. 4), so we
>> > need to tell the user the next index position.
>> > The other problem Jungshik found is that there is a combinatorial
>> > explosion with all the ignoreXXX options we defined. My proposal is
>> > to define only the N that make sense (and can be supported by all
>> > implementors) and fall back to some predefined default for the rest.
>>
>>
>>
>> > --
>> > Nebojša Ćirić
>> >
>
>
>
> --
> Nebojša Ćirić
>

