Collation API not complete for search

Mike Samuel mikesamuel at gmail.com
Fri Mar 25 14:00:59 PDT 2011


2011/3/25 Nebojša Ćirić <cira at google.com>:
> Looking through the notes from the meeting I also found some problems with
> the collator. We did specify the collatorType: search, but we didn't offer a
> function that would make use of it. Mark and I are thinking about:
> /**
>  * string - string to search over.
>  * substring - string to look for in "string"
>  * index - start search from index
>  * @return {Array} [first, last] - first is index of the match or -1, last
>  *     is end of the match or undefined.
>  */
> LocaleInfo.Collator.prototype.find(string, substring, index)
> We could also opt for an iterator solution where we keep the state.

Assuming find returns a falsey value when nothing is found, does the
following hold for all (string, index) pairs?

!!myCollator.find(string, substring, index) ===
!!myCollator.find(string.substring(index), substring, 0)

It would not hold if the substring 'ard' should be found in 'gard' but
not in 'gaard' (where 'aa' collates as a single letter, as in Danish),
because then

     !!myCollator.find('gaard', 'ard', 2) !== !!myCollator.find('ard', 'ard', 0)
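
To make the counterexample concrete, here is a minimal sketch of a toy
matcher. It is not the proposed API: toyFind and toElements are
hypothetical names, and the only collation rule modelled is the
assumption that 'aa' forms one collation element equal to 'å'. It
returns null (falsey) when nothing is found, as assumed above.

```javascript
// Toy stand-in for the proposed find(); hypothetical, for illustration only.
// Assumption: 'aa' is a single collation element equal to 'å' (U+00E5).
function toElements(s) {
  const out = [];
  for (let i = 0; i < s.length; ) {
    if (s.startsWith('aa', i)) {
      out.push({ key: '\u00e5', start: i, end: i + 2 });
      i += 2;
    } else {
      out.push({ key: s[i], start: i, end: i + 1 });
      i += 1;
    }
  }
  return out;
}

// Returns [first, last] in code units of `string`, or null if nothing is found.
// An empty substring is treated as "not found" to keep the sketch simple.
function toyFind(string, substring, index) {
  const hay = toElements(string);
  const needle = toElements(substring);
  if (needle.length === 0) return null;
  for (let i = 0; i + needle.length <= hay.length; i++) {
    // A match may only begin at an element boundary at or after `index`.
    if (hay[i].start < index) continue;
    let ok = true;
    for (let j = 0; j < needle.length; j++) {
      if (hay[i + j].key !== needle[j].key) { ok = false; break; }
    }
    if (ok) return [hay[i].start, hay[i + needle.length - 1].end];
  }
  return null;
}

const lhs = !!toyFind('gaard', 'ard', 2);              // full string plus index
const rhs = !!toyFind('gaard'.substring(2), 'ard', 0); // sliced string
console.log(lhs === rhs); // false
```

Slicing throws away the first 'a', so the sliced call no longer sees
that the second 'a' was half of an 'aa' pair.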


If that relation does not hold, then exposing find as an iterator
might help prevent a profusion of subtly wrong loops.
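
One possible shape for such an iterator, as a sketch only: a generator
that always passes the full string plus an index to find and advances
by the reported end of the match, so callers never slice the string
themselves. findAll is a hypothetical name, and exactFind is a plain
code-unit search standing in for a real collator's find.

```javascript
// Hypothetical wrapper: yields every match reported by find(string, substring, index).
// `find` is assumed to return [first, last] or a falsey value.
function* findAll(find, string, substring) {
  let index = 0;
  while (index <= string.length) {
    const match = find(string, substring, index);
    if (!match) return;
    const [first, last] = match;
    yield { first, last };
    // Continue after the match; step by one on an empty match to avoid looping forever.
    index = last > first ? last : first + 1;
  }
}

// Stand-in for a collator's find: exact code-unit comparison, no collation rules.
const exactFind = (s, sub, i) => {
  const first = s.indexOf(sub, i);
  return first === -1 ? null : [first, first + sub.length];
};

const hits = [...findAll(exactFind, 'abcabc', 'bc')];
console.log(hits.length); // 2
```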


> The reason we need to return both begin and end part of the found string is:
> Look for gaard and we find gård - which may be equivalent in Danish, but
> the substring lengths don't match (5 vs. 4), so we need to tell the user
> the next index position.
> The other problem Jungshik found is that there is a combinatorial explosion
> with all the ignoreXXX options we defined. My proposal is to define only the
> N that make sense (and can be supported by all implementors) and have the
> rest fall back to some predefined default.
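
The length mismatch can be illustrated with a small sketch. The
[first, last] pair below is an assumed return value for a Danish search
collator, not something computed here; the point is only that
first + pattern.length lands in the wrong place.

```javascript
// Assumed scenario: searching 'gård i byen' for 'gaard' under Danish search rules.
const text = 'g\u00e5rd i byen';
const pattern = 'gaard';

// Hypothetical result of find(text, pattern, 0): the match spans 4 code units,
// although the pattern is 5 code units long.
const [first, last] = [0, 4];

const matched = text.slice(first, last);              // the text that actually matched
const rest = text.slice(last);                        // resumes right after the match
const naiveRest = text.slice(first + pattern.length); // skips one character too many

console.log(JSON.stringify(rest));      // " i byen"
console.log(JSON.stringify(naiveRest)); // "i byen"
```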



> --
> Nebojša Ćirić
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
>
>

