Collation API not complete for search

Nebojša Ćirić cira at google.com
Fri Mar 25 14:12:27 PDT 2011


The find method wouldn't return a boolean but an array of two values:

myCollator.find('gaard', 'ard', 2) -> [2, 5]  // 4 or 5 as a bound
myCollator.find('ard', 'ard', 0) -> [0, 3]  // 2 or 3 as a bound

I guess [2, 5] !== [0, 3]

We could return [-1, undefined] for the not-found state, or just undefined.

I agree that returning a boolean makes for easier tests in loops.
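
To make the [first, last] shape concrete, here is a rough sketch, not a
proposal for the real implementation. The fold helper and its single rule
(treating 'aa' like 'å') are made up for illustration and stand in for real
collation data; toyFind and findAll are hypothetical names, and last is taken
as the exclusive end (the 5 rather than the 4 above).

```javascript
// Hypothetical stand-in for locale-aware comparison: one made-up rule
// that treats 'aa' as equivalent to 'å'. Real collation is far richer.
function fold(s) {
  return s.normalize('NFC').replace(/aa/g, 'å');
}

// Brute-force toy version of find(string, substring, index).
// Returns [first, last) for the earliest match at or after index,
// or undefined when nothing matches.
function toyFind(string, substring, index) {
  const key = fold(substring);
  for (let first = index; first < string.length; first++) {
    for (let last = first + 1; last <= string.length; last++) {
      if (fold(string.slice(first, last)) === key) {
        return [first, last];
      }
    }
  }
  return undefined;
}

// Collect every match. The loop resumes at the returned end index
// instead of adding substring.length, because the matched text may be
// shorter or longer than the pattern.
function findAll(string, substring) {
  const found = [];
  let index = 0;
  let match;
  while ((match = toyFind(string, substring, index)) !== undefined) {
    found.push(match);
    index = match[1];
  }
  return found;
}

console.log(toyFind('gaard', 'ard', 2));        // [ 2, 5 ]
console.log(toyFind('ard', 'ard', 0));          // [ 0, 3 ]
console.log(findAll('gård og gaard', 'gaard')); // [ [ 0, 4 ], [ 8, 13 ] ]
```

With undefined as the not-found value, the pair is still usable directly as a
loop condition, and the 4-character match for the 5-character pattern shows
why the caller needs the end index back.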

On 25 March 2011 at 14:00, Mike Samuel <mikesamuel at gmail.com> wrote:

> 2011/3/25 Nebojša Ćirić <cira at google.com>:
> > Looking through the notes from the meeting I also found some problems with
> > the collator. We did specify the collatorType: search, but we didn't offer a
> > function that would make use of it. Mark and I are thinking about:
> > /**
> >  * string - string to search over.
> >  * substring - string to look for in "string"
> >  * index - start search from index
> >  * @return {Array} [first, last] - first is index of the match or -1, last
> >  * is end of the match or undefined.
> >  */
> > LocaleInfo.Collator.prototype.find(string, substring, index)
> > We could also opt for an iterator solution where we keep the state.
>
> Assuming find returns a falsey value when nothing is found, is it the
> case that for all (string, index) pairs,
>
> !!myCollator.find(string, substring, index) ===
> !!myCollator.find(string.substring(index), substring, 0)
>
> This would be false if the substring 'ard' should be found in 'gard',
> but not 'gaard' because then
>
>     !!myCollator.find('gaard', 'ard', 2) !== !!myCollator.find('ard', 'ard', 0)
>
>
> If that relation does not hold, then exposing find as an iterator
> might help prevent a profusion of subtly wrong loops.
>
>
> > The reason we need to return both begin and end part of the found string is:
> > Look for gaard and we find gård - which may be equivalent in Danish, but
> > substring lengths don't match (5 vs. 4) so we need to tell user the next
> > index position.
> > The other problem Jungshik found is that there is a combinatorial explosion
> > with all ignoreXXX options we defined. My proposal is to define only N that
> > make sense (and can be supported by all implementors) and fall back the rest
> > to some predefined default.
>
>
>
> > --
> > Nebojša Ćirić
> >
>
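
On the iterator option from the quoted text, a minimal sketch of what keeping
the state could look like, assuming find returns [first, last) or undefined.
The generator name matches and the exactFind stand-in (plain indexOf, no
collation at all) are hypothetical and only there to make the example run.

```javascript
// Stand-in for myCollator.find: exact matching only, via indexOf.
// Returns [first, last) or undefined.
function exactFind(string, substring, index) {
  const first = string.indexOf(substring, index);
  return first < 0 ? undefined : [first, first + substring.length];
}

// Generator that owns the search position, so callers never do their
// own index arithmetic between calls.
function* matches(find, string, substring) {
  let index = 0;
  while (index <= string.length) {
    const match = find(string, substring, index);
    if (match === undefined) {
      return;
    }
    yield match;
    // Resume at the end of the match; step by one on an empty match
    // so the loop always makes progress.
    index = match[1] > match[0] ? match[1] : match[1] + 1;
  }
}

for (const [first, last] of matches(exactFind, 'ard gaard', 'ard')) {
  console.log(first, last); // prints "0 3", then "6 9"
}
```

Because the caller never slices the string or computes the next index, it does
not matter whether searching from index and searching the sliced string agree.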



-- 
Nebojša Ćirić