Proposal to make gloda fulltext tokenizer treat '_' as punctuation without schema bump
tanstaafl at libertytrek.org
Tue Jul 17 10:34:50 UTC 2012
On 2012-07-17 6:23 AM, Gervase Markham <gerv at mozilla.org> wrote:
> On 17/07/12 01:27, Andrew Sutherland wrote:
>> I don't like bumping the gloda schema rev because it has the very bad UX
>> of "I upgraded Thunderbird and now Thunderbird is using a lot of my CPU
>> and if I do gloda searches right now, they might not find anything".
>> The argument for making the fix and not bumping the schema is that
>> treating underscores as part of the word is arguably messed up right now.
> Are there any other schema-breaking changes on the horizon which you
> could roll in to the same update? I'd say this one is worth waiting up
> to 6 months for if we can eliminate a second change later.
> Can we make the user more informed about what's happening - e.g. a
> "Database reindexing (X% complete)" status bar message?
As one who got bitten really, really badly when GLODA was first implemented
and screamed bloody murder -
(I had 20+ IMAP accounts, many with many GB of messages and many with
many dozens of folders, most of which were *not* set to offline mode but
with a very few carefully configured selective offline folders defined,
and when this change happened, all of my carefully/explicitly defined
offline settings were just trashed, *all* folders set to full offline
mode, and Thunderbird was ... well, it was just toast. It took me a
while googling to figure out what happened) -
Yes, some kind of status message would imho be very desirable for
*anything* that has a potential for impacting performance.
But wouldn't it be better to write the index/reindex code so that it
simply doesn't, and *can* not, consume all CPU cycles? Is there no way
to throttle it so that it never uses more than, say, 20%?
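
To make the throttling idea concrete, here is a rough sketch of a duty-cycle
limiter. It is not gloda's actual code; the function name, the batch size and
the injectable clock/sleep parameters are all made up for illustration. After
each batch it measures the CPU time the batch used and then sleeps long enough
that the busy share stays at or below the cap. It only approximates a cap on
one core and ignores time spent in other threads.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def index_throttled(
    items: Iterable[T],
    process: Callable[[T], None],
    max_cpu_fraction: float = 0.2,   # the "say 20%" cap (assumed default)
    batch_size: int = 50,            # hypothetical batch size
    clock: Callable[[], float] = time.process_time,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Process every item, pausing after each batch to stay under the cap.

    Returns the number of items processed.
    """
    if not 0 < max_cpu_fraction <= 1:
        raise ValueError("max_cpu_fraction must be in (0, 1]")

    # For a busy time b, sleeping b * (1/f - 1) makes b / (b + sleep) == f.
    idle_ratio = 1.0 / max_cpu_fraction - 1.0

    done = 0
    batch_start = clock()
    for item in items:
        process(item)
        done += 1
        if done % batch_size == 0:
            busy = clock() - batch_start
            if busy > 0:
                sleep(busy * idle_ratio)
            batch_start = clock()
    return done
```

With the default of 0.2, a batch that burns 0.5 s of CPU is followed by a 2 s
pause, so the indexer is busy for 0.5 s out of every 2.5 s. A real
implementation would probably also want to back off further while the user is
actively working, which this sketch does not attempt.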