Proposal to make gloda fulltext tokenizer treat '_' as punctuation without schema bump

Tanstaafl tanstaafl at
Tue Jul 17 19:58:49 UTC 2012

On 2012-07-17 1:15 PM, Andrew Sutherland <asutherland at> 
> On 07/17/2012 03:34 AM, Tanstaafl wrote:
>> But wouldn't it be better to simply write the index/reindex code so
>> that it simply doesn't and *can* not consume all CPU cycles? Is there
>> no way to throttle it so that it never uses more than, say, 20%?
> The code does use adaptive scheduling to try and detect how much
> CPU/system time it is using, as well as to notice when the system
> appears to be under load (many thanks to rkent for this!) in order to
> limit its activities so it doesn't harsh the system. Unfortunately, this
> is a tricky thing to do given the limited platform facilities at hand
> and how much stuff happens and needs to happen on the main thread in
> Thunderbird. It is possible that virus checkers are making this much
> worse on Windows, but I don't have any hard numbers.
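
To make the throttling question concrete, here is a minimal sketch of a
fixed duty-cycle limiter. It is not gloda's scheduler: run_throttled,
process and max_duty are hypothetical names, and it only caps the
wall-clock share used by one worker. It does not notice load from other
programs, which is the harder part the adaptive scheduling described
above has to deal with.

```python
import time


def run_throttled(work_items, process, max_duty=0.20,
                  sleep=time.sleep, clock=time.perf_counter):
    """Process items one by one, idling after each so that busy time
    stays at roughly max_duty of elapsed wall-clock time.

    Hypothetical helper for illustration; returns total seconds idled.
    """
    if not 0.0 < max_duty <= 1.0:
        raise ValueError("max_duty must be in (0, 1]")
    total_idle = 0.0
    for item in work_items:
        start = clock()
        process(item)
        busy = clock() - start
        # busy / (busy + idle) <= max_duty  =>  idle >= busy * (1 - d) / d
        idle = busy * (1.0 - max_duty) / max_duty
        sleep(idle)
        total_idle += idle
    return total_idle
```

With max_duty=0.20, every 10 ms of work is followed by a 40 ms pause, so
a full reindex would take about five times longer than unthrottled.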

Yeah, AV is a huge unknown and one of the first things I always check 
myself when I'm having problems like this...

Hmmm... a side question then...

I can't imagine indexing just the headers would be very taxing. Why not
index only those by default and then, whenever someone does a body
search, pop up a box informing them that full-text searches would be
much faster with the body indexed, while warning that enabling body
indexing *will* cause temporary performance issues until the indexes
are finished building? I'd think this would be a relatively easy fix,
but I've been wrong before...
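
A rough sketch of what I mean, purely as illustration and not based on
how gloda actually stores things. Message, MessageIndex, index_bodies
and include_body are made-up names, and a single subject field stands in
for all the headers:

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: int
    subject: str  # stands in for all indexed headers (assumption)
    body: str


class MessageIndex:
    """Toy inverted index: headers always, bodies only after opt-in."""

    def __init__(self, index_bodies=False):
        self.index_bodies = index_bodies
        self.terms = {}  # term -> set of message ids

    def add(self, msg):
        text = msg.subject
        if self.index_bodies:
            text += " " + msg.body
        for term in text.lower().split():
            self.terms.setdefault(term, set()).add(msg.msg_id)

    def search(self, term, messages, include_body=False):
        """Return (matching ids, should_prompt)."""
        term = term.lower()
        hits = set(self.terms.get(term, ()))
        should_prompt = False
        if include_body and not self.index_bodies:
            # Slow path: scan every body, and signal that the
            # "enable body indexing?" prompt should be shown.
            hits |= {m.msg_id for m in messages
                     if term in m.body.lower().split()}
            should_prompt = True
        return hits, should_prompt
```

The body search still works before opt-in, it is just slow, and the
second return value tells the UI when to offer the faster indexed mode.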

Also, even though you may not be able to show accurate progress, you
could still have a very visible modeless pop-up box informing the user
that indexing is occurring, with a best-effort progress bar inside it
(but don't allow the user to close the window completely, only minimize
it, until the operation has finished)...

More information about the tb-planning mailing list