Gloda and large IMAP stores (was: Re: Worthwhile Thunderbird projects/addons?)

Ben Bucksch ben.bucksch at beonex.com
Wed Dec 22 17:17:57 UTC 2010


On 21.12.2010 21:43, Andrew Sutherland wrote:
> 2) Gloda disk space usage.  This can be improved without too much work 
> by improving the fulltext search tokenizer to avoid indexing stopwords

(i.e. don't index "the", "a", "of".) Great idea.
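
To make the idea concrete, here is a minimal Python sketch of stopword
filtering. It is not gloda's actual tokenizer: the stopword list, the
regex and the function name are all made up for illustration.

```python
import re

# Hypothetical stopword list; a real one would be longer and language-aware.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in"})

# Runs of letters/digits count as one token (assumption for this sketch).
_WORD = re.compile(r"[^\W_]+", re.UNICODE)


def index_terms(text, stopwords=STOPWORDS):
    """Return the lowercased tokens worth indexing, skipping stopwords."""
    tokens = (m.group(0).lower() for m in _WORD.finditer(text))
    return [tok for tok in tokens if tok not in stopwords]


print(index_terms("The size of a Gloda index"))
```

The same filter would have to be applied to search queries too,
otherwise a query containing "the" could never match anything.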

> 3) Gloda databases and roaming storage.  This is a flat-out difficult 
> problem.

Why is it difficult? I think Tanstaafl is talking about Windows roaming 
profiles. Windows has 2 ApplicationData profile folders, one that roams 
(the default) and one that doesn't (the "Local" one). The Mozilla 
directory service supports both, so it's just a matter of changing the 
default gloda location.

IMHO, anything that is a cache should be in the Local dir. Currently, 
gloda is a cache (not authoritative, can be reconstructed from other 
data), and it's big. So, it should just be stored in that other folder, 
and that's it.
This would also be an advantage for backup (given that the gloda file 
changes a lot and is big), not just roaming.
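
Roughly what I mean, as a Python sketch outside of Mozilla code. It 
assumes the APPDATA / LOCALAPPDATA environment variables (the latter is 
not set on older Windows versions) and a made-up profile path; inside 
Thunderbird this would go through the directory service instead.

```python
import ntpath


def gloda_dir(env, profile_rel):
    """Pick the directory for a rebuildable cache such as the gloda index.

    env: mapping of environment variables (e.g. os.environ).
    profile_rel: profile path relative to the application-data root
                 (hypothetical layout, for illustration only).
    """
    # Prefer the non-roaming "Local" folder; fall back to the roaming
    # folder only if the local one is not available.
    root = env.get("LOCALAPPDATA") or env["APPDATA"]
    return ntpath.join(root, profile_rel)


example_env = {
    "APPDATA": r"C:\Users\ben\AppData\Roaming",
    "LOCALAPPDATA": r"C:\Users\ben\AppData\Local",
}
print(gloda_dir(example_env, r"Thunderbird\Profiles\x.default"))
```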

If gloda ever stores important data (data which exists only in gloda 
and not on e.g. IMAP), then IMHO it should go in a different sqlite 
file, which also helps with backup etc. You already mention that below, 
that's good.
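
A small sketch of that split, using Python's sqlite3 module and 
SQLite's ATTACH DATABASE. The table names and the "userdata" alias are 
invented for the example and are not gloda's schema.

```python
import sqlite3


def open_stores(cache_path, userdata_path):
    """Open the rebuildable index and attach a separate file for data
    that exists nowhere else. File and table names are hypothetical."""
    conn = sqlite3.connect(cache_path)
    conn.execute("ATTACH DATABASE ? AS userdata", (userdata_path,))
    # Rebuildable from the mail store: safe to delete or skip in backups.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS main.message_index "
        "(message_id TEXT PRIMARY KEY, subject TEXT)"
    )
    # Authoritative: lives only here, so this file must be backed up.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS userdata.notes "
        "(message_id TEXT PRIMARY KEY, note TEXT)"
    )
    conn.commit()
    return conn
```

With real file paths, the cache file could then live in the Local dir 
and be thrown away at any time, while the small user-data file stays in 
the roaming profile and in the backup.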

>> What I'd like to see is the IMAP stores moved to the Local Folders by
>> default

Exactly.

> Although I am hoping/planning to shard gloda's data into separate 
> databases

Great!

> I don't think we would break things along account lines ... If someone 
> has completely segregated home/work setups, I could see this being 
> useful.  Of course, that would also be an argument for the user just 
> using separate profiles, most likely by using separate user accounts 
> as provided by their OS.

Just FYI: Using separate OS profiles for work and private use is not 
realistic; it's almost impossible to work that way. Separate email 
accounts for the two, on the other hand, are not just realistic, but 
very common.

> This is one of those unpleasant situations where many platforms are at 
> odds with each other because they don't completely overlap which 
> results in redundantly overlapping features.  Thunderbird is 
> cross-platform and cross-server so it can't leverage operating 
> system/environment-specific or server-specific mechanisms if it wants 
> to benefit all its users.  Likewise, it makes sense for GNOME to build 
> their own thing that is not Thunderbird specific, and for Dovecot to 
> do things that are not Thunderbird specific.

Indeed. What can help here, to some degree (if the overlap is fairly 
exact), are standard APIs. For example, Beagle could use a standard API 
to search Thunderbird; Thunderbird then searches its store on behalf of 
Beagle and just returns the results. Similarly with server-side search 
and Thunderbird: If the server can search fast enough, and with enough 
meta-data, we wouldn't need the client-side store. The problems arise 
where the overlaps are close, but not exact, or semantically roughly the 
same, but technically very different. As it is, gloda is far, far ahead 
of the (old-style) IMAP search feature, which isn't even indexed.

> This is one of the reasons I have been pushing for and trying to port 
> as much of our logic to reasonably well-abstracted JavaScript.

Thank you for that. Gloda has an awesome design.

Ben


