post TB 3.1 mailnews backend plans
Pidgeot18 at gmail.com
Sun Jul 18 00:58:38 UTC 2010
This one is in response to the Mork->SQLite conversation:
> Asuth has suggested using a very simple schema, where each row
> consists of a key and a json-like blob representation of a msg hdr.
A new database would need the following things:
1. Easy to add message metadata.
2. Different kinds of keys for account types. Message keys may be
defined as uint32s, but not all account types can map to a uint32. For
example, Giganews has moved to 64-bit keys for binary newsgroups (!),
and my webforum work may need to eventually key off of thread URLs
instead of being able to use an integer. For new account types, then,
being able to look up by the "real" keys may be useful. This need not be
in the main message table though; an auxiliary table is probably fine.
In any case, what exactly constitutes a message key may need to be
3. Likewise, messages are not going to be only searched for by message
keys. Subjects and message-IDs are also pseudo-canonical mappings for
messaging, and searching by the From header is probably useful for aggregation.
Finally, a date column may be useful for limiting queries (e.g., delete
all messages older than 90 days, get the last 10 days of messages).
Views also rely on some flags (particularly, unread, killed, and watched).
4. For many account types, I think threads or conversations (which is
essentially a thread without hierarchy) are more important than
individual messages. That is to say, in my web forums work, I need to
key some stuff off of threads instead of messages. Any new database
schema needs to be able to attach metadata to threads.
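To make the four requirements above a bit more concrete, here is a rough sketch using Python's sqlite3 module. Every table and column name in it (threads, messages, real_keys, header, and so on) is my own invention for illustration, not anything in the existing code or in asuth's proposal. It assumes a JSON blob for extensible metadata, a few indexed columns for the common lookups, an auxiliary table for account-specific "real" keys, and a per-thread metadata blob.

```python
import json
import sqlite3

# Hypothetical schema: none of these names come from the Thunderbird tree.
SCHEMA = """
CREATE TABLE threads (
    thread_id INTEGER PRIMARY KEY,
    metadata  TEXT NOT NULL DEFAULT '{}'  -- JSON blob of thread-level metadata
);
CREATE TABLE messages (
    msg_key    INTEGER PRIMARY KEY,       -- internal integer key
    thread_id  INTEGER REFERENCES threads(thread_id),
    message_id TEXT,
    subject    TEXT,
    sender     TEXT,
    date       INTEGER,                   -- seconds since the epoch
    flags      INTEGER NOT NULL DEFAULT 0,
    header     TEXT NOT NULL              -- JSON blob holding everything else
);
CREATE INDEX messages_by_message_id ON messages(message_id);
CREATE INDEX messages_by_date ON messages(date);
CREATE TABLE real_keys (
    real_key TEXT PRIMARY KEY,            -- 64-bit article number, thread URL, ...
    msg_key  INTEGER NOT NULL REFERENCES messages(msg_key)
);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_message(db, real_key, thread_id, message_id, subject, sender, date, **extra):
    """Store one header; unknown metadata goes into the JSON blob."""
    cur = db.execute(
        "INSERT INTO messages (thread_id, message_id, subject, sender, date, header)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (thread_id, message_id, subject, sender, date, json.dumps(extra)),
    )
    db.execute(
        "INSERT INTO real_keys (real_key, msg_key) VALUES (?, ?)",
        (real_key, cur.lastrowid),
    )
    return cur.lastrowid


def lookup_by_real_key(db, real_key):
    """Map an account-specific key back to the internal key, or None."""
    row = db.execute(
        "SELECT msg_key FROM real_keys WHERE real_key = ?", (real_key,)
    ).fetchone()
    return row[0] if row else None


def delete_older_than(db, cutoff):
    """Delete messages dated before cutoff; return how many were removed."""
    db.execute(
        "DELETE FROM real_keys WHERE msg_key IN"
        " (SELECT msg_key FROM messages WHERE date < ?)",
        (cutoff,),
    )
    return db.execute("DELETE FROM messages WHERE date < ?", (cutoff,)).rowcount
```

Adding a new piece of message metadata here is just another keyword argument to add_message, with no schema change; whether that is fast enough for views that sort on such metadata is an open question.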
> We'd probably have one sqlite db per account, though gloda seems to be
> able to get everything into one db.
Per-account DBs open up the following questions:
1. Would it be possible to get rid of the message folder cache? Opening
up 100 .msf files just to get folder infos is untenable, but I don't think
even the heaviest power users have more than two dozen or so accounts. How
expensive is opening up a large SQLite database to get a single table?
Alternatively, the reference data for the folder info could be moved
into a single file.
2. If metadata is moved to an account level, it is possible to
deemphasize folder structure for those accounts which lack strong
concepts of folders.
3. Furthermore, this makes it possible to create messaging equivalents
of symbolic or hard links. Crossposting desperately wants this kind of
linking functionality, at the very least. Similarly, this would make it
easier to create "tag folders" while reusing most of the folder
hierarchy. It also seems possible to implement intra-account virtual
folders as symlinked folders, with a clever enough listening structure.
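As a sketch of what account-level metadata and hard links could look like, the following assumes a per-account SQLite database with a link table between folders and messages. The names folders, messages, and folder_links are hypothetical, as are all the helper functions. A crossposted message is stored once and linked into every newsgroup it belongs to, and folder counts come from one query instead of one file per folder.

```python
import sqlite3

# Hypothetical schema for one account; names are illustrative only.
SCHEMA = """
CREATE TABLE folders (
    folder_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE messages (
    msg_key INTEGER PRIMARY KEY,
    subject TEXT
);
-- One row per (folder, message) pair: a message stored once
-- can appear in any number of folders.
CREATE TABLE folder_links (
    folder_id INTEGER NOT NULL REFERENCES folders(folder_id),
    msg_key   INTEGER NOT NULL REFERENCES messages(msg_key),
    PRIMARY KEY (folder_id, msg_key)
);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_folder(db, name):
    return db.execute("INSERT INTO folders (name) VALUES (?)", (name,)).lastrowid


def add_message(db, subject, folder_names):
    """Store the message once and link it into each named folder."""
    msg_key = db.execute(
        "INSERT INTO messages (subject) VALUES (?)", (subject,)
    ).lastrowid
    for name in folder_names:
        db.execute(
            "INSERT INTO folder_links (folder_id, msg_key)"
            " SELECT folder_id, ? FROM folders WHERE name = ?",
            (msg_key, name),
        )
    return msg_key


def folder_counts(db):
    """Folder name -> message count, read from a single database."""
    rows = db.execute(
        "SELECT f.name, COUNT(l.msg_key) FROM folders f"
        " LEFT JOIN folder_links l ON l.folder_id = f.folder_id"
        " GROUP BY f.folder_id ORDER BY f.name"
    )
    return dict(rows)


def unlink(db, msg_key, folder_name):
    """Remove one link; drop the message when no folder refers to it."""
    db.execute(
        "DELETE FROM folder_links WHERE msg_key = ? AND folder_id ="
        " (SELECT folder_id FROM folders WHERE name = ?)",
        (msg_key, folder_name),
    )
    db.execute(
        "DELETE FROM messages WHERE msg_key = ? AND NOT EXISTS"
        " (SELECT 1 FROM folder_links WHERE msg_key = ?)",
        (msg_key, msg_key),
    )
```

A "tag folder" in this model would just be another row in folders whose links are maintained by the tagging code rather than by the server.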
1. If we're modifying the database, one thing I would love to propose is
that we move threading to before the filtering step, thereby
allowing filters to modify thread information without causing the
database to choke. I would also love to see changes to filter action
models, but that's another story.
2. Database interactions would also need to be specified more clearly in
general. Going along with the synchronous versus asynchronous concerns,
some methods on auxiliary interfaces (like nsIMsgDBHdr) are immediately
reflected in the database and others are not. If we want to support
asynchronous access better, it may make sense to make nsIMsgDBHdr a snapshot of
3. Again, some methods have the option of not informing listeners of
database changes. Some account types need to synchronize database
changes externally (arguably, so does something like Thunderbird Sync),
so either non-informing needs to go away or some sort of superlistener
functionality needs to be added. There is one edge case which kind of
throws a wrench in things, and that is filtering (including Junk mail
filtering), or any new message processing step. Combined with item 1, I
think there is a need to redesign how new messages get handled,
especially when we want to start enticing extension authors to create
new account types.
4. As important as asynchronous versus synchronous concerns is being
able to sanely use the database from multiple threads. Import and biff
probably want to access it from a different thread (especially if people
set up body filters); offline message download also probably wants it,
but I don't know how much metadata access it needs.
5. Offline actions (for things which need to synchronize state to online
servers) and undo/redo actions probably want to be more generic and easy
for new account types to hook into. Seeing as a large portion of these
are essentially database actions, it may make sense to key these into
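To illustrate the "superlistener" idea from item 3, here is a toy sketch in Python. The MessageDatabase class, its add_listener and set_flags methods, and the always parameter are all hypothetical and do not correspond to any existing interface. It assumes that ordinary listeners can be skipped with notify=False, while listeners registered with always=True (say, the code that syncs state to a server) see every change.

```python
class MessageDatabase:
    """Toy in-memory flag store with two classes of listeners."""

    def __init__(self):
        self._flags = {}
        self._listeners = []       # may be skipped with notify=False
        self._superlisteners = []  # always informed of every change

    def add_listener(self, callback, always=False):
        target = self._superlisteners if always else self._listeners
        target.append(callback)

    def set_flags(self, msg_key, flags, notify=True):
        old = self._flags.get(msg_key, 0)
        self._flags[msg_key] = flags
        # Superlisteners cannot be bypassed, so external sync never
        # misses a change.
        for callback in self._superlisteners:
            callback(msg_key, old, flags)
        if notify:
            for callback in self._listeners:
                callback(msg_key, old, flags)


ui_events, sync_events = [], []
db = MessageDatabase()
db.add_listener(lambda key, old, new: ui_events.append((key, old, new)))
db.add_listener(lambda key, old, new: sync_events.append((key, old, new)),
                always=True)

db.set_flags(1, 4, notify=False)  # silent for the UI, still seen by sync
db.set_flags(1, 5)                # seen by both
```

The same hook would be a natural place to record offline or undo actions, though that is only a guess at how it might fit together.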
News submodule owner