Thunderbird/Addressbook - VUW Project introduction

Andrew Sutherland asutherland at asutherland.org
Sun Apr 3 00:16:54 UTC 2016


On Sat, Apr 2, 2016, at 10:10 AM, neandr wrote:
>  As you see with the project name, the project will have a strong
>  relation to Mike Conley's previous project, which he ran in 2013.
>  Unfortunately it was not finished, but it's great to have Mike on
>  board again for the University project. Please have a look at his
>  GitHub wiki[1], which has a revised approach now and will be an important
>  basis for the group of students at the School of Engineering and
>  Computer Science, Victoria University of Wellington.
 
For the persistent storage piece discussed in the wiki's high-level
overview, it may be worth considering using IndexedDB from the
SharedWorker instead of having to coordinate with and maintain
main-thread code using mozStorage.
 
A quick list of advantages:
* IndexedDB is not Gecko-specific; if Thunderbird migrated to
  Electron/whatever, the pieces in the SharedWorker would likely
  continue to work.
* IndexedDB is accessible to all workers as well as all main-thread
  document contexts.  While I think having the SharedWorker be the
  source-of-truth and performer-of-heavy-lifting is good architecture,
  this could be handy in some cases.
* IndexedDB persists values using the structured clone algorithm and has
  special handling for Blobs/Files that could be really handy for things
  like storing pictures or other large pieces of data that are only
  needed on-demand.
** For example, when reading a contact record that contains a picture
   Blob/File reference, only a reference to the File is loaded.  So you
   don't load the 100K image contents, nor do you need to re-serialize
   them when postMessage()ing the Blob to page contexts from the worker.
   In fact, you can just use URL.createObjectURL() to hand the Blob
   directly to an <img> tag (and then call URL.revokeObjectURL() once
   the load event fires).
** This kind of thing would arguably be a nightmare to deal with
   manually using SQLite/mozStorage.  If you read the IndexedDB source
   code, a large amount of it is Blob handling, reference counting, and
   similar bookkeeping to make sure the files don't get orphaned, etc.
* IndexedDB now has some string collation smarts so if you're doing some
  of the searching using the database, it can help you with that.
* There's some prior art from the Firefox OS mozContacts implementation
  that can be used for ideas.  See
  https://dxr.mozilla.org/mozilla-central/source/dom/contacts/fallback/ContactDB.jsm
  for its IndexedDB bits and
  https://developer.mozilla.org/en-US/docs/Mozilla/Firefox_OS/API/Contacts_API
  for the API documentation.  That API is going away and has a variety
  of lessons learned (for example: don't try to be everything to
  everyone, and don't maintain N indices that may or may not be useful
  to the apps that want to consume it), so don't depend on the existing
  implementation or treat it as perfect!
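
To make the picture Blob point from the list above concrete, here is a
minimal sketch of the create-then-revoke pattern.  The function name
showContactPicture and the two small interfaces are hypothetical and
exist only so the sketch does not depend on DOM typings; in a page you
would pass a real <img> element and the global URL object.

```typescript
// Hypothetical minimal shapes; a real HTMLImageElement and the browser's
// global URL object are assumed to be passed in their place.
interface ObjectUrlApi<B> {
  createObjectURL(blob: B): string;
  revokeObjectURL(url: string): void;
}

interface ImageLike {
  src: string;
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean }
  ): void;
}

// Point the image at the Blob via an object URL, and release that URL
// once the image has finished loading.
function showContactPicture<B>(
  img: ImageLike,
  picture: B,
  urlApi: ObjectUrlApi<B>
): string {
  const objectUrl = urlApi.createObjectURL(picture);
  img.addEventListener("load", () => urlApi.revokeObjectURL(objectUrl), {
    once: true,
  });
  img.src = objectUrl;
  return objectUrl;
}
```

In a page context the call would look something like
showContactPicture(imgElement, contact.picture, URL), where
contact.picture is assumed to be the Blob read back from IndexedDB.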
 
Things that might seem like disadvantages but aren't really:
* Gecko's SQLite does build with the Full-Text Search (FTS) extension
  enabled, but its tokenizing infrastructure is limited and really the
  main benefit is just that the inverted-index is very cleverly encoded
  so there's a fair amount of compression.  The address book's contents
  will likely constitute several orders of magnitude less data than mail
  full-text search, so I don't think this matters that much.  The
  smaller data set also makes tricks feasible, like generating
  prefixes/substrings of names/etc. so that IndexedDB range queries can
  be used for search.
* SQLite does allow for cool SQL joins and such, but 1) clever
  indexes/specialized object stores can accomplish that for IndexedDB
  and 2) gloda exists and already understands multiple email addresses
  per contact and all that, so gloda can probably be used in those cases
  already or with some minor enhancements to add some attributes to the
  addressbook indexer, etc.
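
The prefix/substring trick mentioned in the FTS bullet can be sketched
roughly as follows.  This is only an illustration under assumptions: the
helper names (tokenize, searchKeys, prefixBounds, matches) are made up,
and it assumes the generated keys would be stored as an array on each
contact record and indexed with a multiEntry index.  Storing every
suffix of each token means a "starts with" range query over the keys
behaves like a substring search over the original name.

```typescript
// Split a name or address into lower-cased tokens.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s,;<>()"]+/)
    .filter((token) => token.length > 0);
}

// Every suffix (of at least minLength characters) of every token,
// de-duplicated and sorted by UTF-16 code unit.
function searchKeys(text: string, minLength = 1): string[] {
  const keys: string[] = [];
  for (const token of tokenize(text)) {
    for (let start = 0; start + minLength <= token.length; start++) {
      const key = token.slice(start);
      if (keys.indexOf(key) === -1) {
        keys.push(key);
      }
    }
  }
  return keys.sort();
}

// Bounds for a "key starts with query" range.  In a worker or page these
// would be passed to IDBKeyRange.bound(lower, upper).
function prefixBounds(query: string): { lower: string; upper: string } {
  const lower = query.toLowerCase();
  return { lower, upper: lower + "\uffff" };
}

// In-memory stand-in for the index lookup, so the idea can be checked
// without a database.
function matches(text: string, query: string): boolean {
  const { lower, upper } = prefixBounds(query);
  return searchKeys(text).some((key) => key >= lower && key <= upper);
}
```

The cost is one index entry per suffix, which is why this only seems
reasonable for a data set as small as an address book.  A larger
minLength trades away very short queries for fewer entries.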
 
If people decide to go with using SQLite and mozStorage and all that, it
might be worth looking to integrate with gloda.  Although gloda's main
database is intentionally just an index for a variety of reasons, it's
got existing mozStorage ORM (Object-Relational Mapping) bits with
cascading queries and could also be used to store canonical data in
separate databases/etc.
 
Andrew

Links:

  1. https://github.com/mikeconley/thunderbird-ensemble/wiki