Invitation for technical discussion on next-generation Thunderbird (Semantic Desktop)

Paul D. Fernhout pdfernhout at kurtz-fernhout.com
Fri Apr 28 02:04:59 UTC 2017


On 2017-04-27 3:09 PM, Tito wrote:
>> So, for me, a good assumption is that people will eventually have
>> billions of message items comprising terabytes of data in their local
>> (shared-with-email) store. Eventually each user would be part of a
>> global federation of users that together have billions of times that
>> number of messages as a federated semantic web.
> 
> Paul,
> 
> so if we think about this from an implementation perspective, what
> database do you think should be used, e.g. embedded PostgreSQL or MySQL?
> I do not believe that IndexedDB can cope with that requirement. The only
> thing that will most likely do it is a hash table (Mork), but that is
> what we all (especially Joshua) want to kill. So I am kind of curious
> what database you are proposing to run in the background in order to
> support those "terabytes of data"?
> 
> Tito

Hi Tito,

As I said in this bug I filed on Firefox three years ago, in a followup
comment after it had sat around for a year with no interest:
"IndexedDB same-origin policy implementation for local files with query
string"
https://bugzilla.mozilla.org/show_bug.cgi?id=1005634
"In general, and given that this (in my opinion) bug has been sitting
around for so long (both on the user side and on the Mozilla side), it
seems to me this situation relates in part to changing cultural
expectations on the use of a web browser. For me, I increasingly see the
web browser with JavaScript as a new non-proprietary well-supported
cross-platform technology to deliver applications of all sorts for the
desktop, mobile, and embedded (a bit like the proprietary VisualWorks
Smalltalk could do in the 1980s way before Java). I can think that even
if at the same time I feel we should have better standards for
exchanging information in structured ways. To me, the app part of that
means a web browser should fully support running applications from local
files including all functionality -- but in a "sandbox" with
fine-grained security permissions (something any OS should ideally be
supporting from the ground up for all apps and subapps, but that's
another story). Full functionality could include support for
peer-to-peer web browser interactions without the need for a central
server (like WebRTC moves towards). However, I get the feeling most
people using web browsers (including likely many at Mozilla) still see a
web browser as something always connecting to servers which host web
pages. Even Mozilla's Webmaker movement focuses on using a server to
make content, not to edit local files. ..."

That is why in my original response to Ben's proposal, I suggested
seeing TB:NG as a standard web server (e.g. Node.js) which any web
browser could connect to. I feel we have to accept the cultural limits
of web browser makers and use browsers as most people use them (i.e.
connecting to a server). Any attempt to store substantial data in the
web browser (other than as a small cache for efficiency) will likely
soon lead to heartbreak.

So, basically any scalable datastore that can run on Mac, PC, and Linux
is a possibility. I've been a big PostgreSQL fan for over a decade. I
have my own experimental "Pointrel" ideas (intended these days to be
append-only for data and also to play nice with Apple Time Machine
backups) -- but sure, PostgreSQL, SQLite, CouchDB, Apache Accumulo
(designed by spooks in part for storing email), MongoDB, and many more
are all possibilities.

Ideally, the backend data store could even be pluggable, but whether we
could pull that off in practice is another question -- as pluggable
backends tend either to sink to the lowest common denominator or to
reinvent a lot of database technology inside their interface layer.
Still, ideally we could start with a pluggable approach (with say
SQLite, Postgres, and CouchDB backends) and see how far that gets us.
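
To make the pluggable idea a bit more concrete, here is a minimal sketch
in Python (chosen only for brevity; nothing here is tied to a language).
The names Backend, MemoryBackend, SQLiteBackend, and open_backend are
hypothetical, invented for illustration, and the interface is
deliberately tiny, which is exactly the "lowest common denominator" risk
mentioned above.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class Backend(ABC):
    """Hypothetical minimal interface each pluggable store would implement."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store value under key, replacing any previous value."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the stored bytes, or None if the key is unknown."""


class MemoryBackend(Backend):
    """Dictionary-backed store, mainly useful for tests."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._items.get(key)


class SQLiteBackend(Backend):
    """SQLite-backed store using a single key/value table."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items "
            "(key TEXT PRIMARY KEY, value BLOB NOT NULL)"
        )

    def put(self, key: str, value: bytes) -> None:
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key: str) -> Optional[bytes]:
        row = self._db.execute(
            "SELECT value FROM items WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else bytes(row[0])


def open_backend(name: str) -> Backend:
    """Pick a backend by name; a real system might read this from config."""
    backends = {"memory": MemoryBackend, "sqlite": SQLiteBackend}
    return backends[name]()
```

Calling code would only ever see put() and get(), so swapping
open_backend("memory") for open_backend("sqlite") changes nothing else.
Anything richer (queries, transactions across keys, replication) is
where the backends start to diverge and the interface layer starts to
grow.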

The basic needs are probably to store and fetch content by its hash and
to maintain (and rebuild) index files derived from examining that
content. Indexing is the most problematic part, especially if it has to
be pluggable.
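
As a rough illustration of those two needs, the following Python sketch
stores raw messages under their SHA-256 hash and keeps one derived index
that can be thrown away and rebuilt from the content. Everything here is
an assumption made for the example: the ContentStore class, the choice
of SHA-256, the in-memory dictionaries standing in for real files, and
indexing by the From header.

```python
import hashlib
from collections import defaultdict
from email import message_from_bytes


class ContentStore:
    """Hypothetical append-only store keyed by content hash."""

    def __init__(self) -> None:
        self._blobs = {}                    # hex digest -> raw message bytes
        self.by_sender = defaultdict(set)   # derived index: From -> digests

    def put(self, data: bytes) -> str:
        """Store data under its SHA-256 digest and return that digest."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:       # same content is stored only once
            self._blobs[digest] = data
            self._index(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch content by hash; raises KeyError if it is unknown."""
        return self._blobs[digest]

    def _index(self, digest: str, data: bytes) -> None:
        # Examine the content itself to derive the index entry.
        sender = message_from_bytes(data).get("From", "")
        self.by_sender[sender].add(digest)

    def rebuild_index(self) -> None:
        """Discard the index and regenerate it from the stored content."""
        self.by_sender = defaultdict(set)
        for digest, data in self._blobs.items():
            self._index(digest, data)


store = ContentStore()
raw = b"From: tito@example.org\r\nSubject: db\r\n\r\nWhich database?"
digest = store.put(raw)
assert store.get(digest) == raw
```

Because the index is purely derived, losing or corrupting it costs only
the time to call rebuild_index(). The hard part this sketch skips is
doing that incrementally and efficiently across billions of items, and
doing it the same way on every backend.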

--Paul Fernhout (pdfernhout.net)
"The biggest challenge of the 21st century is the irony of technologies
of abundance in the hands of those still thinking in terms of scarcity."

