Invitation for technical discussion on next-generation Thunderbird (Semantic Desktop: Pyramid of Users including for Privacy)
Paul Fernhout
pdfernhout at kurtz-fernhout.com
Sun Apr 30 17:01:40 UTC 2017
On 28.04.2017 09:49, Gervase Markham wrote:
> On 25/04/17 04:26, Paul D. Fernhout wrote:
>> So, for me, a good assumption is that people will eventually have
>> billions of message items comprising terabytes of data in their local
>> (shared-with-email) store.
>
> I don't think that's a good assumption at all, because the "delete"
> button exists. Who has access to their text messages from 10 years ago?
> I think I do, but then I'm a data migration geek. I doubt one person in
> 100 does. People don't want to manage this much data.
It would be helpful to clarify what you mean by "People". All people?
Almost all people? Most people? Some people? A few people?
As an example of why such clarity matters, one of the biggest mistakes
Mozilla made in the Firefox UI redesign was misunderstanding the pyramid
of users of any application. That mistake is part of the reason for
Firefox's falling market share, and it will likely result in massive
(negative) changes for Mozilla as search revenue contracts dry up.
Ideally Thunderbird would not repeat that same mistake.
Essentially, in very round numbers, there is a pyramid of users for any
major extensible application:
* <0.1% of an app's user base are the developers
* 1% of the user base are power users who customize the app for
themselves, make plugins, answer deeper questions, and strongly
advocate for it; they also test heavily through their own use, report
bugs, and may make some improvements to the core
* 9% of users are fairly savvy and also answer simpler questions and use
at most a few of the customization features
* 90% of users just use it out of the box because it is popular and well
maintained
One of Mozilla's major mistakes with Firefox (in order to chase Chrome's
user base) was to say, essentially, we don't need to support the 1% of
users who want to customize it and heavily use it in new ways. Mozilla
decided to focus on the 99% of users because that is easier and
customization just causes confusion for some users and supporting all
the legacy customizations of the 1% would be hard (and Mobile and IoT
were the new shiny things money went into instead). And guess what --
that 1% of Firefox power users got really annoyed and started leaving
the platform (at least emotionally). And it turns out, there go all
the volunteers who had made Mozilla a success in the first place by
evangelizing Firefox and testing it and reporting bugs and creating an
ecosystem around it. For example, I used to install Firefox for family
and friends on desktops -- now I just tell people to buy a Chromebook.
TB:NG is facing similar design decisions right now. Is TB:NG for the
Thunderbird power users who care about maintaining Thunderbird, some of
whom have archives of millions of messages? If we decide not to support
power users, then why bother with doing more than reskinning Nylas Mail
2.0?
To make a point, this is my first time posting to this list from my ISP's
RoundCube webmail from my Chromebook instead of Thunderbird (although I
have used that webmail service on-and-off for years from laptops, BCCing
myself like now so Thunderbird POP on my desktop eventually gets a copy,
given Thunderbird's lack of a server component). RoundCube works for
email. It's a feasible solution for email for most people -- even if it
is not everything I want.
=== Deletion takes energy
Deletion is management! If you want to make things easy for users,
either delete everything automatically on a schedule as it comes in or
delete nothing ever. Deciding what to delete selectively is a very hard
problem involving a lot of human effort and anxiety. For what value, when
disk is so cheap?
That's probably one other reason you still have all those chat messages
-- it would take more effort and emotion to delete most of them than it
does to just keep them all and search them locally when you want.
And further, as Pat Helland wrote of append-only big data (and it applies
as well to local data, given that systems that delete stuff are more fragile):
"Accountants Don’t Use Erasers"
https://blogs.msdn.microsoft.com/pathelland/2007/06/14/accountants-dont-use-erasers/
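To illustrate the append-only idea, here is a minimal Python sketch of a message store where "deleting" only appends a correcting record, so nothing is ever erased. The class and method names (AppendOnlyStore, add, hide, visible, history) are hypothetical and invented for this example; this is not code from Helland's article or from any Thunderbird design.

```python
class AppendOnlyStore:
    """Toy message log: records are only ever appended, never erased."""

    def __init__(self):
        self._log = []  # list of (action, msg_id, body) tuples

    def add(self, msg_id, body):
        self._log.append(("add", msg_id, body))

    def hide(self, msg_id):
        # "Deleting" appends a correcting entry instead of erasing anything.
        self._log.append(("hide", msg_id, None))

    def visible(self):
        """Replay the log to compute the messages currently shown."""
        view = {}
        for action, msg_id, body in self._log:
            if action == "add":
                view[msg_id] = body
            else:
                view.pop(msg_id, None)
        return view

    def history(self):
        """Return a copy of every record ever written."""
        return list(self._log)


store = AppendOnlyStore()
store.add("m1", "Meeting notes")
store.add("m2", "Old newsletter")
store.hide("m2")
```

After these calls, store.visible() no longer shows "m2", but store.history() still holds all three records, so a wrong guess about what to hide can be undone later.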
I guess I'm naturally biased towards preservation rather than deletion.
I'm also a trustee for my historical society -- and even as we
keep up some old buildings, I'm concerned about the loss of our digital
history as well and the fact that almost no one seems to care about it.
Yet, the 1% who do care (often not the financial 1%) can make a big
difference for the community.
Theodore Sturgeon (who wrote the previously mentioned "The Skills of
Xanadu", which inspired Ted Nelson and hypertext and so on) also coined a law:
"ninety percent of everything is crap".
https://en.wikipedia.org/wiki/Sturgeon%27s_law
I won't disagree that, say, we could make the web itself a lot easier to
use (and more accurate) for most users by deleting 90% of it. :-) But
rather than censorship, we use search engines and social processes and
semantic tagging and other tools instead. By not deleting the 90% of the
web most of us don't need, we avoid deleting the 10% (or less) of
content that really matters if we had guessed wrong. TB:NG as a
comprehensive messaging platform should adopt the same strategy.
=== Video is an increasingly bigger part of messaging
On-demand movie apps and video conferencing apps are typically
implemented through processing streams of short messages. Any regular
user can reach millions of messages handled (or stored) fairly quickly
these days. In round numbers, a 1 GB (1,000,000 K) movie or video
conference stream divided by 100K per message for a video frame is
10,000 messages. Watch 100 movies a year or have 100 video conferences
per year for ten years, and that is 10,000,000 messages handled.
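The round numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the preceding paragraph; the constant names are made up for the example, and the 100K-per-frame message size is the same rough assumption used above.

```python
# Back-of-the-envelope check of the message counts above (round numbers only).
STREAM_SIZE_K = 1_000_000   # a 1 GB movie or conference stream, in K
MESSAGE_SIZE_K = 100        # assumed size of one video-frame message, in K
STREAMS_PER_YEAR = 100      # movies watched or conferences held per year
YEARS = 10

messages_per_stream = STREAM_SIZE_K // MESSAGE_SIZE_K
total_messages = messages_per_stream * STREAMS_PER_YEAR * YEARS

print(messages_per_stream)  # 10000
print(total_messages)       # 10000000
```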
Now you may say, but wait, TB:NG does not need to handle streaming video
messages. Well, that's a design choice. But why make that choice without
discussion -- especially given streaming video is an increasing part of
corporate communications and now a new feature of Slack? Why not think
about how text messaging (including annotations, summaries, chat
messages, and of course emails) interacts with video?
=== Privacy value in local stores to keep search local
And of course, why shouldn't people be able to subscribe to thousands of
busy mailing lists for years, just in case they want to search them
later? Why should we assume key information will be on the web rather
than in discussions?
My secret weapon as a software developer years ago was using DejaNews to
look up technical information from Usenet. Now I use a search engine and
look at discussions in web forums, blogs, or issue trackers. But why
should I not want to have a local copy of all Bugzilla issues for all
apps I use, continuously updated via the equivalent of an RSS feed? Why
should I want to be dependent on someone else's web servers for
searching for issues of interest?
From a privacy standpoint, it makes much more sense to subscribe to lots
of lists and also download their full archives locally and search
locally than it does to give up private (even strategic) information
about what I am thinking about at the moment by using a search engine.
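As a rough illustration of searching locally, here is a minimal Python sketch of an in-memory inverted index over downloaded messages. Everything in it is an assumption made for the example: the function names (tokenize, build_index, search) and the three sample messages are hypothetical, and a store with billions of messages would need an on-disk index rather than a dictionary. The point is only that the query never leaves the machine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(messages):
    """Map each word to the set of message ids that contain it."""
    index = defaultdict(set)
    for msg_id, body in messages.items():
        for word in tokenize(body):
            index[word].add(msg_id)
    return index


def search(index, query):
    """Return ids of messages containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data standing in for a downloaded list archive.
archive = {
    1: "Crash when compacting a large mbox folder",
    2: "Feature request: local full-text search of archives",
    3: "Search is slow on a large local archive",
}
index = build_index(archive)
hits = search(index, "large local")  # only message 3 has both words
```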
When I was at IBM Research around the turn of the century (among other
things, helping develop a forerunner to Siri called the IBM Personal
Speech Assistant, implementations of XML standards like XSL-FO, and a
digital video system that embedded HTML in videos), I always had to
think twice about whatever I typed into a search query as far as what
IBM Confidential information a search might disclose. Makes me realize a
potential market for a TB:NG handling billions of messages may actually
be big research companies who want to keep searches in-house.
While I can appreciate the limits of small companies in maintaining a
reliable IT infrastructure, I still don't get why many bigger companies
use Slack instead of hosting chat in-house with, say, Mattermost, or why
the smaller companies don't host Mattermost with third parties. I
outline some of the privacy risks of Slack here:
http://pdfernhout.net/reasons-not-to-use-slack-for-free-software-development.html#Slacks_privacy_policy_guarantees_very_little
Now that it looks like ISPs can sell user browsing data, thinking about
alternative ways to support privacy by just downloading everything and
searching locally seems worthwhile to me. But that means we need a
personal messaging system that scales.
https://www.eff.org/deeplinks/2017/03/five-creepy-things-your-isp-could-do-if-congress-repeals-fccs-privacy-protections
My family has a personal library of 1000s of books. One of the
surprising values in that library in a digital age is knowing that
corporations and government are not looking over my shoulder when I read
those books. Sadly I'm increasingly allergic to dust from books, so I
use a Kindle more and more -- and I don't like that situation. Why not
just download all of, say, Project Gutenberg and then choose what I read
privately on a great e-reader? And then why not download all discussions
about all those books and then privately follow the threads I am
interested in?
Also, I have been on mailing lists where email archives went away. Why
not get archives now, before, say, Google decides to paywall or delete
or even alter them?
=== Organizing all our messages is worthwhile
Of course, I am also all for writing great software that helps in
prioritizing what we look at, of what we have, if we have billions of
messages stored locally. But that is a different issue (and one involving
Library Science and "The Discipline of Organizing" and more).
http://disciplineoforganizing.org/
"This book changed my view of organizing as the dull, tedious task of
putting things in order into thinking of it as the marvelous study of
how people add structure to things ... Glushko is the master of the
discipline of organizing, painting the fascinating story of how
different organizational schemes change our behavior and our thoughts."
So, is organizing millions or even billions of messages a challenge?
Yes. But for the right people it is an interesting and worthwhile
challenge.
=== Most users don't care, but that does not matter
You'd be right to say 90% to 99% of users don't care about privacy.
Sigh. But as above, it is the 1% of users who do who could make TB:NG a
success by evangelizing it, testing it, fixing it, and so on.
--Paul Fernhout (pdfernhout.net)
"The biggest challenge of the 21st century is the irony of technologies
of abundance in the hands of those still thinking in terms of scarcity."