JS-ification and code organization

Joshua Cranmer 🐧 pidgeot18 at gmail.com
Thu Jan 21 20:18:51 UTC 2016


Hello all,
I'd like to take some time to discuss the goal of converting much or
most of our codebase to JS and the mechanisms by which we do this. I'm
raising this now because related discussions have cropped up a few
times recently, so I think it's best to have a wider discussion so
everyone can be on the same page.

To start with, you can divide a codebase like Thunderbird (or at least
the parts that matter most for maintenance) into basically 4 categories:
1. Frontend code
2. Backend code
3. Protocol libraries
3a. Extended standard library
4. Addon code

For this post, I'd like to focus on the protocol library section. By
protocol libraries I mean the code that knows how to speak the standard
protocols and formats and is more or less agnostic about the structure
of the larger application, and so could be reused among many different
applications; for example, NNTP or MIME parsing code would be a
protocol library, while our database code is backend code. Also
somewhat related to the protocol libraries are small libraries that add
functionality you might want in a standard library but that doesn't
exist there, like Task.jsm or sockets in JS. It's because these
libraries can be more-or-less platform-agnostic (by platform, I mean JS
execution environment, not Windows/Linux/etc.) that I'm specifically
calling them out separately for now.

Particularly in the latter half of last year, support for ES6 has
improved tremendously. SpiderMonkey basically implements all of it,
modulo some bugs and ES6 modules not being usable yet [1]. V8 is
slightly further behind (most notably for me, it is still missing array
destructuring and default function parameters, and some features are
presently strict-mode only). Support tools like eslint or jsdoc3 now
support all of ES6 syntax [2], a big change from when I last played
with them a year ago.
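
To make those two V8 gaps concrete, here's a tiny illustrative sketch
(not code from any of our libraries) using default parameters and array
destructuring:

    // Default function parameters and array destructuring, the two ES6
    // features I most miss in V8 right now.
    function parseHeader(line, delimiter = ":") {
      let [name, ...rest] = line.split(delimiter);
      return { name: name.trim(), value: rest.join(delimiter).trim() };
    }

    parseHeader("Subject: Hello"); // { name: "Subject", value: "Hello" }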

At present, Thunderbird already uses two JS protocol libraries: JSMime
and ical.js. I've also prepared a SASL library, and I've been poking
away at an NNTP library that I want to start testing (to eventually
replace nsNNTPProtocol) sometime this year; I've also looked a little
at the whiteout.io (now emailjs) libraries.

After playing with several of these libraries and trying out
development on them, I've found that a workflow built on npm/Node-based
tooling supports rapid development even better than the xpcshell loop
of development in comm-central. The standard testing environment for JS
these days seems to be Mocha [3], and many of the libraries use
CommonJS-style assert for their tests. With this standard environment,
it's easy to run tests from the command line and to collect code
coverage (istanbul is the tool here). It's also fairly easy to run the
tests in a web browser (although getting the larger email tests of
JSMime working required hosting a local webserver [4]), which makes it
quite easy to use Firefox's builtin JS debugger when needed. I will
also point out that, among the different kinds of build systems, the
relevant packages almost all seem to be using Grunt, although there is
little else about their build systems that's particularly standardized.
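
To give a flavor of what such a test looks like, here is a minimal
sketch in Mocha's tdd UI (see [3]) with CommonJS-style assert; the
module path and function name are made up for illustration:

    "use strict";
    var assert = require("assert");
    var headeremitter = require("../lib/headeremitter"); // hypothetical module

    suite("Structured header emission", function () {
      test("emits a simple address header", function () {
        assert.equal(
          headeremitter.emitAddress({ name: "Joshua", email: "jc@example.com" }),
          "Joshua <jc@example.com>");
      });
    });

Running "mocha --ui tdd" (or a grunt task wrapping it) from the command
line is then all it takes.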

Now, the way I'm describing this does presume that we should make it a
goal to keep the protocol libraries usable independently of (most of)
the rest of the Thunderbird source code. The main drawback to achieving
this goal is that there is substantial divergence in some rather
critical APIs for these libraries, with crypto, base64, charset
support, and sockets (and streaming in general) being the major pain
points I've already hit. There are already libraries that polyfill
TextEncoder/TextDecoder and the mozTCPSocket API for other execution
environments, courtesy of the Whiteout people, and I personally hacked
together a WebCrypto polyfill for Node.js, so I don't think the problem
is insurmountable.
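
As a sketch of what that papering-over tends to look like in practice
(the polyfill package name here is just an example, assuming a
CommonJS-capable environment for the fallback):

    // Prefer the platform's own TextDecoder where it exists (Gecko, B2G);
    // otherwise fall back to a polyfill module.
    var TextDecoderImpl;
    if (typeof TextDecoder !== "undefined") {
      TextDecoderImpl = TextDecoder;
    } else {
      TextDecoderImpl = require("text-encoding").TextDecoder; // example package
    }

    var decoder = new TextDecoderImpl("utf-8");
    decoder.decode(new Uint8Array([0x48, 0x69])); // "Hi"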

Continuing with the assumption that we adopt that goal, the question of
how to integrate this stuff into the comm-central tree is quite
pertinent, and probably should be considered before we start adding
more libraries to the mix. Presently, JSMime and ical.js live outside
of comm-central and are imported by building their source code into a
single file and dropping that into comm-central. The obvious downside
to this approach is that it makes backporting from comm-central to
these upstream repositories quite difficult and annoying. One plausible
alternative is to use submodules, but my prior experience with build
systems and release engineering has left me uncomfortable with having
Thunderbird use more than one repository. Another alternative is to do
what is done for NSPR and NSS: effectively check the entire repository
into a subfolder and update it periodically [5]. The main difference
from the current situation would be the lack of the
build-everything-into-one-file step.

Other areas of integration offer opportunities and challenges. With new
code, and given the ease of running eslint on it (it's now possible to
do so even in mozilla-central's build system!), we could actually make
rigorous enforcement of style guidelines possible, something I strongly
advocate, given that our current codebase uses a dozen different styles
(sometimes in the same file!) and the confusion that results from it.
It is also feasible to build automation that collects code coverage
results (see istanbul above) or even gates on them, and documentation
can be built automatically with jsdoc [6].
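
For new code, documentation along these lines is enough for jsdoc to
produce useful output (a minimal sketch; the function is invented):

    /**
     * Decode a single MIME header value into a Unicode string.
     *
     * @param {String} header             The raw header value, possibly
     *                                    RFC 2047-encoded.
     * @param {String} [charset="utf-8"]  Charset to assume for raw octets.
     * @returns {String} The decoded header value.
     */
    function decodeHeader(header, charset = "utf-8") {
      // ... actual decoding elided ...
      return header;
    }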

Conformity is where challenges arise the most, unfortunately. The
question of what license to use for the new, smaller projects is
pertinent [7]. The choice of module format is rather more important on
a technical level: I've used AMD style for JSMime and UMD for later
projects (where I actually tried to get them working under Node.js,
which doesn't support AMD-style definitions). I'm sort of hoping that
ES6 modules will eventually save the day, but I still don't know when
that "eventually" is. Even the question of how to maintain these
projects as individual repos is difficult, since GitHub's UI is
actually quite hostile to the patch-based development we've used for
Thunderbird in the past (want to update a patch? Now all the old review
comments are replaced with "XYZ commented on an outdated diff"!).
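
For reference, the UMD boilerplate I'm referring to looks roughly like
this (a sketch of the general pattern, not a copy of any particular
file; the export name is made up):

    // Universal Module Definition: works as an AMD module, a CommonJS
    // module under Node.js, or a plain browser global.
    (function (root, factory) {
      if (typeof define === "function" && define.amd) {
        define([], factory);             // AMD (e.g. RequireJS)
      } else if (typeof module === "object" && module.exports) {
        module.exports = factory();      // CommonJS / Node.js
      } else {
        root.mylibrary = factory();      // browser global
      }
    }(this, function () {
      return {
        // ... actual library code goes here ...
      };
    }));

It works everywhere, but it's exactly the kind of boilerplate I'd like
ES6 modules to make unnecessary.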

The biggest challenge, of course, is the sheer lack of a useful
standard library in JS. Things like PromiseUtils.jsm or Task.jsm are
small utilities for which making a separate library feels like
overkill, yet it's still annoying to have to rewrite them for every
single library. You also have issues with things like crypto support,
FFI, or sockets having completely different APIs on each of the
different platforms (Node.js, Firefox OS-esque, Chrome WebApps, and
xpcshell-based environments). It's also manifestly unhelpful that the
web standardization groups take a dim view of legacy support, so
implementations lack things we need, like UTF-7 support for text
decoding or MD4, MD5, and DES support in WebCrypto (needed for the
CRAM-MD5 and NTLM mechanisms). Another challenge was getting the JSMime
tests to load their data files, since the filesystem is a concept that
doesn't even exist in a web browser (and XHR is unusable from a file://
context). However, I will point out that it's generally fairly easy to
get polyfills for the necessary APIs (it's just a matter of putting
sufficient time into it), and, as an added bonus, we gain the most
flexibility to move to whatever underlying platform we desire.
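
To illustrate the Task.jsm point above, here is roughly what such a
task runner boils down to on top of ES6 generators and promises (a
sketch of the idea, not the actual Task.jsm code):

    // Drive a generator that yields promises: resolve each yielded
    // promise and feed the result back in until the generator returns.
    function spawn(generatorFunc) {
      return new Promise(function (resolve, reject) {
        var gen = generatorFunc();
        function step(method, value) {
          var result;
          try {
            result = gen[method](value);
          } catch (e) {
            reject(e);
            return;
          }
          if (result.done) {
            resolve(result.value);
          } else {
            Promise.resolve(result.value).then(
              function (v) { step("next", v); },
              function (e) { step("throw", e); });
          }
        }
        step("next", undefined);
      });
    }

It's only a couple dozen lines, so a separate package feels silly, yet
every standalone library ends up carrying a copy of something like it.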



Thoughts/comments/questions/concerns/flames/trolls/kudos/theses/ramblings/dissertations?


[1] From what I can tell, modules are actually implemented. However,
there is currently no way to load a module unless you make manual
SpiderMonkey API calls or restrict yourself to the JS shell (not
xpcshell).
[2] Indeed, jsdoc is now much saner about UMD-style module definitions
when you use classes, which makes me hopeful that moving to ES6 modules
instead of the current boilerplate-laden solution will require less
aggressive hints to produce usable documentation.
[3] Specifically, I (and apparently ical.js) prefer the tdd UI for
this, which isn't the default. The only real difference is that the tdd
UI uses suite/test for its function names, while the default uses
describe/it.
[4] The only reason for this is that if you have a file:// URL, you 
basically can't XHR to anything.
[5] Incidentally, this is what client.py in mozilla-central is used for.
[6] Although something like erroring out if things are undocumented 
doesn't appear to be easily possible, given how stupid jsdoc is when it 
comes to UMD-style modules.
[7] I've ended up using MIT for these libraries, largely because it's 
smaller in boilerplate, I think it's a slightly less fearful license for 
third-party users, it's what the emailjs libraries used, and I'm not a 
strong believer in the weak copyleft that MPL is. I'll point out that 
ICAL.js is using MPL 2 and Gaia (including, presumably, Gaia Email) uses 
Apache.

-- 
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist


