<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 2/14/2018 5:56 AM, Henri Sivonen
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAJQvAue6+ZaLRwwPVm5dgf90owYmRh3ptPKohdMsWQN8_M4dyg@mail.gmail.com">To
elaborate on this a bit more to make the case a bit better than a<br>
run-of-the-mill "Rewrite it in Rust" post from the Rust Evangelism<br>
Strike Force:</blockquote>
<p>As one of the very few people with experience with both Rust and
the implementation of a mail client, let me give my thoughts:</p>
<p>One of the challenges of building an email client is the plethora
of protocols to support: IMAP, POP, MIME, SMTP, NNTP, LDAP, JMAP,
CardDAV, RSS, just off the top of my head. Rather like how a web
browser must approach HTTP, an email client needs to be tightly
coupled to these protocols and cannot rely on high-level
interfaces to access them. Moreover, with the slight exception of
IMAP and the very-not-slight exception of LDAP, these protocols
are simple enough that a command-level binding isn't really worth
the risk of API skew. Thunderbird is going to need to own these
implementations, and I don't think it reasonable to expect that we
can find the necessary libraries already existing in sufficient
form (arguably, even as a base) on crates.io or npmjs.org or
whatever other package library exists.</p>
<p>In terms of ability to implement this sort of stuff cleanly, Rust
is one of the best contenders. Email exists in a very annoying
world where you have to deal with binary data as if it were
strings at times, without being able to shove it entirely through
a charset converter. Rust's standard library generally offers most of
the string functions on byte slices as well (a string being a
byte slice with the extra requirement of being valid UTF-8), and
good-quality crates likewise let you apply string functionality to
binary data (e.g., you can run regexes over bytes!). This is in very stark
contrast to JS, where you're tormented by the sharp distinction
between Unicode strings and binary arrays, made even worse by
several APIs representing binary strings as JS strings with
characters in the range \x00-\xff.</p>
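<p>To make the bytes-as-strings point concrete, here is a minimal
sketch using only the standard library. The helper names
(split_header, find_header) and the sample message are made up for
illustration; the point is only that splitting, trimming and
case-insensitive comparison all work directly on byte slices, even
when the data is not valid UTF-8.</p>

```rust
/// Split one raw header line into (name, value) without decoding it.
/// Returns None if the line contains no colon.
/// (Hypothetical helper, not taken from any existing crate.)
fn split_header(line: &[u8]) -> Option<(&[u8], &[u8])> {
    let colon = line.iter().position(|&b| b == b':')?;
    let name = &line[..colon];
    let mut value = &line[colon + 1..];
    // Trim leading spaces and tabs by hand.
    while let Some((&first, rest)) = value.split_first() {
        if first == b' ' || first == b'\t' {
            value = rest;
        } else {
            break;
        }
    }
    // Trim a trailing CR and/or LF.
    while let Some((&last, rest)) = value.split_last() {
        if last == b'\r' || last == b'\n' {
            value = rest;
        } else {
            break;
        }
    }
    Some((name, value))
}

/// Look up a header by name (ASCII case-insensitive) in a raw header block.
/// The returned slice borrows from `headers`; nothing is copied or decoded.
fn find_header<'a>(headers: &'a [u8], wanted: &str) -> Option<&'a [u8]> {
    headers
        .split(|&b| b == b'\n')
        .filter_map(split_header)
        .find(|(name, _)| name.eq_ignore_ascii_case(wanted.as_bytes()))
        .map(|(_, value)| value)
}
```

<p>Called on a header block whose Subject contains a raw Latin-1 byte
(0xE9), find_header still returns the value bytes untouched, and the
caller can decide later which charset to decode them with.</p>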
<p>For the language ergonomics itself: Rust doesn't have the
platform fragmentation issue that JS has, although its repertoire
of second-level libraries is weaker than I'd want to see. I can't
speak to the quality of its threading or async I/O support, since
I've not needed it in projects. The error handling support is
fairly nice (you essentially return a Result, which is either the
value you wanted to return or a specific error value), although
doing the final orchestration in main is surprisingly painful.
Binding to C is quite easy, both in the direction of calling C from
Rust and in calling Rust from C, but Rust's own ABI isn't stable, so
you have to explicitly export functions with the C ABI for them to be
callable from C. A tool for generating the necessary bindings for C++
exists (that's bindgen), and I know there was a patch floating around
at some point for doing the same for XPIDL, although I don't know if
anything has come of it.</p>
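<p>A minimal sketch of both points, Result-based error handling and
exporting a function to C. Every name here (StatusError,
parse_status, is_success, nntp_parse_status) is hypothetical and
chosen only for illustration, as is the convention of returning -1 to
C on failure.</p>

```rust
use std::os::raw::c_int;
use std::slice;

/// Hypothetical error type: the ways parsing a status line can fail.
#[derive(Debug, PartialEq)]
enum StatusError {
    TooShort,
    NotANumber,
}

/// Parse the three-digit code at the start of a response line.
fn parse_status(line: &[u8]) -> Result<u16, StatusError> {
    // `?` returns early with the error if the line has fewer than 3 bytes.
    let digits = line.get(..3).ok_or(StatusError::TooShort)?;
    let mut code = 0u16;
    for &b in digits {
        if !b.is_ascii_digit() {
            return Err(StatusError::NotANumber);
        }
        code = code * 10 + u16::from(b - b'0');
    }
    Ok(code)
}

/// Callers propagate errors with `?` instead of checking codes by hand.
/// Treats any 2xx code as success.
fn is_success(line: &[u8]) -> Result<bool, StatusError> {
    let code = parse_status(line)?;
    Ok(code >= 200 && code < 300)
}

/// Exported with the C ABI and an unmangled name so C or C++ can call it.
/// Returns the status code, or -1 on any error (an assumed convention).
/// Note: the Rust 2024 edition spells this attribute `#[unsafe(no_mangle)]`.
///
/// Safety: `data` must be null or point to `len` readable bytes.
#[no_mangle]
pub unsafe extern "C" fn nntp_parse_status(data: *const u8, len: usize) -> c_int {
    if data.is_null() {
        return -1;
    }
    let line = unsafe { slice::from_raw_parts(data, len) };
    match parse_status(line) {
        Ok(code) => c_int::from(code),
        Err(_) => -1,
    }
}
```

<p>On the C side, the matching declaration would be something like
"int nntp_parse_status(const uint8_t *data, size_t len);". Note that
the rich error type is flattened to an integer at the boundary, which
is the usual cost of crossing into C.</p>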
<p>The main difficulty in programming the language itself is that
it's very explicit about lifetimes and who's responsible for
cleaning up objects. It's not as simple as GC-based systems,
that's for sure. Building cyclic data structures is a major
challenge, and the utilities for building these semantics aren't
clearly documented in easy-to-follow examples. But building such
data structures usually isn't the main focus of the application.
Certainly, it's possible to make a lot of progress without having
to tackle the core data representation format (i.e., the
folder/server/database/message hierarchy). As painful as lifetimes
are, they do make explicit issues like "how long do we expect a
buffer passed into this method to live for?"<br>
</p>
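<p>As a rough illustration of the two points above, here is a sketch
of a parent/child hierarchy built from the standard library's Rc and
Weak (one common way to avoid a reference cycle, not necessarily the
right design for the real folder/server/message model), followed by a
function whose lifetime annotation spells out how long the returned
data may be used. The Folder type and first_line function are
hypothetical.</p>

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical stand-in for one node in a server/folder hierarchy.
struct Folder {
    name: String,
    /// Weak back-pointer: a child does not keep its parent alive.
    parent: RefCell<Weak<Folder>>,
    /// Strong pointers: a parent owns its children.
    children: RefCell<Vec<Rc<Folder>>>,
}

impl Folder {
    fn new(name: &str) -> Rc<Folder> {
        Rc::new(Folder {
            name: name.to_string(),
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(Vec::new()),
        })
    }

    /// Link `child` under `parent` in both directions.
    fn add_child(parent: &Rc<Folder>, child: Rc<Folder>) {
        *child.parent.borrow_mut() = Rc::downgrade(parent);
        parent.children.borrow_mut().push(child);
    }

    /// Build a path by walking the weak parent links upward.
    fn path(&self) -> String {
        let parent = self.parent.borrow().upgrade();
        match parent {
            Some(parent) => format!("{}/{}", parent.path(), self.name),
            None => self.name.clone(),
        }
    }
}

/// The signature states that the result borrows from `buffer`, so the
/// caller must keep `buffer` alive for as long as the result is in use.
fn first_line<'a>(buffer: &'a [u8]) -> &'a [u8] {
    match buffer.iter().position(|&b| b == b'\n') {
        Some(end) => &buffer[..end],
        None => buffer,
    }
}
```

<p>Because the back-pointer is weak, dropping the last strong
reference to a parent frees it even though its children still point
at it; a child that outlives its parent simply sees no parent
anymore.</p>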
<p>As a thought experiment, here is roughly the complexity of, say,
rewriting the NNTP connection stuff in Rust:</p>
<ul>
<li>Build an NNTP crate that's roughly equivalent in functionality
to nsNNTPProtocol. The interface to this crate would probably be
an NNTP struct that has several methods akin to nsINNTPService.</li>
<li>It would take some investigation to work out how to hook up the
various pieces of functionality. The network interfaces are generally at
the level of Read + Write traits, although we'd probably need
tokio's async I/O representation. There does appear to be a
connection pool crate in existence already (r2d2), but I have no
idea what its quality is.</li>
<li>Have a second crate that would interface between the Rust
traits and the Gecko code. In particular, this would handle
stuff like reading preferences, driving the network via netwerk
rather than the low-level I/O (important for SSL issues!), and
logging. Actually, if it's possible to implement XPIDL interfaces in Rust,
then this crate would probably provide the implementation of
nsNntpService. If that's not possible, then this would reexport
APIs as simple C calls that can be called by nsNntpService.<br>
</li>
<li>That would be about 6,000 lines of C++ code moved to Rust. The
internal code structure would change from a URL-centric state
machine to async I/O driven by individual function calls
(although we'd probably still have to rely on URLs since the
main way of listening for async events is nsIUrlListener at the
moment).</li>
<li>Implementation complexity is on the order of a GSoC project
for a weak student, two weeks or so for a knowledgeable
full-time developer. The trickiest bits are the integration, but
this work can be shared with future efforts.</li>
</ul>
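<p>The shape of the first crate in the list above could be sketched
roughly as follows. This is a blocking, standard-library-only sketch
written against the Read + Write traits; it is not the tokio-based
design mentioned above, and the Nntp and MockStream types and their
methods are hypothetical. MockStream is an in-memory stand-in for a
socket, the kind of thing a testbed outside the Mozilla environment
could use.</p>

```rust
use std::io::{self, BufRead, BufReader, Read, Write};

/// Hypothetical connection type, generic over any Read + Write stream.
struct Nntp<S: Read + Write> {
    stream: BufReader<S>,
}

impl<S: Read + Write> Nntp<S> {
    /// Wrap an already-connected stream and consume the server greeting.
    /// Returns the connection together with the greeting's status code.
    fn connect(stream: S) -> io::Result<(Nntp<S>, u16)> {
        let mut conn = Nntp { stream: BufReader::new(stream) };
        let (code, _) = conn.read_response()?;
        Ok((conn, code))
    }

    /// Send one command line and return (status code, rest of the line).
    fn command(&mut self, line: &str) -> io::Result<(u16, String)> {
        let out = self.stream.get_mut();
        out.write_all(line.as_bytes())?;
        out.write_all(b"\r\n")?;
        out.flush()?;
        self.read_response()
    }

    /// Read a single-line response such as "211 3 1 3 misc.test".
    fn read_response(&mut self) -> io::Result<(u16, String)> {
        let mut raw = Vec::new();
        self.stream.read_until(b'\n', &mut raw)?;
        let text = String::from_utf8_lossy(&raw);
        let text = text.trim_end();
        let code = text
            .get(..3)
            .filter(|d| d.bytes().all(|b| b.is_ascii_digit()))
            .and_then(|d| d.parse::<u16>().ok())
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "malformed status line"))?;
        let rest = text.get(3..).unwrap_or("").trim_start().to_string();
        Ok((code, rest))
    }
}

/// In-memory stand-in for a socket: reads come from a prepared script,
/// writes are recorded so a test can inspect what was sent.
struct MockStream {
    input: io::Cursor<Vec<u8>>,
    output: Vec<u8>,
}

impl Read for MockStream {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.input.read(buf)
    }
}

impl Write for MockStream {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.output.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

<p>Since Nntp only asks for Read + Write, the same code would accept a
TcpStream for a quick experiment or a Gecko-backed stream supplied by
the second, integration crate. Multi-line responses, authentication
and TLS are deliberately left out of this sketch.</p>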
<p>In a comparison of difficulty with JS, it's certainly easier to
do iterative development on the NNTP core in Rust--without
platform fragmentation, you could easily spin up a testbed outside
of the Mozilla environment and leave the integration focus until
after you've made the protocol fairly robust. With JS, you'd have
to tackle the integration first; in practice (from experience),
that means focusing on trying to match test results before trying
to spin it up in real-world dogfooding. This advantage does not fully
carry over to the other protocols, as they more or less need SSL
support these days (which is trickier in any environment), but I
will point out that JS lacks any standard representation of
necessary concepts, particularly streams (both in the async
iterator sense and in the networking sense). I suspect shaking out
bugs would take less time in Rust than in JS, as you don't have
stupid typos ending up as dynamic errors, and Rust tends to force
you to confront error handling from the beginning.<br>
</p>
<pre class="moz-signature" cols="72">--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist</pre>
</body>
</html>