[rust-dev] code generation and rustc speed

Graydon Hoare graydon at mozilla.com
Sat Jun 15 01:26:48 PDT 2013

I saw some more speculation on how to make rustc not-so-terribly-slow to 
compile today, including dramatic structural changes like splitting it 
into sub-crates. Please don't do this; it's papering over much more 
readily solvable problems. An 80kloc library does not need to be 
generating 16mb of object code.

Here are two simple listings of symbols we're generating:


Some things should be immediately apparent:

   - We're generating some (not all) _very_ large functions.
     Look at ty::mk_ctxt. It's a single constructor call.
     It produces 133kb of code. This has got to be some quadratic
     codegen bug.

   - There are 15762 functions but only 2039 unique function-sizes.
     We're not collapsing a _lot_ of identical functions.

     - We're generating a _lot_ of monomorphic instances.
       There are 363 hashtable:: functions in there -- about 50 copies
       of the std::hashtable module. Similar number of copies of
       std::vec. 68 copies of extfmt::conv_poly. 392 visit:: functions.

     - We're still generating a _lot_ of glue. Very little of it
       seems to be properly recycled -- there are many copies of
       each glue for types that vary only in a not-used type param,
       of identical size. 7134 of the 15762 functions are glue.
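
Counts like the ones above can be derived from a symbol listing with a few
lines of code. What follows is only a minimal sketch, not the method used to
produce the numbers in this post. It assumes a three-column `size type name`
input with hexadecimal sizes (similar to what a tool such as `nm --size-sort`
prints), and it treats any symbol whose name contains "glue" as glue. The
`Summary` struct, the `summarize` function and the sample symbol names are
all hypothetical.

```rust
use std::collections::HashSet;

/// Aggregate counts for one symbol listing (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct Summary {
    functions: usize,    // number of code (text) symbols seen
    unique_sizes: usize, // number of distinct symbol sizes among them
    glue: usize,         // code symbols whose name contains "glue"
}

/// Summarize a listing whose lines are assumed to look like
/// "<hex size> <type letter> <name>". Lines that do not fit are skipped.
fn summarize(listing: &str) -> Summary {
    let mut sizes: HashSet<u64> = HashSet::new();
    let mut functions = 0;
    let mut glue = 0;

    for line in listing.lines() {
        let mut cols = line.split_whitespace();
        let (size, kind, name) = match (cols.next(), cols.next(), cols.next()) {
            (Some(s), Some(k), Some(n)) => (s, k, n),
            _ => continue, // blank or malformed line
        };
        // Only count text-section symbols, i.e. functions.
        if kind != "t" && kind != "T" {
            continue;
        }
        let size = match u64::from_str_radix(size, 16) {
            Ok(v) => v,
            Err(_) => continue, // size column is not hex
        };
        functions += 1;
        sizes.insert(size);
        // Assumed heuristic: glue symbols carry "glue" in their name.
        if name.contains("glue") {
            glue += 1;
        }
    }

    Summary { functions, unique_sizes: sizes.len(), glue }
}

fn main() {
    // Made-up sample data, not taken from a real rustc build.
    let listing = "\
0000000000000010 t glue_drop_1
0000000000000010 t glue_drop_2
0000000000000040 T hashtable_insert_a
0000000000000040 T hashtable_insert_b
0000000000000008 D some_data
";
    let s = summarize(listing);
    println!(
        "{} functions, {} unique sizes, {} glue",
        s.functions, s.unique_sizes, s.glue
    );
}
```

Two functions of equal size are not necessarily identical, so a large gap
between the function count and the unique-size count is only a hint that
duplicates exist; confirming it would take comparing the function bodies.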

Beyond all that, there are pretty apparent systemic taxes as soon as you 
start digging into the LLVM. Take a look at, say, 
http://people.mozilla.org/~graydon/graph.svg -- the CFG from 
middle::trans::inline::maybe_instantiate_inline for example. It's a very 
modest 100-line function with 3 or 4 nested matches. It produces 687 
CFG nodes, 903 edges, and 8068 LLVM instructions (and that's a _small_ 
one -- it only winds up as 7k of object code). There are a lot of things 
in there that shouldn't be generated.
