Code compilation alternative to Function()

fixplzsecrets at gmail.com fixplzsecrets at gmail.com
Sat Jan 11 16:34:12 PST 2014


I want to propose an idea and get feedback on the viability of this or a
related addition to the language.

ES should have a way to build functions at runtime - like the approach of  
building a string of code and using Function(code) - but using API calls  
to assemble the code. Special support for this from language  
implementations would allow programs that depend on dynamic evaluation of  
large blocks of code to execute with less latency - bypassing the cost of  
concatenating a large string and then requesting the runtime to parse it  
character by character as with the use of Function().

Here is my naive guess at what the API should look like:

     var func = CompiledFunction.create()
     var arg0 = func.arg(0)                         // handle for argument 0
     var val  = func.get(arg0)                      // read its value
     var val2 = func.op('*', val, func.literal(2))  // val * 2
     func.return(val2)
     var result = func.done()                       // the finished function

This builds "function(a){ return a * 2 }"
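
To make the intended semantics concrete, here is a rough userland stand-in
for the calls above. It is only a sketch under my own assumptions: the
object below is hypothetical, supports just the six calls used in the
example, and builds nested closures instead of talking to the engine, so it
shows the shape of the API and none of the performance benefit.

```javascript
// Hypothetical userland stand-in for the proposed API; not a real built-in.
// Every builder call returns a closure that computes a value from the
// argument list, so no source string is built and Function() is never used.
var CompiledFunction = {
  create: function () {
    var body = null // closure for the returned expression, set by return()
    return {
      // Handle for the n-th argument slot.
      arg: function (n) { return { slot: n } },
      // Read the current value of an argument slot.
      get: function (ref) {
        return function (args) { return args[ref.slot] }
      },
      // A constant value.
      literal: function (value) {
        return function () { return value }
      },
      // Binary operator applied to two previously built values.
      op: function (symbol, left, right) {
        switch (symbol) {
          case '*': return function (args) { return left(args) * right(args) }
          case '+': return function (args) { return left(args) + right(args) }
          case '-': return function (args) { return left(args) - right(args) }
          default: throw new Error('unsupported operator: ' + symbol)
        }
      },
      // Record the value the finished function should return.
      'return': function (value) { body = value },
      // Produce the callable function.
      done: function () {
        if (body === null) throw new Error('no return value was set')
        var run = body
        return function () { return run(arguments) }
      }
    }
  }
}

var func = CompiledFunction.create()
var arg0 = func.arg(0)
var val = func.get(arg0)
var val2 = func.op('*', val, func.literal(2))
func.return(val2)
var timesTwo = func.done()
console.log(timesTwo(21)) // 42
```

A real implementation would presumably emit the runtime's intermediate
representation at each call rather than allocate closures.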

As you can see, this API provides an LLVM-like language.

There would be more methods: "get", "set" for reading and writing  
variables and fields, "call" for function calls, "startIfElse" for writing  
if/else blocks, "startFor", "startWhile" for loops, "break", "continue",  
and others for every possible construct.

It is important that runtime implementations support these operations with  
low overhead. Ideally it should be a nearly direct means of writing to the  
intermediate representation format of the runtime.


Motivation:
There are a few types of programs that use generated JavaScript in
browsers:

- PEG.js
PEG.js is a library that facilitates parsing of custom text formats. It
works by converting a parser notation to JavaScript code that implements
the specified parsing rules.
The ability to use custom data formats is a piece of Unix philosophy that,
it seems, should be supported on the Web. As it is, PEG.js has to depend on
the Function() quirk, which may not work if a content security policy is
enabled.

- Opal http://opalrb.org/try/ &c
Opal aims to implement Ruby running in JavaScript environments. It is one
of several projects with similar aims for other languages, each working in
a different way.
If special support for this API were available, web applications depending
on tools like this could deliver their code and initialize with lower
latency.

- Asm.js
Asm.js is a special format of JavaScript suitable for representing native
executable code. However, one complaint about it is that it is not a
suitable lexical representation of the content it encodes. Asm.js programs
are larger than equivalent native executables, and parsing time is
proportional to this size because the content has to be scanned character
by character. Code size in present experiments can exceed 10 megabytes, and
parsing can be inefficient unless the runtime implementation includes a
specialized parser for this format. Because of this, I have seen criticism
suggesting that proponents of Asm.js should design a bytecode format for
this use case instead of sending blobs of JavaScript.

But defining an adequate future-proof bytecode format ahead of time is
difficult, and is almost a separate concern from the one Asm.js addresses.
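
On the content security policy point raised for PEG.js above: a library
that generates code today can at least detect whether string compilation is
allowed before relying on it. The sketch below assumes the usual browser
behaviour, where the Function constructor throws when the policy does not
allow 'unsafe-eval'; the function name is my own.

```javascript
// Returns true when the host permits building code from strings.
// Under a Content Security Policy without 'unsafe-eval', browsers throw
// from the Function constructor instead of compiling the string.
function canCompileFromString() {
  try {
    new Function('return 1')
    return true
  } catch (e) {
    return false
  }
}

// A generator could fall back to a slower interpreting path when blocked.
var strategy = canCompileFromString() ? 'generate source' : 'interpret'
console.log(strategy)
```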


What if independent parties could design their own program representation  
format that can be converted to executable form as needed?

A party delivering a web application can choose a format for their Asm.js  
code, and include a decoder for it in their application startup procedure.
If the runtime receiving this application supports the code creation API,  
the decoder can bypass the usual process of decompressing megabytes of
JavaScript code and then having to parse it.
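
As a toy version of such a decoder, the sketch below reads a made-up
postfix opcode stream and turns it into a callable function. The opcode
numbering and the helper names are hypothetical, and closures stand in for
the code creation API, which does not exist yet.

```javascript
// Hypothetical compact format: a flat array of opcodes in postfix order.
var OP_ARG = 0   // followed by an argument index
var OP_CONST = 1 // followed by a numeric constant
var OP_MUL = 2   // pops two operands, pushes their product
var OP_ADD = 3   // pops two operands, pushes their sum

function makeArg(index) { return function (args) { return args[index] } }
function makeConst(value) { return function () { return value } }
function makeBinary(opcode, left, right) {
  return opcode === OP_MUL
    ? function (args) { return left(args) * right(args) }
    : function (args) { return left(args) + right(args) }
}

// Decode the opcode stream in one pass. The stack holds closures for the
// expressions built so far; no JavaScript source text is produced.
function decode(codes) {
  var stack = []
  var i = 0
  while (i < codes.length) {
    var opcode = codes[i++]
    if (opcode === OP_ARG) {
      stack.push(makeArg(codes[i++]))
    } else if (opcode === OP_CONST) {
      stack.push(makeConst(codes[i++]))
    } else if (opcode === OP_MUL || opcode === OP_ADD) {
      var right = stack.pop()
      var left = stack.pop()
      stack.push(makeBinary(opcode, left, right))
    } else {
      throw new Error('unknown opcode: ' + opcode)
    }
  }
  if (stack.length !== 1) throw new Error('malformed program')
  var body = stack[0]
  return function () { return body(arguments) }
}

// Encodes "function(a){ return a * 2 }" as five numbers.
var blob = [OP_ARG, 0, OP_CONST, 2, OP_MUL]
var timesTwo = decode(blob)
console.log(timesTwo(21)) // 42
```

With the proposed API, the calls to makeArg, makeConst and makeBinary would
be replaced by calls on the function builder.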

To illustrate what I mean:
Eval example:

     load("./blob.js", function(input) {
       var ts = {}
       ts.start = Date.now()

       Function(input)()
       MyBlob.foo

       ts.end = Date.now()
       console.log("Eval time", ts.end - ts.start)
     })

Codegen example:

     load("./blob.js", function(input) {
       var ast = esprima.parse(input)

       var ts = {}
       ts.start = Date.now()

       Function(escodegen.generate(ast))()
       MyBlob.foo

       ts.end = Date.now()
       console.log("Codegen time", ts.end - ts.start)
     })

In the second example, the line "Function(escodegen.generate(ast))()" is  
less efficient and takes longer to run, even though we started with the  
AST loaded in memory, which is a form more suitable to being converted to  
an executable function.
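
For comparison, here is a sketch of going from an AST to a callable
function without regenerating source text. The node below is written by
hand in the shape esprima produces for "a * 2"; the compile function is my
own illustration, handles only three node types, and uses closures where
the proposed API would be used, so it says nothing about the achievable
speed.

```javascript
// ESTree-shaped node for the expression "a * 2", written by hand here
// so that the example does not depend on a parser library.
var ast = {
  type: 'BinaryExpression',
  operator: '*',
  left: { type: 'Identifier', name: 'a' },
  right: { type: 'Literal', value: 2 }
}

// Walk the tree once and return a closure that evaluates it.
// params lists the parameter names in order, e.g. ['a'].
function compile(node, params) {
  switch (node.type) {
    case 'Literal':
      return function () { return node.value }
    case 'Identifier':
      var slot = params.indexOf(node.name)
      if (slot < 0) throw new Error('unknown identifier: ' + node.name)
      return function (args) { return args[slot] }
    case 'BinaryExpression':
      var left = compile(node.left, params)
      var right = compile(node.right, params)
      if (node.operator === '*') {
        return function (args) { return left(args) * right(args) }
      }
      if (node.operator === '+') {
        return function (args) { return left(args) + right(args) }
      }
      throw new Error('unsupported operator: ' + node.operator)
    default:
      throw new Error('unsupported node type: ' + node.type)
  }
}

var run = compile(ast, ['a'])
var timesTwo = function (a) { return run([a]) }
console.log(timesTwo(21)) // 42
```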

What if we could make the second example execute faster than the first one?

I think this seemingly cosmetic capability would give web programs a  
degree of freedom that could end up solving some design problems.


More information about the es-discuss mailing list