Code compilation alternative to Function()

Sean Silva silvas at purdue.edu
Sun Jan 12 15:14:11 PST 2014


On Sun, Jan 12, 2014 at 3:30 PM, Gustavs Tēbergs <fixplzsecrets at gmail.com>wrote:

> On 12 January 2014 07:48, Sean Silva <silvas at purdue.edu> wrote:
> > Can you provide some measurements where concatenating a string and
> parsing it is a bottleneck?
>
> I would not call it a bottleneck because the concatenating in most use
> cases only happens once at startup, that's why I used the word
> "latency". If you want to deliver data or programs to a browser client
> in a custom format, generating Javascript can cause a delay of 100 -
> 500ms before your application can start operating.
>
> I could string up some demo that takes this long in practice and
> emulates an approach that authors might want to use in the future.
>
> Simple and acute: take PEG.js and the included Javascript parser and
> parse some Javascript library (pretending this library is written in a
> different language), or the CSS parser and a CSS framework.
>
> I am proposing this because it occurred to me that the addition of
> this API would remove much of the cost of something being foreign to
> the Javascript/JSON/HTML system. The same way we all want to remove
> the cost of web applications being foreign to the operating system.
>
> > LLVM-dev here. Other than this being an IR, I don't see any resemblance
> to LLVM at all.
>
> > This is nothing like LLVM at all (not SSA).
>
> It is in the same vicinity. I meant to point out that insertion should
> not be done with ASTs or stack opcodes like I've seen in other places.
>
> Note that "op" returns a token that represents the result of the
> operation that can be used in subsequent operations. "get" and "set"
> would load and store these value tokens from/to variables and
> properties. I hope you see what I mean when I say this part is like
> LLVM syntax (quite SSA).
>
> > I'm not convinced that decompressing megabytes of JavaScript and parsing
> it (all in native code) is going to be any slower than having to JIT a
> JavaScript code generation routine, build up some horribly cache-unfriendly
> "sea of GC-heap-allocated linked nodes in JS" AST structure, then convert
> that "sea of linked nodes in JS" to the JIT's internal program
> representation. Decompress+parse is already a highly optimized code path as
> it is on the critical path of virtually every page load.
>
> That would depend on what the input to the generation routine is. You
> are right that SpiderMonkey AST is not more compact than the textual
> representation.
>
>
I'm talking about the output of the generation routine and time time it
takes to execute the generation routine.


> With other AST types one node can correspond to complex Javascript node.
> For example the PEG expression "a / b" executes as something like:
>
>     tmp = rule_a()
>     if(ok) result = tmp
>     if(!ok) {
>       tmp = rule_b()
>       if(ok) result = tmp
>     }
>     return result
>
> The output from a parser generator can quickly become large. This is
> problematic if I want to deliver the parser definition to the client
> to be generated clientside. I estimate that concatenating and

evaluating 100KB of code takes 100ms in Chrome. Make that 2x if using
> a last-gen CPU and 2x if using Firefox. How long on an ARM device
> would that be?
>

I'm consistently getting < 20ms on the attached test case on my 5+ year old
Core 2 Duo in both Chrome and FF.



>
> If we're talking about megabytes of Javascript I assume we're talking
> about Asm.js. Then the input would be something specialized to
> representing Asm.js programs, probably an array of opcodes. Like I
> mentioned, a common criticism of Asm.js is that it needs multiple
> characters to represent fundamental operations like "(x+y|0)" and
> "[x>>2]".


You can already send an array of opcodes down the wire and turn them into
an asm.js string on the client side. You can also build the string to be
parsed with no heap allocations besides a single growing list of pointers
and the final joined string; the way to do this is to build a small state
machine which reads the array of opcodes and appends constant strings to
the list (strings are by-ref so it is just appending a pointer to a list);
this state machine itself could even be a small asm.js program.


> Our custom opcode format could represent these with a single
> byte - the native parser may be fast, but can it parse 5x as many
> chars spanning multiple AST nodes more efficiently than our opcode
> decoder?
>

You are not taking into account the cost of executing the generator program
(which is JS, needs to be JIT'd, etc.) and the memory of all the JS objects
it is creating through this API.

And yes, a well-written native parser can easily beat a JS one by 5x.
Especially in the scenario you mention where it is upon startup and the JS
generator program would need to be JIT'd while executing.


>
> Also, if I'm not mistaken, both SM and V8 build GC-allocated AST trees
> while parsing, only to later copy the AST to a flat IR.
>
>
> https://github.com/ricardoquesada/Spidermonkey/blob/master/js/src/frontend/Parser.cpp
> https://github.com/v8/v8/blob/master/src/parser.cc


I'm not familiar with these codebases, but from a cursory look it seems
like only JS literals are actually materialized on the JS heap, everything
else are native objects. V8/Spidermonkey devs, am I reading this right?


>
>
> That's why the API suggestion example I gave aims to provide an
> abstract write-only interface to a flat SSA register IR. Our Asm.js
> decoder could skip tree node allocation altogether (ideally, maybe).


> > Unless the program representation used internally by the JS engines is
> exactly the same as the standardized representation (it isn't, and I doubt
> JS engines would let that implementation detail be standardized), then they
> still need to do a format conversion, but this time walking a linked sea of
> GC-heap allocated JS objects instead of doing a linear scan of the program
> text.
>
> Indeed, I am unsure about the viability of the API fitting all
> engines, and whether those that have a SSA IR could write it directly
> through these API calls and have only a verification pass upon
> ".done()" being called, or if it should be buffered and copied to the
> real IR later.
>
> However, the intention of my proposed API is exactly that it can be
> written in one go with no intermediate linked parts.

The intermediate representation does not need to be standardized,
> because the implementation of the API calls would convert the
> parameters to what is suitable for the IR - though the IR does need to
> be similar.
>

Standardizing those API calls is equivalent to standardizing the IR.


>
> If anyone knows of engines that do not fit this flat IR model, I'd
> like to hear about it.
> Then again, because this API is easy to polyfill (depends only on
> Function()), any implementation could start with a good-enough version
> and make gradual performance improvements.
>

I think that it would be a great idea to create this polyfill first and
benchmark it to see where the performance cost is. If it turns out that
string manipulation and parsing is the bottleneck (and there doesn't appear
to be any way to optimize that part, such as with the technique I mentioned
above for asm.js), then I think that's strong evidence that something like
this might be worthwhile to have.

However, before you have performance numbers on this, it's hard to justify
exposing so many JS engine details, since as you point out it is literally
just an optimization.

-- Sean Silva
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20140112/6a1b3015/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench.js
Type: application/x-javascript
Size: 1191 bytes
Desc: not available
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20140112/6a1b3015/attachment-0001.js>


More information about the es-discuss mailing list