Regexp microbenchmarks

Edwin Smith edwsmith at adobe.com
Mon Jun 9 09:00:44 PDT 2008


RE->ForthIL means interpret then trace hot paths.  SSE ops could be
supported in ForthIL and therefore LIR.  Interesting to explore the
intrepret-then-trace idea for RE matchers.

RE->LIR means compile once, that's it.  Expanding LIR to support
full CFG's in SSA notation is significant, but potentially useful
if traces are more than just linear segments (trace trees and tree
folding means we have LIR for whole CFG regions).

Ed

> -----Original Message-----
> From: tamarin-devel-bounces at mozilla.org [mailto:tamarin-devel-
> bounces at mozilla.org] On Behalf Of David Mandelin
> Sent: Friday, May 30, 2008 4:55 PM
> To: Thomas Reilly
> Cc: tamarin-devel
> Subject: Re: Regexp microbenchmarks
> 
> Makes perfect sense. I figured the mention of LIR was just crossed
TLAs,
> but I'm still learning so I figured I'd better ask before I sent any
> interns down the wrong alley.
> 
> Dave
> 
> Thomas Reilly wrote:
> > I think I'm using the wrong terminology, the only thing that ever
should be
> writing raw LIR is
> > the tracer.   A RE implementation would produce machine forth just
the
> ForthWriter class does
> > for ABC.  Instead of a MethodWriter you'd have a REWriter.    Then
this
> machine written forth
> > would include calls to RE library routines that are hand written
forth and
> compiled into the player.
> >
> > That make more sense?
> >
> > -----Original Message-----
> > From: David Mandelin [mailto:dmandelin at mozilla.com]
> > Sent: Fri 5/30/2008 11:35 AM
> > To: Thomas Reilly; tamarin-devel
> > Subject: Re: Regexp microbenchmarks
> >
> > There's one thing about the RE compilation topic that keeps
confusing
> > me, so I have to ask about:
> >
> > Thomas Reilly wrote:
> >
> >> 5) Wouldn't it rock the kazbah to have the RE engine compile its
RE's to
> LIR and use TT's vm for RE execution?
> >>
> >>
> > When I first heard about the idea of compiling REs, I assumed it was
to
> > compile to Forth IL. I thought the compiled code would have the form
of
> > a backtracking search, so it would have control flow, and be
directly
> > expressible in Forth. And tracing this would tend to optimize for
hot
> > paths and all that good stuff.
> >
> > But a LIR trace is linear, so at first it wasn't even obvious to me
that
> > it could be done in LIR. But I guess it could be done in LIR by
> > generating several traces and patching them together? This sounds
like
> > more implementation work, but it also seems like it would make it
easier
> > to do RE-specific optimizations (like the SSE2) and also avoid path
> > explosion with the options and backtracks.
> >
> > So:
> >
> > - Do I have any idea what I'm talking about here?
> >
> > - Between IL and LIR, which do you think is better to start with and
why?
> >
> >
> > Dave
> >
> >
> 
> _______________________________________________
> Tamarin-devel mailing list
> Tamarin-devel at mozilla.org
> https://mail.mozilla.org/listinfo/tamarin-devel


More information about the Tamarin-devel mailing list