RegExp performance

Lars Hansen lhansen at adobe.com
Tue Jun 24 08:05:50 PDT 2008


I'm wondering if this is not just another nail in the coffin for PCRE in
Tamarin.  We've discussed previously whether it would not be better to
switch to the SpiderMonkey RE engine, in order to improve standards
compliance and performance - I have the impression the SM engine is
rather faster.  I don't know what the issues might be - presumably some
compatibility concerns with existing content for Flash?  Anything else,
apart from the work involved?

--lars

> -----Original Message-----
> From: Michael Daumling
> Sent: 24. juni 2008 16:59
> To: Lars Hansen; 'tamarin-devel at mozilla.org'
> Subject: RE: RegExp performance
> 
> Hi Lars,
> 
> Well spoken!
> 
> _pcre_exec uses the heap to allocate recursion information. There is a
> #define NO_RECURSE in config.h. When I comment out this #define, pcre
> apparently uses the stack, and no memory allocation takes place at
all.
> The change causes the ecma3/unicode tests to run significantly faster
-
> from 1:15 min to 56 secs.
> 
> What do the platform experts say? Can we compile pcre without
> NO_RECURSE on some platforms?
> 
> Michael
> 
> 
> -----Original Message-----
> From: Lars Hansen
> Sent: Tuesday, June 24, 2008 7:50 AM
> To: Michael Daumling; tamarin-devel at mozilla.org
> Subject: RE: RegExp performance
> 
> It would be interesting to know why that allocation is going on.  RE
> execution should need to allocate for suspended decision points and
for
> capture arrays, but I would have hoped the engine had optimized those
> significantly to avoid hitting the allocator all the time.  What do
the
> test cases look like, and why do they tickle such an allocation
volume?
> 
> --lars
> 
> > -----Original Message-----
> > From: tamarin-devel-bounces at mozilla.org [mailto:tamarin-devel-
> > bounces at mozilla.org] On Behalf Of Michael Daumling
> > Sent: 24. juni 2008 16:43
> > To: tamarin-devel at mozilla.org
> > Subject: RegExp performance
> >
> > Hi all,
> >
> > I have used GlowCode (yet another profiling tool that you can test
> for
> > 21 days) to examine RegExp performance in TT.
> >
> > The big surprise was that _pcre_exec does a *lot* of mallocs and
free
> > (plus a ton of setjmp/longjmp). The difference between TC and TT is
> > that TC has a global override of operators new and delete (in
> > AvmShell.cpp) that uses FixedMalloc::Alloc/Free, and TT uses malloc
> > and free via the default global new and delete operators.
> >
> > Switching the pcre allocators (in pcre_globals.cpp amd
> > RegExpClass.cpp) to using FixedMalloc showed a very nice
improvement,
> > but only in GlowCode. In the real world, the ecma3/Unicode tests
were
> slower!
> >
> > I am out of knowledge here. Is it possible that
> > FixedMalloc::Alloc/Free is slower in TT than in TC? Should I have
> > tried a different (fast) replacement for malloc/free? Was the usage
> of
> > malloc/free intentional at all?
> >
> > BTW: String handling (UTF-8 conversion etc) is negligible in that
> > context. The big hog is malloc/free from within _pcre_exec (about
60%
> > of total time).
> >
> > Michael
> >
> > _______________________________________________
> > Tamarin-devel mailing list
> > Tamarin-devel at mozilla.org
> > https://mail.mozilla.org/listinfo/tamarin-devel


More information about the Tamarin-devel mailing list