RegExp performance

Thomas Reilly treilly at adobe.com
Wed Jun 25 06:43:25 PDT 2008


That's right we went from stack to heap to prevent a crash (although we still crash when heap runs out).

+1 on looking at using SM's RE impl.

-----Original Message-----
From: tamarin-devel-bounces at mozilla.org on behalf of Steven Johnson
Sent: Tue 6/24/2008 9:18 AM
To: Lars Hansen; Michael Daumling; tamarin-devel at mozilla.org
Subject: Re: RegExp performance
 
Re: the heap vs stack, I have a vague memory of Flash avoiding the stack
usage due to pathological cases being able to use too much stack space...
but I'd have to check the bugbase to be sure.

Re: PCRE, I don't think anyone is married to it. Flash will definitely have
a backwards-compatibility requirement and so would prefer something that can
be compatible, but otherwise we're definitely open for changes. Now all we
need is a volunteer :-)


On 6/24/08 8:05 AM, "Lars Hansen" <lhansen at adobe.com> wrote:

> I'm wondering if this is not just another nail in the coffin for PCRE in
> Tamarin.  We've discussed previously whether it would not be better to
> switch to the SpiderMonkey RE engine, in order to improve standards
> compliance and performance - I have the impression the SM engine is
> rather faster.  I don't know what the issues might be - presumably some
> compatibility concerns with existing content for Flash?  Anything else,
> apart from the work involved?
> 
> --lars
> 
>> -----Original Message-----
>> From: Michael Daumling
>> Sent: 24. juni 2008 16:59
>> To: Lars Hansen; 'tamarin-devel at mozilla.org'
>> Subject: RE: RegExp performance
>> 
>> Hi Lars,
>> 
>> Well spoken!
>> 
>> _pcre_exec uses the heap to allocate recursion information. There is a
>> #define NO_RECURSE in config.h. When I comment out this #define, pcre
>> apparently uses the stack, and no memory allocation takes place at
> all.
>> The change causes the ecma3/unicode tests to run significantly faster
> -
>> from 1:15 min to 56 secs.
>> 
>> What do the platform experts say? Can we compile pcre without
>> NO_RECURSE on some platforms?
>> 
>> Michael
>> 
>> 
>> -----Original Message-----
>> From: Lars Hansen
>> Sent: Tuesday, June 24, 2008 7:50 AM
>> To: Michael Daumling; tamarin-devel at mozilla.org
>> Subject: RE: RegExp performance
>> 
>> It would be interesting to know why that allocation is going on.  RE
>> execution should need to allocate for suspended decision points and
> for
>> capture arrays, but I would have hoped the engine had optimized those
>> significantly to avoid hitting the allocator all the time.  What do
> the
>> test cases look like, and why do they tickle such an allocation
> volume?
>> 
>> --lars
>> 
>>> -----Original Message-----
>>> From: tamarin-devel-bounces at mozilla.org [mailto:tamarin-devel-
>>> bounces at mozilla.org] On Behalf Of Michael Daumling
>>> Sent: 24. juni 2008 16:43
>>> To: tamarin-devel at mozilla.org
>>> Subject: RegExp performance
>>> 
>>> Hi all,
>>> 
>>> I have used GlowCode (yet another profiling tool that you can test
>> for
>>> 21 days) to examine RegExp performance in TT.
>>> 
>>> The big surprise was that _pcre_exec does a *lot* of mallocs and
> free
>>> (plus a ton of setjmp/longjmp). The difference between TC and TT is
>>> that TC has a global override of operators new and delete (in
>>> AvmShell.cpp) that uses FixedMalloc::Alloc/Free, and TT uses malloc
>>> and free via the default global new and delete operators.
>>> 
>>> Switching the pcre allocators (in pcre_globals.cpp amd
>>> RegExpClass.cpp) to using FixedMalloc showed a very nice
> improvement,
>>> but only in GlowCode. In the real world, the ecma3/Unicode tests
> were
>> slower!
>>> 
>>> I am out of knowledge here. Is it possible that
>>> FixedMalloc::Alloc/Free is slower in TT than in TC? Should I have
>>> tried a different (fast) replacement for malloc/free? Was the usage
>> of
>>> malloc/free intentional at all?
>>> 
>>> BTW: String handling (UTF-8 conversion etc) is negligible in that
>>> context. The big hog is malloc/free from within _pcre_exec (about
> 60%
>>> of total time).
>>> 
>>> Michael
>>> 
>>> _______________________________________________
>>> Tamarin-devel mailing list
>>> Tamarin-devel at mozilla.org
>>> https://mail.mozilla.org/listinfo/tamarin-devel
> _______________________________________________
> Tamarin-devel mailing list
> Tamarin-devel at mozilla.org
> https://mail.mozilla.org/listinfo/tamarin-devel

_______________________________________________
Tamarin-devel mailing list
Tamarin-devel at mozilla.org
https://mail.mozilla.org/listinfo/tamarin-devel



More information about the Tamarin-devel mailing list