treilly at adobe.com
Wed Jan 2 21:59:17 PST 2008
I've only read the paper so far, but it looks like a great allocator; I'll have to dig up the code next. jemalloc's size class variability probably makes it larger than what we're looking for. Do we really need 2 and 4 byte size classes? Also, TLSF handles small and large allocations the same way, so there's no tree code to worry about. That said, jemalloc's use of bitmaps probably makes it nicely suited for adaptation as an automatic memory management library. If I do my job right and make the VM support both MMgc and my new TLSF-based library, there's no reason someone couldn't plug in a jemalloc-based one (obviously there would be quite a bit of additional work). If our "minimal" allocation library performs poorly on the desktop, there's no reason we couldn't use different allocators for different targets. jemalloc's arena/thread association concept maps directly to a concept in our new library. Our new library also uses bitmaps and 2MB virtual regions.
There are also trade-offs we're making that might not work well for other clients. For instance, right now we support 8-byte to 16KB allocations using the same approach, and either don't support larger allocations (think embedded device) or bail out to a direct-to-OS solution (i.e. mmap). The lot abstraction (which is allocator independent) obviously plays a large role here. We also have 8 bytes of internal fragmentation per allocation, which may be a problem. I think that can be brought down to 4 bytes if we ditch DRC, but we're not too worried about it because we have also taken steps in the VM to avoid lots of little allocations.
I'm wondering what exactly those pretty pictures are showing. Do they capture both internal and external fragmentation?
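For readers unfamiliar with TLSF, here is a minimal sketch of how a two-level mapping can cover the 8-byte to 16KB range mentioned above with one uniform rule. It is not the code from the prototype: the constants (8-byte granularity, 8 second-level classes per power of two) and the names tlsf_index and tlsf_map are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative parameters (assumptions, not the prototype's values). */
enum {
    ALIGN_LOG2  = 3,                             /* 8-byte granularity           */
    SL_LOG2     = 3,                             /* 8 second-level classes       */
    SL_COUNT    = 1 << SL_LOG2,
    SMALL_LIMIT = 1 << (ALIGN_LOG2 + SL_LOG2),   /* 64: below this, linear bins  */
    MAX_LOG2    = 14,                            /* 16KB upper bound             */
    MAX_SIZE    = 1 << MAX_LOG2,
    FL_COUNT    = MAX_LOG2 - (ALIGN_LOG2 + SL_LOG2) + 2   /* 10 first-level rows */
};

/* Hypothetical pair of indices selecting one free list. */
typedef struct {
    int fl;   /* first level: which power-of-two range  */
    int sl;   /* second level: linear slice of the range */
} tlsf_index;

/* Position of the most significant set bit (n must be nonzero). */
static int floor_log2(size_t n)
{
    int r = 0;
    while (n >>= 1)
        r++;
    return r;
}

/* Map a request size to (fl, sl). Returns 0 if the size is outside the
 * supported range, in which case a caller would reject the request or
 * fall back to the OS. */
static int tlsf_map(size_t size, tlsf_index *out)
{
    assert(out != NULL);
    if (size < 8 || size > (size_t)MAX_SIZE)
        return 0;

    if (size < (size_t)SMALL_LIMIT) {
        /* Small sizes share first-level row 0, one bin per 8 bytes. */
        out->fl = 0;
        out->sl = (int)(size >> ALIGN_LOG2);
    } else {
        /* Larger sizes: the top bit picks the row, the next SL_LOG2 bits
         * pick the slice within that row. */
        int msb = floor_log2(size);
        out->fl = msb - (ALIGN_LOG2 + SL_LOG2) + 1;
        out->sl = (int)((size >> (msb - SL_LOG2)) & (SL_COUNT - 1));
    }
    return 1;
}
```

With these assumed parameters, a 100-byte request maps to (fl 1, sl 4) and a 16KB request maps to (fl 9, sl 0), both through the same two branches; there is no separate structure for medium or large sizes. A real allocator would also round the request up before searching so that any block on the chosen list is large enough, and would keep one bit per list in a bitmap to find a non-empty list quickly.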
From: tamarin-devel-bounces at mozilla.org on behalf of Thomas Reilly
Sent: Fri 12/21/2007 1:36 PM
To: Stuart Parmenter; Benjamin Smedberg
Cc: tamarin-devel at mozilla.org
Subject: RE: space management
I'll do some research on jemalloc and respond with my thoughts.
From: Stuart Parmenter [mailto:pavlov at pavlov.net]
Sent: Fri 12/21/2007 10:02 AM
To: Benjamin Smedberg
Cc: Thomas Reilly; tamarin-devel at mozilla.org
Subject: Re: space management
Every time Benjamin said tcmalloc, read jemalloc. tcmalloc's numbers
didn't look all that great. We're in the process of switching all of
Mozilla over to jemalloc for Gecko 1.9 and I'd like to figure out how we
can continue to take full advantage of it in the Mozilla 2 world. I'm
somewhat concerned about splitting up our heaps and worsening our
fragmentation on each one.
Benjamin Smedberg wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> Thomas Reilly wrote:
>> Part of our tamarin-tracing effort has been to re-evaluate MMgc.
>> There are aspects of MMgc's design that we knew right off the bat would
>> make it non-ideal for embedded devices. We have a prototype in shop
>> to replace MMgc and I wanted to see what the folks who care about
>> space management think about it.
> I need to read this proposal more thoroughly before responding in detail,
> which will have to wait until January. ActionMonkey is in a state where we
> could provide real measurements of alternate allocators on a medium scale, but
> the XPCOMGC experiment is not ready for the really large-scale testing that
> we'll need to cover the "all of mozilla" case.
>> First, the problems with MMgc:
>> We don't think the 4kb block architecture is suitable for embedded
>> devices. The size class (40) X memory type (3) model leads to pretty
>> large base level heaps even for trivial applications. Running in a
>> fixed 128kb heap environment is never gonna happen with that model.
> Is this the stated goal? Mozilla's mobile targets are a 64MB device, which
> provides a heap significantly larger than 128k. I wonder if we will end up
> needing a different allocator backend.
>> Process wide GCHeap. We've witnessed in the player that if you have
>> two large applications when one goes away it leaves virtual memory
>> looking like swiss cheese. We want more compartmentalization of
>> address space.
> This is an interesting point: in Mozilla we have large object graphs that
> have caused significant fragmentation, but this is not something we can
> solve by having separate GC heaps: instead we will need to have solutions
> such as a low-fragmentation allocator and arenas that associate objects of
> similar lifetime together.
> Let me introduce Stuart (pav), who has been heavily involved in measuring
> Mozilla memory usage and has written up some blog posts on the topic:
>> MMgc spends most of its time in mark. We want to explore exact
>> marking techniques to speed up marking. Mostly this is b/c of being
>> conservative but we can do more to speed it up than just going exact
>> (separating gc/non-gc pointers in object layout, prefetching etc).
> Is this penalty because we have to ask the page map whether it's a GC page
> for every single pointer value? Or what is the profile for marking? It is
> probably possible to provide exact mark information for large swaths of
> mozilla objects, provided enough time, but it won't be simple and will
> involve inventing at least one intermediate language.
>> Finalizers and the gc callbacks (presweep/postsweep/prereap) have been
>> problematic. Ideally the memory manager won't have any re-entrancy.
>> presweep and finalizers are kinda redundant, we think we want to
>> remove finalization and force clients to make do with presweep.
> Doesn't this mean that clients would have to keep an out-of-band cache of
> objects that need finalization-like behavior? This seems drastic if there
> are alternate solutions.
>> So the prototype I've been working on is basically re-working MMgc to
>> address these issues. We started with a juicy center based on TLSF
>> and added the conservative/DRC-based collection around it. TLSF
>> works like traditional malloc implementations but it scales into
>> larger objects well with a uniform algorithm which makes it small (ie none
>> of the tree stuff dlmalloc has for medium size allocations).
> Pav has had significant success with tcmalloc. Is there a particular reason
> you chose rtmalloc?
>> There's no more managed vs. unmanaged distinction for write barriers,
>> there's only one write barrier. For tamarin's purposes this means
>> AvmCore is just like any other object. Unmanaged and managed objects
>> are handled the same and both are scanned (unless they ask to be
>> opaque). Unmanaged just means the gc will never free it.
> But if you have an "unmanaged" object, you aren't required to call
> writebarriers on it, correct?
>> The new library includes a "lot" abstraction to break up large
>> allocations. The lot theory is that large allocations lead to
> Do you have the code somewhere? I'm worried about this on a whole host of
> levels, and it's hard to imagine without seeing the actual API.
>> 1) accurate GC: the heap is easy; we need to figure out native function
>> frames, forth/Box stack and JIT stack. Plan is to dump our thoughts
>> on this to a bug if you want to participate.
> I'm sure we do, especially if accurate GC also implies a moving GC.
>> Sorry for the long post, hopefully this prototype will land in the
>> public soon as a tamarin-tracing branch, we'd like to air it out
>> publicly before committing it to TT.
> You know, you can create a personal repository on hg.mozilla.org using these
> instructions and post work-in-progress trees to it:
> - --BDS
> - --
> Benjamin Smedberg
> Platform Guru
> Mozilla Corporation
> benjamin at smedbergs.us
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.5 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> -----END PGP SIGNATURE-----