Observable GC

Steve Fink sphink at gmail.com
Thu Oct 26 15:44:05 UTC 2017

On 10/20/17 10:52 AM, Filip Pizlo wrote:
>> On Oct 20, 2017, at 10:29 AM, Mark Miller <erights at gmail.com 
>> <mailto:erights at gmail.com>> wrote:
>> There is a glaring inconsistency in the criteria we use to evaluate 
>> these issues. While we are understandably reluctant to admit more 
>> non-determinism into the platform via weakrefs, we have admitted an 
>> astonishingly greater degree of non-determinism into the platform via 
>> "Shared Array Buffers" (SAB), i.e., shared memory multithreading with 
>> data races.
>> The scenario we legitimately fear for weakrefs: A developer writes 
>> code that is not correct according to the weakref specification but 
>> happens to work on present implementations. Perhaps it relies on 
>> something being collected that is not guaranteed to be collected.
> Being collected when it shouldn’t have been?  Like a dangling 
> reference.  The game theory of security exploits forces 
> implementations to keep things alive longer, not shorter.
>> Perhaps it relies on something not being collected that is not 
>> guaranteed not to be collected. A later correct change to an 
>> implementation, or another correct implementation, causes that code 
>> to break. The game theory punishes the correct implementation rather 
>> than the incorrect code.
> Java had weak refs and multiple different implementations.  My claim, 
> as someone who implemented lots of weird GC algorithms in Java, is 
> that I don’t know of a case where different weak ref semantics breaks 
> something.  The only time that getting it wrong ever had an observably 
> bad effect is when you break weak refs or finalizers so badly that 
> they never get cleared or called, and then some resource has an 
> unbounded leak.  This usually results in not being able to run any 
> benchmarks that have weak references or finalizers, so you fix those 
> bugs pretty quickly.
> Here are the motivations:
> - Competitive perf motivates GC authors to try to free things as soon 
> as possible.  Smaller heaps mean more speed.  Some benchmarks won’t 
> run to completion if you aren’t aggressive enough.

I don't follow this. My GC optimization work usually pushes in the 
opposite direction -- scanning less, not more (but hopefully not 
*collecting* much less). We [spidermonkey] partition the heap in all 
kinds of ways so we don't have to collect the whole thing all the time. 
It's partitioned into processes, the processes have thread-local heaps, 
and the thread-local heaps are partitioned into 
independently-collectable zones specific to various purposes (in the web 
browser, they're for tabs, iframes, and some internal things.) It 
doesn't seem unlikely to have a weakref in a lightly-used zone pointing 
into a more active zone. So yes, we'd aggressively collect the pointee 
zone to keep the heap small, but scanning the pointer zones is a waste 
of time. And we're always looking for more ways to subdivide the heap, 
given that the overhead of GC is mostly determined by the amount of live 
stuff you have to scan.

Generational GC similarly partitions the heap, for the same reason. If 
nothing is surviving minor GCs, you won't bother doing any of the major 
GCs that would collect the weakref pointees. I have considered (and I 
believe other engines have implemented) having more generations, by 
splitting off very long-lived (or alternatively, observed to be 
read-only) portions of the tenured heap and not scanning those during 
most major GCs. (I haven't looked enough to decide whether the extra 
cost and complexity of the write barriers is worth it for us at this point.)

That said, we *could* treat a weakref as a cue to collect the source and 
destination zones together. Which would mean weakrefs would be something 
of a "go slow" bit, but it might help contain the damage.

> - The need to avoid dangling references forces us to keep alive at 
> least what we need to, and sometimes a bit more.
> I guess a program could rely on the weak references actually being 
> strong in some implementation.  I haven’t heard of Java programs ever 
> doing that.  It’s unlikely to happen because the major implementations 
> will try to clear weak refs as aggressively as they can to compete on 
> benchmarks.

GC-related competition on benchmarks gets really silly without anyone 
even intentionally gaming things. I remember making a minor improvement 
and seeing a benchmark score absolutely plummet. I tracked it down to 
the benchmark having a "warmup" phase to each subtest (which is 
important for eliminating variance that prevents detecting small 
changes, so it's not a wholly bad thing). The minor change shifted 
several GCs from the unmeasured warmup phase into the measured phase. At 
which point I realized that much of our parameter tuning, in particular, 
had been having the effect of hiding GCs under the covers, not 
necessarily speeding anything up. If a benchmark score started depending 
on the promptness of weakref cleanup, then you're right, we'll probably 
end up messing up our heap partitioning to satisfy whatever arbitrary 
object graph it's testing, at a cost to real-world performance.

We have multiple benchmarks and treat in-the-field telemetry as the gold 
standard, so the worst of the effects get caught eventually. The length 
of the feedback loop matters, though.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20171026/429dbd97/attachment.html>

More information about the es-discuss mailing list