Weak Reference proposal
Jonas Sicking
jonas at sicking.cc
Wed Feb 17 18:45:52 UTC 2016
Yeah, you are right. NodeIterators, and presumably Ranges, suffer from
the observer problem. I.e. they want to be notified about mutations to
the DOM, but only as long as the NodeIterator/Range stay alive.
My understanding is that this is one of the more common scenarios
where the need for weak references comes up: you want to register
something as an observer, but don't want the notification mechanism to
hold a strong reference to the observer.
Fortunately though, neither NodeIterators nor Ranges expose this in
their public API. I.e. there is no way to use them to detect when GC
happens.
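
To make the observer problem concrete, here is a minimal sketch of a
notification list that does not keep its observers alive. It assumes a
runtime that provides `WeakRef`, which was only standardized later (in
ES2021) and is not part of ES6. `WeakObserverList`, `notify`, and the
stand-in `iterator` object are hypothetical names for illustration, not
the API of jsdom or of any browser.

```javascript
// Hypothetical helper, not part of any DOM implementation.
// Assumes a runtime with WeakRef (ES2021 or later).
class WeakObserverList {
  constructor() {
    this.refs = new Set(); // holds WeakRef objects, never the observers themselves
  }

  // Register an observer without keeping it alive.
  add(observer) {
    this.refs.add(new WeakRef(observer));
  }

  // Call method `name` on every observer that is still alive and drop
  // entries whose target has already been collected.
  notify(name, ...args) {
    let notified = 0;
    for (const ref of this.refs) {
      const observer = ref.deref();
      if (observer === undefined) {
        this.refs.delete(ref); // target was garbage collected
      } else {
        observer[name](...args);
        notified++;
      }
    }
    return notified;
  }
}

// Usage: a stand-in for a NodeIterator that wants to hear about removals.
const observers = new WeakObserverList();
let iterator = {
  removed: [],
  preRemove(node) { this.removed.push(node); },
};
observers.add(iterator);
const notifiedCount = observers.notify("preRemove", "A");
```

As long as the list stays internal and only the live observers are
called, nothing about GC timing leaks out to script. Returning
`notified` to page code, on the other hand, would make collection
observable, so a real implementation would presumably keep that count
private.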
/ Jonas
On Wed, Feb 17, 2016 at 5:23 AM, Joris van der Wel
<joris at jorisvanderwel.com> wrote:
> Here is an example of using a NodeIterator:
>
>
> ```
> const jsdom = require("jsdom");
> const document = jsdom.jsdom(`<a></a><b></b><c></c>`);
>
> let it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> console.log(it.nextNode().nodeName); // A
> console.log(it.nextNode().nodeName); // B
> console.log(it.nextNode().nodeName); // C
> console.log(it.nextNode()); // null
>
> it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> // This remove operation updates the internal state of the NodeIterator
> document.body.removeChild(document.body.firstChild);
> console.log(it.nextNode().nodeName); // B
> console.log(it.nextNode().nodeName); // C
> console.log(it.nextNode()); // null
> it = null;
> ```
>
> In the case of NodeIterator, there are currently (read: in ES6) two
> spec (DOM whatwg) compliant implementations possible:
>
> 1. Keep a history of all changes a Document has gone through, forever.
> 2. Keep a list of all NodeIterators which have been created for a
> Document, forever.
>
> jsdom uses solution #2. This not only leaks memory, but remove
> operations also become slower as more and more NodeIterators are
> created (however, as Domenic described earlier, we limit this list to
> 10 entries by default).
>
> The conflict between the DOM spec and ES6 is that we can not detect if
> a NodeIterator is still in use by code outside of jsdom:
>
> ```
> it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> // ... wait an hour ...
> console.log(it.nextNode().nodeName); // A
> it = null; // and only now we can stop updating the NodeIterator state
> ```
>
> (There used to be a it.detach() method for this purpose, but this has
> been removed from the spec.)
>
> Being able to keep a list of NodeIterators weakly would be the only
> solution if we want to avoid leaking resources.
>
> Weak references might also be required for MutationObserver, although
> I've not yet looked at this feature extensively, so I could be wrong.
> Other features which you could implement using a weak reference (like
> in the live collections) could be implemented using ES6 Proxy instead.
>
> XMLHttpRequest, fetch, WebSocket, etc. would even require something
> similar to a phantom reference (like in Java) so that we can close the
> connection when the object is no longer strongly or weakly referenced.
>
> I would also really like to use weak references for more than just
> jsdom; there are some use cases where they can simplify my code.
>
> Gr. Joris
>
>
> On Wed, Feb 17, 2016 at 9:41 AM, Jonas Sicking <jonas at sicking.cc> wrote:
>>
>> On Tue, Feb 16, 2016 at 11:02 PM, Domenic Denicola <d at domenic.me> wrote:
>> >> For each NodeIterator object iterator whose root’s node document is node’s node document, run the NodeIterator pre-removing steps given node and iterator.
>> >
>> > Rephrased: every time you remove a Node from a document, you must go through all of the document's NodeIterators and run some cleanup steps (which have the effect of changing observable properties and behavior of the NodeIterator).
>>
>> Could you implement all of this using MutationObservers? I.e. have the
>> NodeIterators observe the relevant nodes using MutationObservers?
>>
>> The only case that I can think of where the DOM could use weak
>> references is for the getElementsByTagName(x) function. This function
>> will either return a new NodeList object, or an existing one. The
>> reason it sometimes returns an existing one is for performance
>> reasons. We saw a lot of code doing:
>>
>> var i;
>> for (i = 0; i < document.getElementsByTagName("div").length; i++) {
>>   var elem = document.getElementsByTagName("div")[i];
>>   doStuffWith(elem);
>> }
>>
>> This generated a ton of NodeList objects, which are expensive to
>> allocate. Hence browsers started caching these objects and returned an
>> existing object "sometimes".
>>
>> The gecko implementation of "sometimes" uses a hash map keyed on
>> tagname containing weak references to the returned NodeList. This is
>> observable by for example doing:
>>
>> document.getElementsByTagName("div").foopy = "foopy";
>> if (document.getElementsByTagName("div").foopy != "foopy") {
>>   // GC ran between the getElementsByTagName calls.
>> }
>>
>> However, this exact behavior is not defined by the spec. But I believe
>> that all major browsers do something similar for performance reasons.
>> (This API is as old as it is crummy. And it is no surprise that it is
>> poorly used).
>>
>> But it likely would be possible to write an implementation of
>> "sometimes" which doesn't use weak references, at the cost of higher
>> memory usage.
>>
>> / Jonas
>
>
>
>
> --
> github.com/Joris-van-der-Wel