Weak Graph

Jason Orendorff jason.orendorff at gmail.com
Wed Nov 11 16:05:22 UTC 2015


On Sun, Nov 8, 2015 at 4:45 AM, Jussi Kalliokoski
<jussi.kalliokoski at gmail.com> wrote:
> Perhaps graph as a concept is too wide for expressing this, but it surely
> is not a linked list either. It may look like one in some cases though,
> when there's only one lineage and no branching. However that is not
> the use case here as when application state gets transformed to a view
> representation it may have various transformations applied to it, such as
> sorting, mapping or filtering.

The last sentence here doesn't seem connected to the rest. A linear
list of versions can go through a data transformation pipeline of
sorts, maps, and filters just fine.

I don't see where lineage or branching comes into any of your
examples. I can see how it could, but I think I must be missing
something here.

> On Fri, 6 Nov 2015 at 17:31 Jason Orendorff <jason.orendorff at gmail.com>
> wrote:
>> Then it looks a lot like a stream, in the
>> functional reactive programming sense. Let the user (in this case, the
>> renderer) buffer the diffs as needed; it knows when to reset the list.
>> And no need for fancy data structures: it could just be an Array.
>
> That misses the point, to say the least. The idea of React is that you can
> represent the UI as a declarative function of a snapshot of the state, so
> the view doesn't have to care about async. With FRP you
> subscribe/unsubscribe to asynchronous streams which, not unlike the idea
> here, can be transformed just like normal data structures and forked (to
> fork an immutable data structure is to just pass the reference). The
> difference is that streams are an inherently async structure, [...]

FRP is not inherently async. As a programming model, I'd say it's
inherently about user code not having to care about time.

Anyway, my point here is not "you should drop what you're doing and do
FRP instead" but rather "if FRP can handle this without new GC
primitives, maybe you can too".

> while what I'm
> trying to do is not. The idea being not only that the easy case of insert
> without transforms is O(1), but almost every use case can be further better
> optimized by knowing the previous state of the data structure.

Right, the state is a left fold over a series of events. Like in FRP.

> You can implement this with streams, but that will just be an unnecessary
> abstraction level offering no simplification whatsoever while making the
> concept needlessly async.

I dunno, you sort of lost me I guess. Streams come to mind because of
what you're doing: taking a series of events, transforming them, then
applying them one by one, as they arrive, to a mutable data sink.

> Another significant difference between this and
> FRP is that streams require imperative subscribe / unsubscribe, which is
> basically just sophisticated reference counting, while having the same
> issues (user after free -> update after unmount, leaks).

This is true, and I think it's your strongest point. Furthermore in
support of your side here, even without WeakGraph, I think leaks in
FRP are more wasteful than they would be in your model, because in FRP
the "deltas" are pushed through the system until somebody turns them
off, rather than pulled on demand.

Three alternative models:

1.  In Elm, the data flow graph is static. The system manages
subscriptions. Drawback: the user can't mutate the graph procedurally.

2.  A system could allow the data flow graph to change, not
*procedurally* exactly, but when a diff is applied that changes the
UI. Since the system knows about diffs, it could then automatically
manage subscriptions based on what's actually in the UI. Subscribe on
insert, unsubscribe on delete. No manual subscription management for
the end user; within the framework, you can use reference counting
etc.

3.  Implement your system using a "WeakGraph"-like object that keeps
strong references to all revisions in a fixed-sized window of recent
history. Periodically clear old patches and revisions from this cache.
Instead of `getLineage(a, b)`, we have a method `diff(a, b)` that
typically does exactly the same thing, returning an array of patches.
But `diff` may find that the user is trying to diff two versions that
are not both in the history. Then it simply does a full diff of the
two states, effectively generating a fake lineage. Note that the
performance is not *amortized* constant time: `diff()` is *always*
fast except when diffing something quite old, which shouldn't happen.
And the memory usage of the differ is more predictable than
"WeakGraph". Drawback: fake lineages. But so what?

-j


More information about the es-discuss mailing list