A trilemma in functional reactive programming models
19 June, 2023
A few months ago I tried to implement a functional reactive programming library in Rust. It was a good experience, but ultimately I abandoned it due to limitations of Rust that prevented me from making the system as ergonomic as I would have liked. However, that's not what this post is about.
The FRP framework I am most familiar with is Reflex, and to start, I mostly tried to copy its semantics. Since this was Rust, I was determined to make it performant: eventually in constant factors, but to start with, at least getting the asymptotics right. However, I ran into an algorithmic difficulty which led me to question the semantics exposed by Reflex and other similar frameworks. That's what this post is about.
Eventually I concluded that there are a variety of alternate semantics that are both easier to implement and more efficient, without losing any important expressivity (at least for GUI use cases).
This post assumes general familiarity with push-based (or hybrid push/pull) higher order FRP like Reflex or Sodium. Note that this is actually a pretty small category. Most older FRP implementations are pull-based (i.e. not efficient at all) and many newer frameworks commonly referred to as FRP (like ReactiveX and its derivatives) lack the semantic foundations which make "true" FRP both very expressive and hard to implement.
I will be using terminology from Reflex, but unfortunately this differs quite a bit between different FRP frameworks. In particular, some systems use `Signal` or `Stream` for what Reflex calls `Event`, but I prefer "event" as it is more intuitive.
However, there is one important potential terminological confusion with the term "event" which deserves clarification. In mainstream GUI systems, an event usually refers to a concrete occurrence of some abstract potentially-happening thing. In (first-class) FRP, we build a reactive system by using various combinators to construct values that represent the abstract potential behavior of the system. So an "event" object represents something that may fire with actual concrete values at various points in the program. (Denotationally, an event for values of type `a` is like a mapping `Time -> Maybe a`.) Here "event" always refers to this abstract object, not a concrete occurrence of the event.
Event graphs
Efficient implementations of FRP are primarily concerned with a specific task: taking input events (e.g. button clicks) and activating a subset of observable output events. The output events could have some IO effect (e.g. update widgets on the page) or update internal reactive data cells (`Behavior`/`Dynamic` in Reflex).
One way to do this would be to "run" the entire known graph on each frame, like in an immediate-mode GUI. This isn't a totally crazy way to operate for most applications, but as the ongoing debates in the web world about virtual DOM diffing vs. fine-grained reactivity show, it is less than ideal.
Behind the scenes, FRP implementations typically use something analogous to the "observer pattern" from classic retained-mode GUI frameworks: each event keeps a list of downstream events which need to be processed if it fires. This allows them to (basically) start from the input events and wake up downstream events only as necessary, until reaching some subset of the existing output events at the other end of the graph.
With basic event combinators like `filter`, `map`, etc., this is straightforward.
The complications arise when you add more complex combinators whose inputs are not just a single upstream event:
```haskell
merge  :: Event a -> Event b -> Event (These a b)
switch :: Behavior (Event a) -> Event a
```
`Behavior` is a time-varying FRP value, and `These` is a very nice Haskell type representing two optional values where at least one must be present:
```haskell
data These a b
  = This a
  | That b
  | These a b
```
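As a sanity check on the semantics, here's what `merge` means in terms of the denotational `Event` sketched earlier (purely illustrative; no efficient implementation evaluates events pointwise like this):

```haskell
-- Denotational merge: the result occurs whenever either input occurs,
-- and the These value records which side(s) fired at that instant.
mergeDenotational :: Event a -> Event b -> Event (These a b)
mergeDenotational (Event ea) (Event eb) = Event $ \t ->
  case (ea t, eb t) of
    (Just a,  Just b)  -> Just (These a b)
    (Just a,  Nothing) -> Just (This a)
    (Nothing, Just b)  -> Just (That b)
    (Nothing, Nothing) -> Nothing
```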
The `merge`[^1] combinator
The hard part about `merge` is that it waits on two input events, but when either one of them fires, we can't yet know what the resulting value will be: we have to first know whether the other side will also fire in the same frame. The only way to get the correct answer while doing minimal work is to process potentially triggered events in a topological ordering of the relation implied by input dependencies. In other words, if one input to a `merge` fires, we know that the `merge` event itself will fire, but we should delay processing it until all "earlier" events (which necessarily include its inputs) have been processed.
To handle this, we can assign each event a "height" based on its inputs, and use these as priorities in a queue. A single-input event, like `map` or `filter`, would have a height one higher than its input, and `merge` would have a height one higher than the maximum of its inputs. This automatically gives a topological ordering by construction.
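Here's a minimal sketch of that scheduling strategy, assuming a hypothetical node type with a precomputed height and a processing action that returns whatever it wakes up downstream (this is not Reflex's internal API):

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical propagation node: a fixed height plus an action that
-- processes the node and returns the downstream nodes it wakes up.
data Node = Node
  { nodeHeight  :: Int
  , nodeProcess :: IO [Node]
  }

-- Drain a height-keyed priority queue in increasing height order.
-- Because every node is strictly higher than its inputs, a merge is
-- only processed once both of its inputs have settled for this frame.
propagate :: Map.Map Int [Node] -> IO ()
propagate queue = case Map.minViewWithKey queue of
  Nothing -> pure ()
  Just ((_, nodes), rest) -> do
    woken <- concat <$> mapM nodeProcess nodes
    let enqueue q n = Map.insertWith (++) (nodeHeight n) [n] q
    propagate (foldl enqueue rest woken)
```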
The `switch` combinator
In a fully static event graph, assigning heights is easy, because they can be calculated directly when an event is constructed. But we run into a problem with `switch`. The value of the `Behavior` can change as time goes on, and because FRP is higher-order, it can come from "anywhere", so there is no statically known subset of events it could be. Thus, if we want to use the "height strategy" to handle `merge` events, we'll need to dynamically update the height of a `switch` event when its underlying event changes. Just updating the `switch` event itself isn't a problem, but we need to maintain the invariant that an event has a higher height than its inputs. We now need to trace down the graph, finding any downstream events which transitively have it as an input, and update their heights to maintain the invariant.[^2]
Now we're back to the same worst-case performance as an immediate-mode GUI. In practice it usually won't be that bad, but we have no easy way to reason about when it will happen.
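For concreteness, here's roughly what that downstream fix-up looks like, as a separate self-contained sketch assuming a hypothetical mutable node with a cached height and a subscriber list (again, not Reflex's actual internals):

```haskell
import Data.IORef

-- Hypothetical mutable graph node: a cached height plus the events
-- subscribed to it (i.e. the events that have it as an input).
data Node = Node
  { nodeHeight      :: IORef Int
  , nodeSubscribers :: IORef [Node]
  }

-- Called when a switch picks up a new input whose height is inputHeight.
-- Restores the invariant "every event is higher than its inputs" by
-- raising this node and recursing into any downstream node that is now
-- too low. In the worst case this walks the entire event graph.
raiseHeight :: Int -> Node -> IO ()
raiseHeight inputHeight node = do
  h <- readIORef (nodeHeight node)
  let h' = inputHeight + 1
  if h >= h'
    then pure ()  -- invariant already holds; nothing downstream changes
    else do
      writeIORef (nodeHeight node) h'
      subscribers <- readIORef (nodeSubscribers node)
      mapM_ (raiseHeight h') subscribers
```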
The trilemma
We need to make more precise what it would mean to do efficient event propagation. Obviously we need to do work at least linear in the number of events that actually fire in a frame. Additionally, any event that does fire may be the input to another event, which is therefore "potentially firing" but may decide to fire or not depending on the values it receives. So efficient event propagation would mean doing work linear in the number of potentially firing events in a frame (which includes those that actually do fire). Now we get to the trilemma:
If we have both `switch` and `merge`, changing a `switch` and then running an event propagation frame requires, in the worst case, work linear in the total number of existing events, even if this dwarfs the number of potentially firing events.
- If we get rid of `switch`, we can assign heights statically and process potentially firing events in a priority queue.
- If we get rid of `merge`[^3], we don't even need heights, and we can process potentially firing events in depth-first order (sketched below).
- If we keep both, then to process `merge`s correctly we need to produce a new topological ordering which respects the new constraints introduced when a `switch` changes.
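To sketch the second option: without `merge`, every event has at most one input, so once a node fires its subscribers can be processed immediately in depth-first order, with no heights or queue at all (the node representation below is hypothetical):

```haskell
-- Without merge, no node ever waits on a second input, so a plain
-- depth-first walk from each fired input event is sufficient.
data Node = Node
  { nodeProcess     :: IO Bool    -- run the node; True if it fires
  , nodeSubscribers :: IO [Node]  -- current downstream subscribers
  }

propagateDepthFirst :: Node -> IO ()
propagateDepthFirst node = do
  fired <- nodeProcess node
  if fired
    then nodeSubscribers node >>= mapM_ propagateDepthFirst
    else pure ()  -- e.g. a filter that rejected the value stops here
```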
Formally proving that the final case cannot be done efficiently is beyond the scope of this post, but hopefully the intuition is clear. Because a `switch` event can choose "any" event as its input, it can add an arbitrary ordering constraint to the graph, which means we can reduce the problem to online topological ordering.
Actually existing GUIs
Given that I was trying to implement FRP in Rust, I was determined to find a way to keep predictable performance and minimal graph traversal as a core feature.
Moreover, it occurred to me that, while retained-mode GUI programming is famously hard to do correctly, people do (with effort) create large, correct GUI applications, and nowhere in their runtime behavior can we observe anything analogous to the downstream global event trace required to combine `switch` and `merge`.
This suggests one of the following possibilities:
- There are useful GUI paradigms that inherently require global event tracing to implement correctly, and well known large GUIs do not express these patterns simply because they are hard to implement.
- There are (effectively) no real world GUI paradigms which require global event tracing.
- There is something analogous to global event tracing in certain existing GUI implementations, but it's not obvious precisely because it's more clearly handled with a domain-specific algorithm than a pervasive part of the reactivity machinery.
I think the last one is ultimately the answer. A helpful analogy is tracing garbage collection. With manual memory management (i.e. Rust), to implement an arbitrary algorithm, in the worst case you have to embed a garbage collector into the implementation. However, in practice this doesn't really happen unless you're implementing either an interpreter for a GC language, or some kind of graph algorithm where incremental cleanup is required. While it's certainly easier to use a language with GC and let the host GC handle these cases, it seems the surface area where the GC needs to run is almost always limited to a relatively isolated problem domain. I like to think that a future Rust-like language would make it easy to use garbage collection as a library for these cases, with the tracing only happening on the domain heap and not the whole program.
Using Rust has made me moderately confident that, outside of these specialized domains, garbage collection isn't really required for general purpose programming. And actually, it seems like we should be even more confident that "event tracing" isn't required for GUIs.
General purpose programs are higher order and have extremely complex dynamics that are very hard for experienced programmers to fit in their heads, and yet their object lifetimes are still simple enough that they can almost always be handled by stack regions or reference counting (i.e. only statically known cycles). On the other hand, GUIs typically have only first order behavior, and even the most complex GUI behavior pales in comparison to the complexity in "backend" systems. Although bugs in GUIs are common (probably because they aren't using FRP!), describing the correct behavior in a modular fashion is usually not hard.
The main exception is GUIs which embed higher order behavior from some other domain, for example Excel. But of course Excel has a custom incremental update engine; it doesn't use GUI event handlers to keep track of cell dependencies.
Likewise, games have entity component systems, not elaborate webs of event handlers. (And at the level of memory management, they often use keys/indices rather than a web of references.)
In the case of reactivity, what can we achieve if we assume that "extremely dynamic event graphs" correspond to limited problem domains better handled by the application? Stay tuned for part two, where I'll discuss the space of possible semantics for keeping the core ideas from FRP, without the need to ever do global event tracing.
[^1]: I've presented here a simplification of Reflex's `merge`, since the actual version has some type-level complexity that is not relevant to FRP itself.

[^2]: There is a trick we can use to share the work of keeping heights up to date for a chain of single-input events like `map` or `filter`, but this won't help us asymptotically unless we assume these make up increasing fractions of larger FRP graphs.

[^3]: This is the option taken by all other existing systems I'm aware of. Please let me know if you find a counterexample!