Implementing Raft for Browsers with Rust and WebRTC

I try to keep a vague list of "technologies to try out" handy at all times. Usually things come and go from the list pretty quickly, but I've had a few that have been stubbornly persistent for quite a while now:

At some point, I had a bright idea: "Why not knock out a bunch of them at once?" And thus WRaft was born. It's a Raft implementation over WebRTC written in Rust for WebAssembly (amazing buzzword compliance!).

There are a couple demo apps at https://wraft0.eevans.co, and the code is open source on GitHub. It was a fun and challenging project to implement, so I thought I'd write a blog post about the experience as a sort of "postmortem".

Dramatis Personae

Here's an extremely quick summary of the technologies involved with the project:

  • Rust is more or less the only mainstream language that's memory-safe without having a garbage collector (thanks to its unique ownership model and borrow checker). This allows it to thrive in situations where garbage-collected languages struggle, including…
  • WebAssembly (or Wasm), which is a binary format for frontend web applications (and, nowadays, other applications, although that's a topic for another blog post). The idea is that Wasm programs can run at near-native speeds within a browser, which makes it an interesting choice for implementing…
  • Raft, which is a distributed consensus algorithm. Raft safely and reliably replicates data to multiple nodes (usually servers) and is the brains behind etcd and several other distributed databases. Of course, for nodes to replicate data between each other they need to be able to communicate with each other, which is a bit challenging for browser-based applications! But possible thanks to…
  • WebRTC, which is a basically-peer-to-peer protocol that allows in-browser programs to communicate with each other. The "real-world" use cases for WebRTC generally involve video and audio sharing, but the protocol also includes data channels for transferring arbitrary data between web browsers, which is what I used for WRaft.

The Project

My original idea was to build something like "etcd for browsers", i.e. a distributed peer-to-peer key-value store. Oddly enough, the Raft paper doesn't say anything about key-value stores or databases; instead it vaguely mentions a "replicated state machine". I ended up taking this pretty literally once the project progressed; the end result is that WRaft will replicate any kind of state machine (specified by a Rust trait) between browser windows.

This approach works great for React-style web applications, since they tend to be built on top of state machines for data flow. As a demo, I took the Yew TodoMVC example app and replicated the entire app state in WRaft. (Yew is something like a React/Redux clone for Rust/Wasm.) You can try out the result at https://wraft0.eevans.co/todo.

Diving into the overall architecture, a WRaft "cluster" looks something like this:

introduced.svg

A cluster always has exactly three "nodes" (i.e. browser windows) in the current implementation. The nodes find each other through a WebSocket-based introducer service (more on that later), after which they start communicating over WebRTC data channels. Each node stores data locally using local storage (I experimented a bit with IndexedDB but it seemed prohibitively slow).

As long as two nodes stay online, the whole cluster is "healthy" and can continue working; nodes that leave the cluster (for instance, if you reload the page) will do their best to rejoin the cluster and catch up. Unfortunately, a cluster can't currently recover from all nodes coming down (I might try to fix that at some point).

Cluster members can be on the same machine or different machines on a LAN (theoretically it might even work across the internet, but I haven't tested it yet). Performance seems pretty decent: with a very basic benchmark app I've seen over 2000 writes per second on Chromium, which is probably good enough for any use case I can think of 😄.

How The Project Went

It's postmortem time! Here's how things went; I hope some of this might be useful for anyone treading similar paths in the future.

Easier Than Expected: WebAssembly

I've done some "modern" frontend development in the recent past, and it's usually involved horrifically complicated setups with a ton of glue code to get everything working together. Given that WebAssembly is pretty niche and bleeding-edge, I feared the worst. But to my surprise, the WebAssembly setup experience was impressively smooth. I was at "Hello World" using wasm-pack in less time than it's taken me to make an app with create-react-app in the past.

Even better, the wasm-bindgen developers have done a fantastic job creating Rust bindings for just about the entire web browser API space, including the relatively obscure APIs like WebRTC. You can even use Rust Futures with async / await syntax using the wasm_bindgen_futures crate (which was pretty much essential for implementing Raft). A surprising number of Rust crates work flawlessly in WebAssembly (with notable exceptions like Tokio). I haven't dug into it yet, but JS↔Wasm interop also looks pretty straightforward.

The one tricky issue was dealing with JS closures, which were pervasive in the APIs I was using. The wasm-bindgen approach is quite awkward and one of the few cases I ran across where you need to do a form of manual memory management. (Apparently the gloo crate makes it a little easier, but I didn't find out about it until I was done implementing all the hard JS-related parts.)

Harder Than Expected: WebRTC

Implementing WebRTC communication, on the other hand, was much trickier than I anticipated. I didn't know much about WebRTC going into the project, but I had a vague idea that there had to be some sort of server involved for NAT traversal and whatnot.

What I didn't realize is that WebRTC sessions need a signaling server for browser windows to exchange connectivity information before they can communicate with each other. This might not be too bad, except for the fact that WebRTC signaling servers have no standard protocol or popular ready-to-use implementations.

Which means that if you want to use WebRTC, you're going to have to implement your own signaling server with some sort of ad-hoc protocol (or use a SaaS, but where's the fun in that?). WebSockets seems to be the most popular implementation choice; the best documentation resources I could find were an MDN tutorial and a fairly old YouTube talk. (Confusingly, signaling servers are unrelated to NAT traversal, which relies on standardized protocols like STUN and TURN.)

And implementing the signaling server is only half the battle. There's also the arcane WebRTC API to deal with, with its myriad of events and callbacks that arrive in hard-to-predict order and spout enormously unhelpful error messages if you do anything wrong. (To be fair a lot of the complexity comes from video/audio handling, but it's pretty hard to avoid even for data-only use-cases.)

Implementing the WebRTC-specific parts of WRaft probably ended up taking as much time as everything else combined. The end result looks something like this:

  • Every cluster gets a 128-bit session key (I probably should have called this a "cluster ID", but oh well). The session key is randomly generated when the first "node" (i.e. browser window) starts a cluster.
  • Every node has a 64-bit node id, which is basically Hash(session key, domain name) (so it will remain consistent if the user reloads the browser window but is unlikely to clash with other nodes).
  • There's a simple WebSocket-based "introducer" signaling service (written in Rust, of course) that basically just forwards SDP messages between nodes and announces when new nodes join the cluster.
  • Nodes with smaller node IDs are responsible for introducing themselves to nodes with bigger node IDs. (The "Smallest Node ID Goes First Protocol" (SNIGFP?) was the simplest way I could think of to make it work.)

Overall it's only about 250 lines of server-side code and 500 lines of browser-side code, but it's the area I'm least confident about (there are definitely bugs with Safari, and I'm pretty sure there are some lurking race conditions on other browsers as well).

To get from WebRTC data channels to something that's usable for implementing Raft, I hacked up a half-baked RPC framework that uses bincode for message serialization (I won't go into details to keep this post from getting too long, but it was fun to implement 😄).

About As Hard As Expected: Raft

Raft was originally billed as an "understandable" alternative to Paxos (which was previously considered the "go-to" distributed consensus algorithm but is notoriously difficult to use in practice). That may be true relatively speaking, but after implementing Raft I would definitely not say it's easy to understand! There are a whole lot of variables to keep track of across nodes and subtle ways things can go very wrong (I had some nasty bugs from not properly handling duplicate and out-of-order requests, for instance).

What Raft really has going for it is a fantastic descriptive paper and, even better, a formal TLA⁺ specification that's pretty readable (even for someone like me who doesn't know TLA⁺ very well). And once the algorithm has been implemented you end up with a usable real-world system (in contrast to Paxos, which is more of a basic building block that leaves out most of the practical details).

For WRaft, I stuck as close as possible to the TLA⁺ spec to avoid surprises. I didn't implement the trickier "optional" parts of Raft like cluster membership changes and log compaction, even though they would probably be essential to make WRaft useful for real-world apps. (Maybe some day if I have time…)

About As Hard As Expected: Rust

I don't really want this to turn into yet another "Thoughts on Rust" post, but I will nevertheless take this opportunity to rant a little about the Rust learning curve. Specifically: Why are there so many confusing names? Why Send and Sync instead of, I dunno, Threadsafe and Atomic? Does anyone really think Box, Rc, Arc, etc. are the most clear names for smart pointers? (How about Ptr and SharedPtr or something along those lines?) I was piling on horrifically ugly hacks for days until I found the indispensable Option.take() method (the docs are fine, but how are you supposed to find it in the first place?). FusedFutures confounded me until I realized it was "fuse" as in "fuse box", not "Asian Fusion". I lost count of how many times I misspelled mpsc. And don't even get me started on the inscrutable Pin / Unpin stuff.

(Oh and while we're at it: can we please have a moratorium on the use of static as a keyword in programming languages? It means like eleven different things across languages and at least three in Rust.)

But those are fairly minor gripes when it comes down to it, and overall I enjoy programming in Rust now that I've done my requisite time wrestling with the compiler. (My breakthrough moment came when I realized that "moving" a value kind of means that the value has been "eaten"; I've since started mentally reading compiler errors as "use of devoured value" and so forth.) There's something oddly addictive about doing a big refactor and having all the types click into place like weirdly-shaped puzzle pieces.

I've heard it said that the Rust language guides programmers to make "better" design choices; I'm not entirely sure I buy that, but certain approaches definitely end up causing fewer headaches than others. I thought I'd go through an example for interest's sake.

Shared State Sucks in Rust

To be clear, shared state sucks in every programming language, but Rust really makes you feel the pain up front! This is especially true in asynchronous code: any piece of data that could ever be mutated across async blocks has to be wrapped in a synchronization primitive like Mutex or RwLock, which makes the code get ugly really quick. (It seems a bit pedantic that this is all enforced in Wasm modules, where everything is single-threaded, but I understand the rationale.)

This came up in a big way when implementing Raft. Raft nodes basically have three jobs:

  • Respond to client requests (e.g. to read or write data)
  • Respond to RPC requests from other nodes
  • Do various "active" tasks like calling elections and sending heartbeats other nodes

The "obvious" way to implement this is to have several tasks running concurrently with some shared state. That was my initial approach, but the code was soon drowning in a sea of Arc<Mutex<T>> s and I was even running into deadlock issues. I thought there had to be a better way.

What I ended up with was (ironically) a very Go-like approach oriented around channels. Lots and lots of channels (about twenty overall in the project!). The basic idea is that there's always a single event loop that does all the heavy lifting, with client/RPC requests coming in through channels (mostly mpsc channels from the futures crate, which behave fairly similarly to Go channels).

For a concrete example, here's what the event loop looks like (more or less) for a Raft worker that's in the Follower state:

loop {
    // ...
    select! {
        // Handle RPC requests
        res = self.inner.rpc_rx.next() => {
            let (req, resp_tx) = res.expect("RPC channel closed");
            self.handle_rpc(req, resp_tx, &mut timeout);
        }
        // Handle client requests (they get forwarded to the leader)
        res = self.inner.client_rx.next() => {
            let (req, resp_tx) = res.expect("Client channel closed");
            match req {
                // ...
                ClientRequest::Apply(_) => self.forward_client_request(req, resp_tx),
            }
        }
        // Timeout for when there aren't heartbeats from the leader
        _ = timeout => {
            console_log!("Calling election!");
            return RaftWorkerState::Candidate(self.into());
        }
    }
}

Basically, a Raft follower sits there waiting for requests either from peers (the rpc_rx channel) or clients (the client_rx channel). If it doesn't receive anything from the leader for a while, it assumes the leader is down and it calls an election. (There are similar event loops for Raft workers in the Leader and Candidate states.)

The tricky part is handling request/response patterns (which is how all Raft RPCs are designed), since channels are inherently unidirectional. The solution is to bundle each request with a oneshot reply channel, which is basically a "single-use" channel that will only ever get one value. (The Rust compiler will even ensure that oneshot channels can only be used once, which is pretty cool.)

To make this all a bit more concrete, here's the code that handles RPC requests:

async fn handle(&self, req: RpcRequest<Cmd>) -> Result<RpcResponse<Cmd>, transport::Error> {
    // We make a oneshot response channel for each request
    let (resp_tx, resp_rx) = oneshot::channel();
    // self.tx is the "sender" half of the RPC channel; the "receiver" half is
    // in the Raft worker event loop
    self.tx
        .clone()
        .send((req, resp_tx)) // send both the request and the response channel as a tuple
        .await
        .expect("request channel closed");

    // Now we wait for the response, which the Raft worker should (hopefully)
    // send us
    match resp_rx.await {
        Ok(resp) => resp,
        // ...
    }
}

Thanks to the use of channels, the RPC handler is completely independent from the Raft worker; there's no shared state, which means the Raft worker task can "own" all its mutable state, which means no mutexes, which means much nicer code overall!

The big drawback I've found to this approach is error handling. You can send Result types over channels (which is what I often ended up doing), but you lose the goodness of the ? operator and you're left with the very sticky issue of how to handle errors on the channels themselves. My channel-related code ended up with a lot of unwrap() and expect(), which is far from ideal (and caused a fair number of actual bugs). (This project has given me a new appreciation for Erlang supervisors, which are are an interesting approach to dealing with these sorts of problems.)

Is this the "best" way to handle state in asynchronous Rust code? I have no idea! I haven't dug into the performance implications of using channels vs. mutexes, and experienced Rustaceans probably know some tricks to make it all a lot more elegant. (I experimented a bit with combinators like FuturesUnordered, but without much success.) But using channels definitely made it easier to wrap my head around the program control flow. Also, the code ended up looking a lot more like the Raft TLA⁺ spec which made the implementation easier 😄.

Closing Thoughts

Overall, I think Rust/Wasm is a pretty compelling platform for high-performance libraries for frontend applications. (I'm less convinced that it's a good choice for React-style SPAs, but it's surprisingly usable even for that.) The biggest limitation I've found is the lack of true multi-threading. Single-threaded asynchronous code is easy thanks to the wasm_bindgen_futures crate, but "real" threads with shared memory still seem to be out of reach. There's been some interesting activity around multi-threading but sadly it looks like the development effort has stalled.

One project I'm definitely going to keep an eye on is WASI, which is an attempt to implement standardized "system calls" for Wasm (inside and outside of the browser). I'm still not entirely clear how that would look in the context of frontend applications—a lot of extremely useful system calls (like the ones that deal with files and TCP sockets) don't really have analogs in the browser—but I'm definitely excited by the prospect of some day writing frontend apps without having to deal with crusty JavaScript-based Web APIs!

What Next?

Maybe nothing? WRaft wasn't designed with any real-world use case in mind; I was mostly just curious to see if it was possible. But I could see it potentially being useful for a peer-to-peer group chat app, multiplayer browser-based game, or something along those lines. (An interesting side-effect of using WebRTC is that all communication is end-to-end encrypted, so WRaft ended up surprisingly secure basically by accident…)

If you can think of a use case, open an issue and maybe I'll end turning it into a proper crate or NPM package.