Turborepo Rewrite - The Right Choice?

March 4, 2024

A balancing act

There's been a lot of talk about “hype-driven development” lately, often targeting the decision to “just rewrite it in Rust.” A lot of the criticism about these kinds of decisions comes down to a pretty simple comparison: iteration speed vs. quality and maintainability¹. These counterpoints are often intertwined, and the decision to choose a language is a highly nuanced one that takes into account tradeoffs that people external to the development team may not have access to. While the idea that the feature set - along with the speed to market - of a product is often more important than the internal developer experience or long-term maintainability of a code-base is a tempting proposition, it's often not truly the case.

Theo recently released a video about the Turborepo rewrite from Go to Rust. In that video, he mentioned that the rewrite was not a great idea because in the first 4 months of the Turbo project, it had the same - if not better - functionality as it does now after spending 15+ months transitioning to Rust. His point is that it's often better to iterate quickly on a product rather than spending time improving the internal codebase and rewriting a product “for minimal improvement in speed or efficiency.” (Not verbatim.)

He has a point. There's a lot of history and evidence to back this up. The product engineering culture at companies like Meta and Google holds strongly to that belief, and many engineers agree with that sentiment. “The Lean Startup” by Eric Ries praises the idea, saying that it's better to get early feedback on a product and iterate quickly rather than optimizing for a future that may not exist when you get there. Y Combinator's Startup School also reiterates that idea in several of their videos and articles. These are all fair points, and make a lot of sense when you consider where they're coming from.

So, if I agree, why am I even writing this?

Because it's not always that simple. For certain products, like many web and mobile applications, this makes a lot of sense, and is often the right approach. But when you consider the set of requirements that a project like Turborepo has, other considerations come into focus.

Turbo is an infrastructure tool - according to their website “Turbo is an incremental bundler and build system optimized for JavaScript and TypeScript, written in Rust.” Planning the development of infrastructure tools often has to consider many different conflicting requirements. Much like making the impossible decision between availability and consistency when designing distributed systems, infrastructure tools must make an impossible decision among requirements for feature iteration, API consistency, avoidance of undefined behaviors, cross-platform stability, and much, much more. Often, a balance can be struck in complex software systems that nicely solves for most of the requirements. But sometimes, concessions must be made and something has to give. In the case of Turborepo, that something was building a more maintainable system at the expense of development time, adding new features, and the quick availability of the tool.

I want to be clear: I'm not trying to say that every project needs to follow suit, nor am I saying that Rust is the best language for every project. If you're building a full stack PWA, and need fine-grained reactivity, you might want to use React and TypeScript where the backend and frontend are tightly coupled (shoutout T3). It's maybe not the place for a language like Rust, or Zig, or Go. But if you're building a tool that's going to be used on many hundreds of thousands (millions?) of machines, across many projects, where a developer might not have the latest version (because let's be real, we all forget to update sometimes), stability is extremely important. Breaking API changes cost a lot of money, time, and frustration, and can lead to massive churn. So, often, after an architectural decision has been made, even if it's found to be substandard, projects need to stay the course.

We've seen this in so many cases. For example, in the case of Rust's async model, even after all the time spent on it, there are architectural changes that could have been made that would have prevented a lot of headaches. According to Boats, “the intention was that users writing ‘normal async Rust’ would never have to deal with the Pin type at all, but there have been notable exceptions. Almost all of these would be fixable with some syntax improvements. One such exception that's really bad is that you need to pin a future trait object to await it. This was an unforced error that would now be a breaking change to fix.”
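To make the Pin complaint concrete, here's a minimal sketch of what "pin a future trait object to await it" looks like in practice. It uses only the standard library; `block_on`, `NoopWaker`, `make_task`, and `run` are hypothetical names I made up for illustration, and the tiny executor only works for futures that are ready right away.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing. Good enough for futures that never
// actually need to be woken up (assumption for this sketch).
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical, std-only executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical function that hands back a future as a trait object.
fn make_task(n: u32) -> Box<dyn Future<Output = u32>> {
    Box::new(async move { n * 2 })
}

async fn run() -> u32 {
    let task = make_task(21);
    // Writing `task.await` here does not compile: `dyn Future` is not `Unpin`,
    // so `Box<dyn Future>` does not itself implement `Future`.
    // The trait object has to be pinned first.
    let pinned: Pin<Box<dyn Future<Output = u32>>> = Box::into_pin(task);
    pinned.await
}
```

Calling `block_on(run())` drives the pinned future to completion. The extra `Box::into_pin` step is exactly the kind of papercut the quote is talking about.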

Rust, btw

Full disclosure: I am a Rust developer who's been using the language in production systems for several years. But even though Rust is my language of choice, my company is built on a Next.js and TypeScript stack, with only some critical components written in Rust. Why? Because like I mentioned at the top of this article, iteration speed does matter, and in our case, given our requirements, we needed early feedback and short iteration cycles.

The interesting thing about the Turborepo case is that even though Go was a sufficiently powerful language, the expressiveness of Rust enabled the team to model the required behavior in a much more maintainable and consistent manner. One of the (albeit simple) examples they mentioned in their blog post is the model of the package graph.²

In the Go implementation, package names and the workspace root were stored as strings. The workspace root was designated a “magic string” //, and all subdirectories were modeled in relation to that magic string.

In the Rust implementation, package names and the workspace root are modeled as an enum. Rust's enum type, combined with pattern matching, allowed them to define separate variants for the workspace root and for subdirectories:

enum PackageName {
    Root,
    Other(String),
}

As they go on to say, “not only is this more efficient, it also ensures correctness.” Rust's enum types greatly facilitate maintainability, and make modeling this sort of thing super easy to read and understand. I believe that for Turborepo, efficiency and correctness are some of the more important requirements.
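Here's a rough sketch of why that matters in day-to-day code. This is not Turborepo's actual implementation: the `From<&str>` impl, the `package_dir` helper, and the `packages/` layout are assumptions I made up for illustration. The idea is that the `//` marker gets parsed exactly once at the boundary, and everything after that has to handle both variants or the compiler complains.

```rust
/// Same shape as the enum above; the derives are added for this example.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum PackageName {
    Root,
    Other(String),
}

impl From<&str> for PackageName {
    // The magic string is interpreted in one place only (assumed convention).
    fn from(raw: &str) -> Self {
        if raw == "//" {
            PackageName::Root
        } else {
            PackageName::Other(raw.to_string())
        }
    }
}

// Hypothetical helper: map a package name to a directory.
// The `match` must be exhaustive, so forgetting the root case is a compile error
// rather than a string comparison someone forgot to write.
fn package_dir(name: &PackageName) -> String {
    match name {
        PackageName::Root => ".".to_string(),
        PackageName::Other(pkg) => format!("packages/{pkg}"),
    }
}
```

With plain strings, every function that touches a package name has to remember that `//` is special. With the enum, that knowledge lives in the type.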

There are other considerations that I won't go into in depth for fear of an overly long post, but they're summarized below:

  • FFI with languages like Go can be pretty difficult. Go's concurrency model and green threads make concurrency “simple” when working within the language, but are harder to manage across inter-language communication. This is by design - you can call C from Go, but it's much harder to call Go from C.
  • The abstraction layers that Go provides over system calls are convenient, but for a bundler and build system like Turbo, low-level control over things like the threading model and other system functions is important. (And not to Rust all over you, but cfg attributes make platform-dependent compilation much more maintainable and the code separation much more understandable.)
  • Garbage collection is a wonderful, sneaky double-edged sword. Go occupies a space where avoiding the GC is a bit convoluted, requiring manual minimization of heap allocations.
  • Enough yapping about Go - what about Zig? Well, although Bun is written in Zig (and doing phenomenally, might I add), Zig is at v0.11.³ Bun is making a juggernaut bet on the stability of the API and the progress of the language. Zig v0.12 introduces many breaking changes, including overhauls of the std library. Zig's async features have also regressed - they are not present in v0.11, and likely will not be present in v0.12 either. I love Zig, it's one of my favorite languages, but it's nevertheless a risk.⁴
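On the cfg point from the list above, here's a small sketch of what platform-dependent compilation looks like. The function names and the shell choices are hypothetical, not anything taken from Turborepo; the point is only that each platform gets its own clearly separated definition, and the compiler only ever sees the one that applies.

```rust
// Each definition is compiled only on the matching target (hypothetical helper).
#[cfg(windows)]
fn default_shell() -> &'static str {
    "cmd.exe"
}

#[cfg(unix)]
fn default_shell() -> &'static str {
    "/bin/sh"
}

// Fallback for targets that are neither Windows nor Unix-like.
#[cfg(not(any(windows, unix)))]
fn default_shell() -> &'static str {
    "sh"
}

// Hypothetical helper: build the argv used to run a script on this platform.
// `cfg!` evaluates to a compile-time boolean for the current target.
fn shell_command(script: &str) -> Vec<String> {
    let flag = if cfg!(windows) { "/C" } else { "-c" };
    vec![
        default_shell().to_string(),
        flag.to_string(),
        script.to_string(),
    ]
}
```

On a Unix-like machine, `shell_command("echo hi")` comes out as `/bin/sh`, `-c`, `echo hi`; the Windows definition never makes it into the binary.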

Conclusion

The point I'm trying to make in this long-winded post is this: in some cases - and I think the Turbo example is one of them - error avoidance and explicit behavior modeling are the driving factor behind architectural or language decisions.

Another tangential, semi-unrelated, but still important point here is that language choice has an effect on developer experience when working on a project. Rust is fun to write, and working in a language you enjoy makes a project more rewarding to work on. And let's be real, hype-driven development is a very real thing. The “rewritten in Rust” phenomenon is funny on Twitter/X, but it's also a valid marketing strategy.

The main idea I want to leave you with is that choice of language, tools, architecture, etc., is a highly nuanced topic, and I don't think there's really a perfectly right answer. Engineering teams need to carefully assess requirements and make decisions according to the project's needs and long-term viability. No one can prescribe when and where to use any language; every software system is different and comes with its own unique set of challenges. But that's part of the fun of it all.


¹ This is a bit of an unfair point, because this isn't an argument against quality, and often iterations on features are what actually improve long term quality for end-users. But speed does sometimes come at the cost of quality, and hastily made decisions can actually hurt you down the line.

² There are other examples from the Turbo blog post that showcase where Rust shines in comparison to Go, but the directory model shows this most clearly in my opinion.

³ Zig v0.12 is coming soon!!! Zig v0.12!

⁴ My unsolicited opinion is that Zig is probably an excellent bet, given that it's backed by a 501(c)(3) nonprofit and Andrew Kelley and the Zig team are incredible engineers. This might have been a great choice for Turbo as well. We'd have to ask Jarred Sumner and the Bun team what their decision and thought process was.