http-2 1.0.0, a fork's tale
10 Jul 2024
TL;DR The http-2-next gem has been officially archived after being merged back into http-2 (the gem it was originally forked from), which replaces it as the only direct dependency of httpx.
Origin story
The http-2 gem is (quote) a pure ruby implementation of the HTTP/2 protocol and HPACK header compression. It’s “transport agnostic”, as in, it does not mess directly with sockets, instead accepting byte strings (via conn << bytes), and allowing callbacks to be registered, in order to be called at key moments of an HTTP/2 connection management lifecycle.
# from the README
require 'http/2'
socket = YourTransport.new
conn = HTTP2::Client.new
conn.on(:frame) {|bytes| socket << bytes }
while bytes = socket.read
  conn << bytes
end
Internally, it handles the head-scratching details of the HTTP/2 specs, such as binary frame encoding, stream multiplexing, header compression, and so on, so that, to the end-user, it almost feels like using an HTTP/1 parser. And it does all that, using approachable pure ruby code. It’s been around since 2014 (long before I planned maintaining an HTTP library), and I’d go as far as calling it the reference implementation of HTTP/2 in ruby.
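To make the "transport agnostic" idea concrete, here is a toy sketch of the same pattern: bytes go in via `<<` in chunks of any size, and a callback fires whenever a complete frame is available. This is not the http-2 gem's code; `ToyFramer` and its methods are hypothetical names made up for illustration, and the only thing taken from the HTTP/2 spec is the 9-byte frame header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream id).

```ruby
# Toy illustration only: NOT the http-2 gem's implementation.
class ToyFramer
  HEADER_SIZE = 9 # fixed HTTP/2 frame header length, in bytes

  def initialize
    @buffer = String.new(encoding: Encoding::BINARY)
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
  end

  # Register a callback for an event (this toy only emits :frame).
  def on(event, &block)
    @callbacks[event] << block
    self
  end

  # Feed raw bytes as they arrive, from whatever transport is in use.
  def <<(bytes)
    @buffer << bytes.b
    while (frame = next_frame)
      @callbacks[:frame].each { |callback| callback.call(frame) }
    end
    self
  end

  private

  # Returns the next complete frame, or nil if more bytes are needed.
  def next_frame
    return if @buffer.bytesize < HEADER_SIZE

    # 24-bit length (split as 8 + 16 bits), type, flags, 32-bit stream word
    len_hi, len_lo, type, flags, stream_word = @buffer.unpack("CnCCN")
    length = (len_hi << 16) | len_lo
    return if @buffer.bytesize < HEADER_SIZE + length

    # Remove the whole frame from the buffer, then cut out its payload.
    frame_bytes = @buffer.slice!(0, HEADER_SIZE + length)
    {
      type: type,
      flags: flags,
      stream_id: stream_word & 0x7fffffff, # drop the reserved bit
      payload: frame_bytes.byteslice(HEADER_SIZE, length)
    }
  end
end

framer = ToyFramer.new
frames = []
framer.on(:frame) { |frame| frames << frame }

# A PING frame (type 0x6, stream 0, 8-byte payload), delivered in two reads.
ping = [0, 8, 0x6, 0, 0].pack("CnCCN") + "12345678"
framer << ping.byteslice(0, 5)
frames_after_first_read = frames.size
framer << ping.byteslice(5, ping.bytesize)
```

The caller never hands over a socket: partial reads are simply buffered until a full frame can be cut out, which is what lets the same parser sit behind any transport. The real gem layers stream multiplexing, flow control, and HPACK on top of this kind of loop.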
So when I started toying around with building an HTTP application server, and ultimately came up with an HTTP client (httpx, no less), it was a no-brainer decision to pick http-2 for the HTTP/2 parts of it. Over time, I also became a contributor, authoring several patches, and ultimately getting to learn the head-scratching details of the HTTP/2 protocol, which the gem initially abstracted for me.
A fork in the road
git forks serve the best spaghetti code
As httpx usage by the community picked up, so did the bug reports, some of them related to http-2. Being sort of involved in its development, I could see some cracks which weren’t evident in the beginning, namely spec compliance, and some performance issues here and there. http-2 being critical to my “HTTP library that could”, I set out to solve the ones I was able to, and propose the patches upstream, in one pull request.
http-2 had a single maintainer at the time, Ilya Grigorik, who was also the author. I could see that, over time, he took longer to answer issues or review pull requests in github, sometimes months. Which can mean a lot of things, but if one could reduce it to common characteristics, it usually means that people are just busy with life and/or overwhelmed with “dayjob” responsibilities, and have very little, if any, time left for interesting-but-ultimately-unpaid work.
The format (one single PR) in which the changes were proposed certainly presented a challenge, given the scope, even if each change was contextually in its own commit (I guess github pull request review flows aren’t optimized for that use-case yet). There were requests to break them down into shorter pull requests, but this was easier said than done (later changes often depended on earlier changes), and ultimately demanded that I spend even more of my personal time on work that wasn’t receiving much of it from everyone else involved. This left the pull request stuck in a social deadlock, where the reviewer had neither the time nor the motivation to review the full scope of changes, the requester had neither the time nor the energy to adjust the scope of the changes, and the community had neither the time nor the context to help the requester or the reviewer. The tool certainly didn’t help, but time was certainly the essence of the problem here.
This standstill was only worsened by having to regularly rebase changes and resolve the resulting conflicts from upstream, and a growing frustration from not being able to solve the production issues I ultimately needed to fix. I felt that, in order to progress with httpx, I needed to solve the problem of not owning its critical dependencies, so I needed to do something drastic.
So I forked http-2, and http-2-next was born. And httpx has been using it since version 0.6.0, released around November 2019.
Good times
Freed from the metaphorical shackles of collaboration, I was finally able to improve on what was missing, and then some: compliance tests became a first-class continuous integration citizen; benchmarks were run regularly; new, more performant, ruby APIs were being used, while the gem public API remained backwards-compatible. All this contributed to improved httpx performance when benchmarked against other HTTP clients.
On the other hand, the parent was receiving very little activity (fewer than 10 commits since the fork).
Overall, the decision to fork was an overwhelming net-positive, for httpx, despite some hiccups along the way.
But the main drawback of the decision was, nobody was watching.
Bad times
The http-2 gem was quite popular by the time the fork happened: it still has over 800 stars even today, and is still relied upon: 711 github repositories reference it, and it is a dependency of some noteworthy gems, such as the ruby AWS SDK.
There have been other “forks” as well: async-http, the HTTP workhorse of the async ecosystem, used to have it as a dependency, but has since replaced it with protocol-http2, which, although not officially a fork, certainly used it as reference; tipi, a fiber-based HTTP application server, still declares it as a dependency, but its author has since forked http-2 under a new name, probably with the intent of releasing it as a separate gem.
Whether these forks happened for the same reasons as mine did is irrelevant, as the outcome should be evident: duplicated effort and community fragmentation. All these forks have to solve the same issues of the original implementation (spec compliance above all), while not talking to and collaborating with each other. The ecosystems using these “forks” also ultimately determine their popularity, usage, and consequently, the conditions under which a certain category of bugs is found and reported; and when reporting them, httpx gem users will use the http-2-next repo, while users of async gems will report bugs under the protocol-http2 repo.
Only 3 bug reports have been filed overall for http-2-next (almost 2 million downloads). 4 for protocol-http2 (over 5 million). Since 2019, http-2 has had 8 bug reports (over 17 million downloads overall).
The numbers above are to be taken with a grain of salt. Bugs may have been reported in the repo of the parent gem depending on them. Nevertheless, are the low bug report counts correlated with higher quality / fewer bugs, or with lower usage? There’s no definitive answer.
What I do know is that, despite full API compatibility with the parent gem, no other gem besides httpx declares http-2-next as a direct dependency (the same happens for protocol-http2 and async-http, but there’s no API parity there). They’ve been around for at least 5 years, so why is that? Why hasn’t the community migrated to a better alternative? Are they blind?
It turns out that such a thing rarely, if ever, happens.
You got to have a “carrot”. It can be a certification. In real life, ain’t nobody got the time to validate whether your fork improves compliance legit. There may be multiple forks around claiming the same. Who’s the regulated authority ensuring specifications are held up? What, there’s no “HTTP/2 certified seal of approval”? What, you said specs run in your CI? Sure, I’ll take your word for it…
It can be convincing prominent gems using the parent gem to switch to yours. Depending on who you’re asking it from, guarantees will be asked for. And without a certification, all that is left is trust in the fork maintainer (reliance on social capital), or usage metrics, such as github repository stats (which can be inflated by maintainer popularity, proglang userbase volume, or well-timed devrel in HN) or number of gem downloads (which can be inflated by misconfigured CIs and internet bots). Now, I hate taking decisions on dependencies based on github stars as much as the next guy, but I also work and have worked in places where convincing managers to take your side in decision logs often involves looking at a table comparing options where “measure X is bigger for option 1 than option 2” where no one really understands X, but it’s important to take decisions based on data (and in some cases yes, X was github stars, and I felt dirty).
Awareness of your fork can also be achieved in other ways. You can present it at a conference. You can write a few blog posts about it (hello there!). Ultimately that requires investing more of your time, which you may not have, and which may have ultimately been the main reason for forking (as per above, it was the case for http-2).
And even if you do all of the above, the path of least resistance will keep most on the parent gem. Despite all of its known flaws. Despite being somewhat inactive. It’s the devil they know. It’ll fail in unexpected ways, may or may not get reported back, and the fork maintainer will have no other option but to monitor the changes from the parent repo.
To sum up, while the decision to fork was an overwhelming net-positive for httpx, that’s certainly debatable for the maintainers, and the community as a whole.
A light that never goes out
Recently, a ruby AWS SDK maintainer became a committer, and started picking up outstanding issues in the http-2 repository. He eventually stumbled upon my at-the-time-still-open pull request, and promptly asked me whether I wanted to resume the work. I gave him a very short version of the history described above, and suggested using http-2-next, which was turned down as being “too difficult” (probably not technically, as per what I wrote in the previous section). He was nonetheless interested in helping remove the obstacles that had prevented it from being merged in the past. So I found myself considering whether it was worth doing it.
It’s been 5 years. A lot of things were against it: http-2-next source code is primarily hosted in gitlab, and integrated with gitlab CI (readers of this blog should already know I’m a gitlab fanboy). I had since adapted code style and linting rules to my own personal preferences (for instance, I prefer having double quote strings everywhere and avoid the ambiguity of dealing with both; I know, controversial). I had also added things that did not exist in the parent, like RBS type signatures. The scope of changes was therefore much greater than before, which would make reviewing it even harder than before; accomplishing it would not be possible by just cherry-picking commits from one side to the other, as both main repo and fork had moved forward, and the potential for conflicts was just too high.
On the other side of the coin, there was a lot going for it. For example, there was no breaking public API change, so it’s not like a wildly different gem being merged into another, which would have held adoption back. http-2 still has a lot more community watching the repo or reporting bugs, and that would help validate the performance and compliance benefits committed to the fork even more.
So we all sat together (virtually), and came to an agreement. http-2-next was to be ported “as-is” into the “main” branch of http-2, in one giant pull request. Once reviewed, this would become the repository HEAD. Once that was done, I’d become co-maintainer, with gem push rights.
There were compromises made: one giant commit instead of multiple smaller commits meant both that http-2 maintenanceship had to accept extra changes they perhaps would not agree to otherwise (different linting rules, for example), and http-2-next maintenanceship would lose the commit history of each change from the fork (the old repo will always be there for consultation purposes though), all in the name of reducing the overhead of getting the changes upstream and publishing a release. It also meant I had to say goodbye to gitlab CI and just learn how to bake the same cake with Github Actions, although some things were lost along the way; for instance, I was able to publish coverage docs in gitlab and link to them on the coverage badge, and I still don’t know how to generate coverage badges in Github Actions, nor how to make coverage docs publicly available (if someone knows how to do it, I’ll wait for your pull request :) ).
It took what it had to take, but we did it! http-2 1.0.0 was released in June 2024, and, 5 years after, httpx 1.4.0 became the first version since 0.6.0 to declare http-2 as a dependency.
Conclusion
I wrote this post as a celebration of a fork successfully being merged back into the mothership. This is not just about me, the ruby community, or my own particular gem drop in the rubygems ocean. Generally, this type of event is the exception, not the rule. In the FOSS world, forks are allowed, and encouraged. And for many good reasons. It’s empowering. It’s liberating. It can help breed innovation. But sometimes, they’re unnecessary fragmentation. Of contributors, and users. They generate effort duplication. They may lead to competing efforts in an environment where there may ultimately be no trophy at the end of the line, rather an inbox full of angry users and bug reports, or complete silence, and ultimately burnout. And when you realize it, it’s too late, or costly, to go back.
Back then, I was so obsessed with the idea of “killing” my dependencies, that I couldn’t see the bigger picture. In hindsight, if I could do things differently, I would have tried to contact Ilya in order to figure out whether I could help with reducing his burden, perhaps not being fearful of suggesting becoming a maintainer and getting a no for an answer. Essentially, just try to solve the social collaboration problem first, before jumping into implementing a technical solution.
Raise your glass to all forks, old and new, dead and gone, alive and well! May they all find their way back to the Source!