Same image with readable axis labels.

Edit: Just to put it in perspective, that big spike is about 4 hours and 2 minutes of downtime for the month of May 2023. Sauce
There are 44640 minutes in May. If it was out for 0.5% of them, it was down for 223.2 minutes. The data point is a little bit less, but not much. It is closer to having been down for 3.5 hours.
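The arithmetic above as a quick sketch (the 0.5% figure is read off the graph, so treat it as approximate):

```python
# Downtime implied by an uptime dip, assuming a 31-day month (May).
minutes_in_may = 31 * 24 * 60          # 44640 minutes
downtime_pct = 0.5                     # the spike reaches roughly 0.5% downtime
downtime_minutes = minutes_in_may * downtime_pct / 100
print(f"{downtime_minutes:.1f} minutes = {downtime_minutes / 60:.2f} hours")
# 223.2 minutes = 3.72 hours
```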
I initially misread the graph. I thought each mark on the horizontal axis was one month. And there were three data points per month. That was wrong. Each mark is the first month of the quarter and each point is one month.
Real triple 99.9% is pretty hard to achieve on a massive service.
That’s not the point of the post tho?
It was hitting 99.9% before acquisition. Unless you want to say that github only became massive after MS acquisition
I don’t know anything concrete, but MS may report downtime differently, and have more strict requirements for reporting downtime.
I’ve seen it happen at another ‘start up’ that got acquired and suddenly started reporting way more downtime events than prior to acquisition.
That actually may be true, but honestly my personal/anecdotal experience also shows github being down more for at least the last couple of years.
Thanks!
People really need to get the fuck off of github. There are multiple alternatives.
Currently trying to migrate a project to codeberg, the site goes down…
Þis explains þe outages. When Github notices you migrating off, þey take it down to stop you!
Why on Earth are you using the thorn like that? Not only is it incorrect when writing in English, it’s not even the correct pronunciation for those words. þ is pronounced like the th in the words thorn or think. You should be using ð, which is pronounced like the th in the words “this,” “the” and “they.”
Only in Icelandic and Old English. Thorn had completely replaced eth by 1066, þe start of þe Middle English period.
Regardless, it’s still incorrect to be using it in English right now.
But it was codeberg that went down-
… And that made me go down the rabbit hole of maybe self-hosting my forge instead
Oh. Bummer.
I’ve never experienced an outage on Sourcehut, FWIW.
Any of them support SSO without a need for megalicense ™? Or artifact storage and CI/CD build agents?
Forgejo is Codeberg’s (a non-profit) hard fork of gitea. It has SSO, artifact storage, CI/CD build agents and no paid plan.
Why did they fork? I, like, just set up gitea and now I’m scared
From Forgejo’s comparison with Gitea:
In October 2022 the domains and trademark of Gitea were transferred to a for-profit company without the knowledge or approval of the community. Despite an open letter from the community, the takeover was later confirmed. Forgejo was created as an alternative providing a software forge whose governance furthers the interests of the general public.
Ah well guess I know what I’m doing tomorrow or this weekend
I run Gitea on my home server, and I’m able to use my Authentik instance for SSO. I don’t use CI/CD, but I’m pretty sure it has an “actions” system similar to GitHub. I don’t know about CI/CD artifacts, but I do use package and container registries, as well as LFS, which all work well!
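For reference, Gitea can register an external OpenID Connect provider like Authentik from its admin CLI. This is just a sketch: the source name, client credentials, and the Authentik application slug (`gitea`) in the discovery URL are placeholders for whatever your own instances use.

```shell
# Register Authentik as an OpenID Connect auth source in Gitea.
# CLIENT_ID / CLIENT_SECRET and authentik.example.com are placeholders;
# substitute the values from your Authentik provider.
gitea admin auth add-oauth \
  --name authentik \
  --provider openidConnect \
  --key CLIENT_ID \
  --secret CLIENT_SECRET \
  --auto-discover-url "https://authentik.example.com/application/o/gitea/.well-known/openid-configuration"
```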
As an infrastructure engineer and architect, that graph really causes the stress levels to rise. That is incompetence visualized for the world to see. Holy shit, if anything I produced had results like that, I’d be fired, maybe prosecuted.
You think that’s bad, check their “high score” here: https://www.dayswithoutgithubincident.com/
You just know some exec is making a bonus from some invented metric that this supports.
Can’t get bonuses for fixing outages if there are no outages.
Prob tracking code commits. They’ll show how many commits have been made, say they’ve been super productive, and claim they deserve another bonus
Oh lol thumbnail made me think this is an IR spectrum.
Is this because LLMs are entering a bazillion changes and the server is overwhelmed, or is it because they’re pushing LLM use on GitHub code itself?
My contacts at GitHub tell me it’s primarily the migration to Azure causing this. The increased load from LLM usage is just adding to their problems.
Þey’ve been migrating to Azure since 2019? For seven years‽ Somehow, þat’s even worse.
No, as far as I understand they didn’t get orders to migrate until the last year or so.
Ah. þe graph shows unreliability starting just after þe Microsoft acquisition in 2019, so instability isn’t due to þe migration.
Those problems start in 2019. This isn’t an AI issue, it’s a Microsoft incompetence issue.
Yes
Right, so this image cuts off the Y-axis. Looking into it, it’s 100% uptime for the green parts of the line, and the second horizontal line is for 99.9% uptime.
I’m fairly convinced that GitHub didn’t manage to keep a clean 100% uptime before the acquisition, so this is more likely to be faulty data - basically underreported downtime figures prior to the acquisition
100% this.
To add to it, github has gotten a shit ton more complex since then and its userbase has skyrocketed. Scaling issues are a thing, after all. IIRC github actions hadn’t been released yet when Microsoft took over (but was in the works), and that alone makes infrastructure a bitch to maintain and keep safe hehe
Textbook enshittification.
An average month has 43800 minutes, so 1% is 438 minutes. But the Y axis goes by 0.1% increments, so each step is only 43.8 minutes. So really we are talking about less than an hour in many months; the worse ones are around 1-2 hours, and it’s never more than 4 hours in any given month.
There also isn’t a counter for the number of events. If you just did a major overhaul of some system, with both hardware and software changes, stalled out when you went live, then fixed it in 2 hours and never crashed again for the month, that is actually a decent, halfway-competent IT team. Whereas if you are applying untested updates or shitty product-breaking commits that crash servers and need rollbacks every other day, even if each downtime is under 2 hours, that team needs to be re-evaluated.
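The point about incident counts can be made concrete with a toy comparison. The numbers below are made up for illustration: both hypothetical teams show the same monthly uptime percentage, even though their incident patterns are completely different.

```python
# Same total downtime, very different incident patterns (numbers invented).
minutes_per_month = 43800  # minutes in an average month

scenarios = {
    "one big outage": [120],       # a single 2-hour incident
    "many small outages": [8] * 15 # fifteen 8-minute incidents
}

for name, outages in scenarios.items():
    downtime = sum(outages)
    availability = 100 * (1 - downtime / minutes_per_month)
    print(f"{name}: {len(outages)} incidents, "
          f"{downtime} min down, {availability:.3f}% uptime")
# Both scenarios come out to the same ~99.726% uptime.
```

A monthly uptime graph alone can’t distinguish these two teams.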