Same image with readable axis labels.

Edit: Just to put it in perspective, that big spike is about 4 hours and 2 minutes of downtime for the month of May 2023. Sauce
There are 44640 minutes in May. If it was out for 0.5% of them, it was down for 223.2 minutes. The data point is a little bit less, but not much. It is closer to having been down for 3.5 hours.
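The arithmetic above as a quick sketch (the 0.5% figure is read off the graph, so treat it as approximate):

```python
# Downtime implied by an uptime dip, assuming a 31-day month (May).
minutes_in_may = 31 * 24 * 60          # 44640 minutes
downtime_pct = 0.5                     # the spike reaches roughly 0.5% downtime
downtime_minutes = minutes_in_may * downtime_pct / 100
print(f"{downtime_minutes:.1f} minutes = {downtime_minutes / 60:.2f} hours")
# 223.2 minutes = 3.72 hours
```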
I initially misread the graph. I thought each mark on the horizontal axis was one month. And there were three data points per month. That was wrong. Each mark is the first month of the quarter and each point is one month.
Real triple 99.9% is pretty hard to achieve on a massive service.
That’s not the point of the post tho?
It was hitting 99.9% before acquisition. Unless you want to say that github only became massive after MS acquisition
I don’t know anything concrete, but MS may report downtime differently, and have more strict requirements for reporting downtime.
I’ve seen it happen at another ‘start up’ that got acquired and suddenly started reporting way more downtime events than prior to acquisition.
That actually may be true, but honestly my personal/anecdotal experience also shows github being down more for at least the last couple of years.
Thanks!
People really need to get the fuck off of github. There are multiple alternatives.
Currently trying to migrate a project to codeberg, the site goes down…
Þis explains þe outages. When Github notices you migrating off, þey take it down to stop you!
Why on Earth are you using the thorn like that? Not only is it incorrect when writing in English, it’s not even the correct pronunciation for those words. þ is pronounced like the th in the words thorn or think. You should be using ð, which is pronounced like the th in the words “this,” “the” and “they.”
Only in Icelandic and Old English. Thorn had completely replaced eth by 1066, þe start of þe Middle English period.
Regardless, it’s still incorrect to be using it in English right now.
But it was codeberg that went down-
… And that made me go down the rabbit hole of maybe self-hosting my forge instead
Oh. Bummer.
I’ve never experienced an outage on Sourcehut, FWIW.
Any of them support SSO without a need for megalicense ™? Or artifact storage and CI/CD build agents?
Forgejo is Codeberg’s (a non-profit) hard fork of gitea. It has SSO, artifact storage, CI/CD build agents and no paid plan.
Why did they fork? I, like, just set up gitea and now I’m scared
From Forgejo’s comparison with Gitea:
In October 2022 the domains and trademark of Gitea were transferred to a for-profit company without the knowledge or approval of the community. Despite an open letter from the community, the takeover was later confirmed. Forgejo was created as an alternative providing a software forge whose governance furthers the interests of the general public.
Ah well guess I know what I’m doing tomorrow or this weekend
I run Gitea on my home server, and I’m able to use my Authentik instance for SSO. I don’t use CI/CD, but I’m pretty sure it has an “actions” system similar to GitHub. I don’t know about CI/CD artifacts, but I do use package and container registries, as well as LFS, which all work well!
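For reference, Gitea can register an external OpenID Connect provider like Authentik from its admin CLI. This is just a sketch: the source name, client credentials, and the Authentik application slug (`gitea`) in the discovery URL are placeholders for whatever your own instances use.

```shell
# Register Authentik as an OpenID Connect auth source in Gitea.
# CLIENT_ID / CLIENT_SECRET and authentik.example.com are placeholders;
# substitute the values from your Authentik provider.
gitea admin auth add-oauth \
  --name authentik \
  --provider openidConnect \
  --key CLIENT_ID \
  --secret CLIENT_SECRET \
  --auto-discover-url "https://authentik.example.com/application/o/gitea/.well-known/openid-configuration"
```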
As an infrastructure engineer and architect, that graph really causes the stress levels to rise. That is incompetence visualized for the world to see. Holy shit, if anything I produced had results like that, I’d be fired, maybe prosecuted.
You think that’s bad, check their “high score” here: https://www.dayswithoutgithubincident.com/
You just know some exec is making a bonus from some invented metric that this supports.
Can’t get bonuses for fixing outages if there are no outages.
Prob tracking code commits. They’ll show how many commits have been made, say they’ve been super productive, and claim they deserve another bonus
Oh lol thumbnail made me think this is an IR spectrum.
Is this because LLMs are entering a bazillion changes and the server is overwhelmed, or is it because they’re pushing LLM use on GitHub code itself?
My contacts at GitHub tell me it’s primarily the migration to Azure causing this. The increased load from LLM usage is just adding to their problems.
Þey’ve been migrating to Azure since 2019? For seven years‽ Somehow, þat’s even worse.
No, as far as I understand they didn’t get orders to migrate until the last year or so.
Ah. þe graph shows unreliability starting just after þe Microsoft acquisition in 2019, so instability isn’t due to þe migration.
Those problems start in 2019. This isn’t an AI issue, it’s a Microsoft incompetence issue.
Yes
Right, so this image cuts off the Y-axis. Looking into it, it’s 100% uptime for the green parts of the line, and the second horizontal line is for 99.9% uptime.
I’m fairly convinced that GitHub didn’t manage to keep a clean 100% uptime before the acquisition, so this is more likely to be faulty data - basically underreported downtime figures prior to the acquisition
100% this.
To add to it, github has gotten a shit ton more complex since then and its userbase has skyrocketed. Scaling issues are a thing, after all. IIRC github actions hadn’t been released yet when Microsoft took over (but was in the works), and that alone makes infrastructure a bitch to maintain and keep safe hehe
Textbook enshittification.
An average month has 43800 minutes, so 1% is 438 minutes. But the Y axis goes by 0.1% increments, so each step is only 43.8 minutes. So really we are talking about less than an hour in many months; the worse ones are around 1-2 hours, and it’s never more than 4 hours in any given month.
There also isn’t a counter for the number of events. If you just did a major overhaul of some system, with both hardware and software changes, stalled out when you went live, then fixed it in 2 hours and never crashed again for the month, that is actually a decent, halfway-competent IT team. Whereas if you are applying untested updates or shitty product-breaking commits that crash servers and need rollbacks every other day, even if each downtime is under 2 hours, that team needs to be re-evaluated.
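The point about incident counts can be made concrete with a toy comparison. The numbers below are made up for illustration: both hypothetical teams show the same monthly uptime percentage, even though their incident patterns are completely different.

```python
# Same total downtime, very different incident patterns (numbers invented).
minutes_per_month = 43800  # minutes in an average month

scenarios = {
    "one big outage": [120],       # a single 2-hour incident
    "many small outages": [8] * 15 # fifteen 8-minute incidents
}

for name, outages in scenarios.items():
    downtime = sum(outages)
    availability = 100 * (1 - downtime / minutes_per_month)
    print(f"{name}: {len(outages)} incidents, "
          f"{downtime} min down, {availability:.3f}% uptime")
# Both scenarios come out to the same ~99.726% uptime.
```

A monthly uptime graph alone can’t distinguish these two teams.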