GitHub experiences various partial outages/degradations

llama052 · 2026-02-02T22:02:47 1770069767

Looks like Azure as a platform just killed the ability for VM scale operations, due to a change on a storage account ACL that hosted VM extensions. Wow... We noticed when github actions went down, then our self hosted runners because we can't scale anymore.

Information

Active - Virtual Machines and dependent services - Service management issues in multiple regions

Impact statement: As early as 19:46 UTC on 2 February 2026, we are aware of an ongoing issue causing customers to receive error notifications when performing service management operations - such as create, delete, update, scaling, start, stop - for Virtual Machines (VMs) across multiple regions. These issues are also causing impact to services with dependencies on these service management operations - including Azure Arc Enabled Servers, Azure Batch, Azure DevOps, Azure Load Testing, and GitHub. For details on the latter, please see https://www.githubstatus.com.

Current status: We have determined that these issues were caused by a recent configuration change that affected public access to certain Microsoft‑managed storage accounts, used to host extension packages. We are actively working on mitigation, including updating configuration to restore relevant access permissions. We have applied this update in one region so far, and are assessing the extent to which this mitigates customer issues. Our next update will be provided by 22:30 UTC, approximately 60 minutes from now.

https://azure.status.microsoft/en-us/status

bob1029 · 2026-02-02T23:07:01 1770073621

They've always been terrible at VM ops. I never get weird quota limits and errors in other places. It's almost as if Amazon wants me to be a customer and Microsoft does not.

dgxyz · 2026-02-02T23:47:48 1770076068

Amazon isn't much better there. Wait until you hit an EC2 quota limit and can't get anyone to look at it quickly (even under paid enterprise support) or they say no.

Also had a few instance types which won't spin up in some regions/AZs recently. I assume this is capacity issues.

direwolf20 · 2026-02-03T12:15:45 1770120945

Quota limits are much less stupid than this

paulddraper · 2026-02-03T01:52:38 1770083558

The cloud isn’t some infinite thing.

There’s a bunch of hardware, and they can’t run more servers than they have hardware. I don’t see a way around that.

kavalg · 2026-02-03T06:16:25 1770099385

Indeed, but many people were led to believe so.

paulddraper · 2026-02-04T02:46:14 1770173174

I guess account limits would be surprising then :)

kavalg · 2026-02-04T05:52:31 1770184351

Perception changes with every next generation. For the last 8 years, I've been teaching at the Faculty of Mathematics and Informatics at our University. One of the courses that I lead is IoT, where students get to program on bare metal (embedded) systems. I noticed that newer (cloud) generations have a harder time accepting the constraints of embedded hardware and living with them.

ApolloFortyNine · 2026-02-03T04:38:51 1770093531

I was surprised hitting one of these limits once, but it wasn't as if they were 100% out of servers, just had to pick a different node type. I don't think they would ever post their numbers, but some of the more exotic types definitely have less in the pool.

theMMaI · 2026-02-03T08:50:17 1770108617

If you work at AWS in a technical role you can check the capacity of each pool in each AZ using an internal tool. Previously the main reason for pool exhaustion was automated jobs at the start of each working day as well as instance slotting issues (releasing a 4xl but only re-allocating a l means you now cannot slot another 4xl).
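A toy sketch of the slotting problem described above, not how AWS actually places instances. The `Host` class, the `SIZE_VCPUS` table, and the 32-vCPU host are all assumptions made up for illustration. It only shows why freeing a 4xl and then placing a smaller instance in the gap blocks the next 4xl.

```python
# Hypothetical vCPU footprint per size; illustrative only, not real slotting rules.
SIZE_VCPUS = {"large": 2, "xlarge": 4, "2xlarge": 8, "4xlarge": 16}


class Host:
    """A single physical host with a fixed vCPU budget (assumed model)."""

    def __init__(self, total_vcpus):
        self.total_vcpus = total_vcpus
        self.instances = []

    def free_vcpus(self):
        # Capacity left after subtracting every placed instance.
        return self.total_vcpus - sum(SIZE_VCPUS[s] for s in self.instances)

    def can_fit(self, size):
        return SIZE_VCPUS[size] <= self.free_vcpus()

    def allocate(self, size):
        if not self.can_fit(size):
            raise RuntimeError(f"insufficient capacity for {size}")
        self.instances.append(size)

    def release(self, size):
        self.instances.remove(size)


def demo():
    host = Host(total_vcpus=32)
    host.allocate("4xlarge")
    host.allocate("4xlarge")   # host is now full
    host.release("4xlarge")    # a 4xlarge-sized hole opens up
    host.allocate("large")     # a small instance lands in that hole
    # 14 vCPUs are free, but a 4xlarge needs 16, so it no longer fits.
    return host.free_vcpus(), host.can_fit("4xlarge")
```

Calling `demo()` leaves plenty of free vCPUs on paper while the large size is still unplaceable on that host, which is the fragmentation effect in miniature.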

jamesfinlayson · 2026-02-04T00:24:41 1770164681

Yeah heard of this happening once too - I think someone at work was trying to spin up a few of some really old instance type.

Imustaskforhelp · 2026-02-03T10:25:12 1770114312

Really prefer Hetzner in this sense because they actually talk about limits. I recently got myself a hetzner account (after shilling it for so much, hearing positivity, I felt like it was time for me to discover it)

I wanted to try out the cheapest option out of frugality & that was actually limited (but kudos to them that they mentioned that these servers have limits) so no worries I went and picked the 5.99 euro option instead of the 3.99 euro one.

They also have a limits page in the settings iirc, and it shows you all the limits that are imposed in a transparent manner. My account's young so I can't request limit increases yet, but after some time, one definitely can.

Essentially I love this idea because Cloud is just someone else's hardware and there is no infinity. But I feel as if it can come pretty close with hetzner (and I have heard some great things about OVH and have a good personal experience with netcup vps but netcup's payments were really PITA to setup)

direwolf20 · 2026-02-03T12:21:02 1770121262

Hetzner is a dedicated server (meaning monthly contract, 1 month setup fee and up to 1 week delivery time) company that branched out into cloud, so it's not that surprising they treat cloud a bit like that. While Amazon wants you to think they have an infinite capacity pool, and any failure to get a server is an unexpected error, Hetzner seems to not hide they have a finite number of servers in a finite number of racks, since that's how their main business works.

Imustaskforhelp · 2026-02-03T16:03:54 1770134634

I guess it's understandable now why Amazon might want to do this.

Similar to hetzner, I haven't used OVH but does it also have limits or how do they follow?

Out of pure curiosity, Is there anything aside from the three hyperscaler trifecta which doesn't show limits too?

direwolf20 · 2026-02-03T17:52:55 1770141175

Nobody really shows their global limits including Hetzner. Hetzner doesn't, like, call it a secret internal error when they run out of capacity of a type.

arcdigital · 2026-02-02T23:12:29 1770073949

Agreed...I've been waiting for months now to increase my quota for a specific Azure VM type by 20 cores. I get an email every two weeks saying my request is still backlogged because they don't have the physical hardware available. I haven't seen an issue like this with AWS before...

llama052 · 2026-02-02T23:23:46 1770074626

We've run into that issue as well, ended up having to move regions entirely because nothing was changing in the current region. I believe it was westus1 at the time. It's a ton of fun to migrate everything over!

That was years ago, wild to see they have the same issues.

direwolf20 · 2026-02-03T12:22:30 1770121350

Can someone explain the point of cloud like I'm a 60 year old grumpy Unix admin because you could just get a real server from another company by now. If the whole point is unlimited capacity but you don't have unlimited capacity and you're paying through the nose then why? Compliance?

briHass · 2026-02-03T14:01:30 1770127290

Compliance and tooling are a big part of it, but the places where the big public cloud providers shine is the PaaS offerings that you don't need to write yourself.

In Azure, for example, it's possible to use Entra as your Active Directory, along with the fine grained RBAC built in to the platform. On a host that just gives you VPS/DS, you have to run your own AD (and secondary backups). Likewise with things like webservers (IIS) and SQL Server, which both have PaaS offerings with SLAs and all the infra management tasks handled for you in an easily auditable way.

If you just need a few servers at the IaaS level, the big cloud platforms don't look like a great value. But, if you do a SOC2, for example, you're going to have to build all the documentation and observability/controls yourself.

jamesfinlayson · 2026-02-04T00:26:27 1770164787

At my day job, serverless stuff is great because in a small team with limited budget we don't need extra people to deal with patching, fail-overs etc.

PeterStuer · 2026-02-03T08:54:04 1770108844

Is your mental model they are running FCFS or priority allocation?

llama052 · 2026-02-02T23:20:57 1770074457

It's awful. Any other service in Azure that relies on the core systems seems to have issues trying to depend on it, I feel for those internal teams.

Ran into an issue upgrading an AKS cluster last week. It completely stalled and broke the entire cluster in a way where our hands were tied as we can't see the control plane at all...

I submitted a severity A ticket and 5 hours later I got told there was a known issue with the latest VM image that would create issues with the control plane, leaving any cluster that was updated in that window to essentially kill itself and require manual intervention. Did they notify anyone? Nope. Did they stop anyone from killing their own clusters? Nope.

It seems like every time I'm forced to touch the Azure environment I'm basically playing Russian roulette hoping that something's not broken on the backend.

lillecarl · 2026-02-03T06:48:18 1770101298

It's nice to buy responsibility when it's upheld, else you're just trading your money for the inability to fix things.

everfrustrated · 2026-02-03T01:53:41 1770083621

How is Azure still having faults that affect multiple regions? Clearly their region definition is bollocks.

ragall · 2026-02-03T04:20:35 1770092435

All 3 hyperscalers have vulnerabilities in their control planes: they're either a single point of failure, like AWS with us-east-1, or global, meaning that a faulty release can take it down entirely. They also take AZ resilience to mean that existing compute will continue to work as before, while allocation of new resources might fail in multi-AZ or multi-region ways.

It means that any service designed to survive a control plane outage must statically allocate its compute resources and have enough slack that it never relies on auto scaling. True for AWS/GCP/Azure.
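A rough sketch of what "statically allocate with enough slack" could mean in numbers. The function name, the 30% default headroom, and the requests-per-second framing are assumptions for illustration, not a recommendation from any provider. Integer arithmetic is used so the ceiling is exact.

```python
def static_fleet_size(peak_rps, rps_per_instance, headroom_percent=30):
    """Instances to provision up front so peak load plus headroom fits
    without any autoscaling call to the control plane.

    All parameters are hypothetical sizing inputs:
      peak_rps          expected peak requests per second
      rps_per_instance  sustained requests per second one instance handles
      headroom_percent  extra slack on top of peak, as a whole percent
    """
    if peak_rps < 0 or rps_per_instance <= 0 or headroom_percent < 0:
        raise ValueError("inputs must be non-negative, with rps_per_instance > 0")
    required = peak_rps * (100 + headroom_percent)
    per_instance = rps_per_instance * 100
    # Ceiling division without floats.
    return -(-required // per_instance)
```

For example, `static_fleet_size(1000, 100, 30)` gives 13 instances that stay running permanently, which is the cost of not depending on scale-out during an outage.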

tbrownaw · 2026-02-03T04:30:38 1770093038

> It means that any service designed to survive a control plane outage must statically allocate its compute resources and have enough slack that it never relies on auto scaling. True for AWS/GCP/Azure.

That sounds oddly similar to owning hardware.

ragall · 2026-02-03T05:06:50 1770095210

In a way. It means that you can get new capacity most of the time, but the transition windows where a service gets resized (or mutated in general) have to be minimised and carefully controlled by ops.

everfrustrated · 2026-02-03T04:50:18 1770094218

This outage talks about what appears to be a VM control plane failure (it mentions stop not working) across multiple regions.

AWS has never had this type of outage in 20 years. Yet Azure constantly had them.

This is a total failure of engineering and has nothing to do with capacity. Azure is a joke of a cloud.

mirashii · 2026-02-03T05:02:54 1770094974

AWS had an outage that blocked all EC2 operations just a few months ago: https://aws.amazon.com/message/101925/

jamesfinlayson · 2026-02-04T00:28:15 1770164895

Yeah I remember one maybe four years ago? Existing workloads were fine but I had to go and tell my marketing department to not do anything until it was sorted because auto-scaling was busted.

everfrustrated · 2026-02-03T06:11:17 1770099077

This was the largest AWS outage in a long long time and was still constrained to a single AWS region.

Which is my point.

The same fault on Azure would be a global (all-regions) fault.

ragall · 2026-02-03T05:04:49 1770095089

I do agree that Azure seems to be a lot worse: its control plane(s) seems to be much more centralized than the other two.

flykespice · 2026-02-02T23:56:17 1770076577

Their AI probably hallucinated the configuration change

guywithabike · 2026-02-02T22:59:10 1770073150

It's notable that they blame "our upstream provider" when it's quite literally the same company. I can't imagine GitHub engineers are very happy about the forced migration to Azure.

gscho · 2026-02-03T02:18:24 1770085104

Having worked there around 2020-2021 there were many folks not happy with being forced to use azure and being forced to build GitHub actions based on azure devops. Lots of AWS usage still existed at that time but these days u bet it’s mostly gone.

madeofpalk · 2026-02-02T23:43:37 1770075817

I would imagine the majority of Github engineers there currently joined post MS acquisition.

macintux · 2026-02-03T02:14:46 1770084886

That doesn't necessarily mean they're happy about Azure as a backend.

debo_ · 2026-02-03T03:19:35 1770088775

I've been a software "engineer" for over 20 years, and my personal experience is that software engineers are basically never happy.

tbrownaw · 2026-02-03T04:21:54 1770092514

> personal experience is that software engineers are basically never happy.

Being happy means:

- you don't feel the need to automate more manual tasks (you lack laziness)

- you don't feel the need to make your system faster (you lack impatience)

- you don't feel the need to make your system better (you lack hubris)

So basically, happiness is a Sin.

teej · 2026-02-03T04:06:46 1770091606

I’ve used AWS for almost 20 years and I can tell you it’s more stable than Azure

VirusNewbie · 2026-02-03T15:31:26 1770132686

I have zero doubts.

macintux · 2026-02-03T03:46:59 1770090419

True enough. The world is never as predictable as the computers we program, and the computers we program are never as predictable as we feel they should be.

VirusNewbie · 2026-02-03T03:59:05 1770091145

Plenty of happy engineers at the other cloud. :)

homebrewer · 2026-02-03T11:26:58 1770118018

I presume you mean the Oracle cloud?

direwolf20 · 2026-02-03T12:24:57 1770121497

Nobody is happy with Oracle anything! It has some users because it is free. It has paid users because Larry Ellison bribed the government. Nobody would choose it voluntarily.

VirusNewbie · 2026-02-03T15:31:56 1770132716

No, gcp. Was a happy customer for many years, now I work there.

kasey_junk · 2026-02-03T11:12:55 1770117175

A bunch less today than a year ago.

pydry · 2026-02-03T11:50:52 1770119452

Autonomy, decent pay, non toxic environment and non bullshit job.

It isn't actually all that much but most devs who have all of these I've come across are happy.

jamesfinlayson · 2026-02-04T00:30:26 1770165026

Agreed. I've had this more often than not, and while every job has its little gripes, if I have those things the rest is well, just part of the job.

tbrownaw · 2026-02-03T04:37:21 1770093441

> notable that they blame "our upstream provider" when it's quite literally the same company

As in why don't they mention Azure by name?

Or as in there shouldn't be isolated silos?

mrweasel · 2026-02-03T13:47:04 1770126424

A few years ago I talked to a developer advocate for Azure. I wanted to know why it took forever when you wanted a new public IP. My take was that it felt like they went out on the internet to look for an IP to purchase from a 3rd party. The answer I got was that due to the silos within Microsoft it might as well be a 3rd party supplier. The slowness is exactly because IPs are/were managed by another Microsoft entity, who views any interaction, even within the company, as hostile.

OJFord · 2026-02-03T10:53:40 1770116020

I get your point, but it just sounds a bit funny when it's an artefact of corporate structure that it's true.

Like imagine if AWS was composed of separate companies for different services - Fargate was an Heroku acquisition say - and then they all went down and blamed their 'upstream provider' because they can't work without say VPC or EC2 availability.

I think that's all GP meant, it just reads a bit funny, not that it's wrong.

elAhmo · 2026-02-03T13:34:12 1770125652

Yup, they didn't mention it by name, it was stated as "our upstream provider".

b00ty4breakfast · 2026-02-03T02:37:55 1770086275

something about antifreeze in the dogfood

fbnszb · 2026-02-02T22:11:58 1770070318

As an isolated event, this is not great, but when you see the stagnation (if not downwards trajectory) of GitHub as a whole, it‘s even worse in my opinion.

edit: Before someone says something. I do understand that the underlying issue is some issue with Azure.

estimator7292 · 2026-02-03T00:12:36 1770077556

It really doesn't even matter why it failed. Shifting blame on Azure doesn't change the fact that GitHub is becoming more and more unreliable.

I don't get how Microsoft views this level of service as acceptable.

Ronsenshi · 2026-02-03T02:16:26 1770084986

Doesn't seem like Microsoft managers care - it's not their core business, so any time anyone complains about issues with GitHub they probably think something along the line of "peasants whining again".

Must be nice to be a monopoly that has most of the businesses in the world as their hostages.

Aeolun · 2026-02-03T04:22:58 1770092578

At one point Gitlab seemed like it wanted to compete, but then they killed all the personal and SMB plans, and now they’re just out of the picture for a lot of people. Their team plan is more expensive than GH’s enterprise plan.

hirako2000 · 2026-02-03T07:07:33 1770102453

IPO and quarterly demand for profit.

Gitlab was generous first, to rise as a valid alternative to GitHub. They never got the community aspect right, perhaps aiming for profitability with a focus on the runner instances, which is how they make money.

With profitability, the IPO made sense.

GitHub probably had a different strategy: keep it generous, get the entire open source community, keep raising money and one day someone will buy us out for billions. Here we are. Microsoft's goal is to capture the community, and it works. It's sticky.

direwolf20 · 2026-02-03T12:26:01 1770121561

Codeberg is a nonprofit community project aiming to replicate that. You can use it today.

hirako2000 · 2026-02-03T23:48:52 1770162532

I've used it, it's great, more like what GitHub was meant to be.

There is Forgejo. I find it more stable, I self host that. It never suffered an outage in 2 years that I had it running and is faster than GitHub.

direwolf20 · 2026-02-03T23:56:44 1770163004

Codeberg is a public instance of the Forgejo software, which you can also host yourself.

shiroiuma · 2026-02-03T07:23:10 1770103390

Yes, but this also means that countless open-source projects are in what appears to be a precarious position. What if MS one day decides all this free hosting isn't worth it, and just cuts it off? There aren't really any alternatives I know of, except bad ol' Sourceforge I guess.

llama052 · 2026-02-02T22:15:01 1770070501

Sadly Github moving more into Azure will expose the fragility of the cloud platform as a whole. We've been working around these rough edges for years. Maybe it will make someone wake up, but I don't think they have any motivation to.

Imustaskforhelp · 2026-02-03T10:34:58 1770114898

I really like codeberg if your project is licensed in an Open license.

One of the reasons I still use github is that I have starred quite a lot of projects and had to make an account initially to star a project. (I used to have bookmarks beforehand but I wanted to support author in a minor way :] and also github being de-facto & I wanted to talk to some projects which had issues which I wanted to create/discuss)

Another minor point is that Github actions are more generous than Codeberg's actions equivalent.

I believe self-hosting the software behind Codeberg, i.e. Forgejo (which is a Gitea fork), or Gitea itself is actually easy. I once hosted them on my android phone using termux and on servers. Really liked the idea of having essentially github in my pocket.

For Gists [which is something that I like using a lot personally], I found the idea of opengists really interesting as well. One minor complaint with opengists is that I love the comment part of gists, which is an open issue in opengists but it's not implemented yet. Wish it could be implemented.

Regarding losing bookmarks, I actually have a custom tampermonkey script in a private gist which shows a star button which essentially moves my bookmarks to some gist in a json format so as to not lose them ever again essentially.

fbnszb · 2026-02-03T14:14:29 1770128069

Personally, I run my own Forgejo instance for the private repos I actually care about. But it's basically impossible to not have a GitHub account right now. I use "Refined GitHub" to make the UI somewhat usable.

cluckindan · 2026-02-02T22:19:05 1770070745

> Azure

Which is again even worse.

bandrami · 2026-02-03T02:00:11 1770084011

In the Bad Old Days before Github (before Sourceforge even) building and packaging sucked because of the hundred source tarballs you had to fetch, on any given day 3 would be down (this is why Debian does the "_orig" tarballs the way they do). Now it sucks because on any given day either all of them are available or none of them are.

fishgoesblub · 2026-02-02T23:33:22 1770075202

Getting the monthly GitHub outage out of the way early, good work.

herpdyderp · 2026-02-03T03:18:37 1770088717

Unfortunately that won’t clear up the weekly GitHub outages

jamesfinlayson · 2026-02-04T01:44:22 1770169462

What time zone are you in? In Australia I rarely have issues with GitHub (one in the last year maybe).

imglorp · 2026-02-03T20:56:22 1770152182

Monthly what now? Daily would be more accurate.

There were 25 incidents in January and 15 in December.

spooneybarger · 2026-02-02T23:49:14 1770076154

well played sir. well played.

booi · 2026-02-02T22:07:21 1770070041

Copilot being down probably increased code quality

maddmann · 2026-02-02T22:34:29 1770071669

This is why I come to hacker news. Sanity check on why my jobs are failing.

nialv7 · 2026-02-02T23:59:18 1770076758

better luck with your next job :)

bhouston · 2026-02-02T22:51:13 1770072673

Exactly same reason why I posted. My Github Actions jobs were not being picked up.

Zanfa · 2026-02-03T15:16:29 1770131789

Looks like Github Actions is having another bad day today as of an hour ago, but status page is not yet updated.

elcapitan · 2026-02-03T15:42:51 1770133371

Yep can confirm, waiting 10-15 minutes for actions to run

whh · 2026-02-03T15:46:27 1770133587

~20 minute delay so far from our perspective, looks to be increasing.

Their status page seems to think everything's A-OK.

elcapitan · 2026-02-03T16:07:53 1770134873

Copilot is probably waiting for a time slot to vibecode a fix as well :D

falloutx · 2026-02-02T22:25:34 1770071134

50% of code written by AI, now let the AI handle this outage.

anematode · 2026-02-02T22:33:48 1770071628

Catch-22, the AI runs on Azure...

maddmann · 2026-02-02T22:46:38 1770072398

Ai deploys itself to aws, saving GitHub but destroying Microsoft’s cloud business — full circle

Andrex · 2026-02-03T14:50:39 1770130239

"Whoever wins, we lose." - Poster for Aliens vs. Predator

toastal · 2026-02-03T07:11:43 1770102703

There’s never been a better time to migrate to another forge or at least have a self-hosted bare repository to handle outages.

Lwrless · 2026-02-03T08:10:25 1770106225

Recently my download speed from GitHub releases has decreased dramatically. But I'm sure they will be fixing that with Claude Code soon... Will they?

pluralmonad · 2026-02-03T14:23:53 1770128633

On what OS have you noticed this? Very in character for microsoft to artificially slow non-windows downloads. Then again, my apt upgrades on Debian have been dog slow lately...

Lwrless · 2026-02-03T14:44:26 1770129866

I was mostly on macOS. It seems to me that there's an issue with GitHub's CDN or routing.

Andrex · 2026-02-03T14:49:59 1770130199

That's surely a feature, not a bug.

suriya-ganesh · 2026-02-02T22:49:47 1770072587

It is always a config problem. Somewhere, someplace in the mess of permissioning issues.

olcarl75 · 2026-02-03T18:32:05 1770143525

ah well, with agentic coding relying more and more on worktrees, I think it's about time to revive my good and old SVN server

rvz · 2026-02-02T22:53:31 1770072811

Tay.ai and Zoe AI Agents probably running infra operations at GitHub and still arguing about how to deploy to production without hallucinating a config file and deploying a broken fix to address the issue.

Since there is no GitHub CEO, (Satya is not bothered anymore) and human employees not looking, Tay and Zoe are at the helm ruining GitHub with their broken AI generated fixes.

anematode · 2026-02-03T00:48:02 1770079682

Hey, let them cook.

deepsun · 2026-02-03T05:48:41 1770097721

Hey, does the stock go up or down?

levkk · 2026-02-02T23:05:23 1770073523

This happens routinely every other Monday or so.

locao · 2026-02-02T23:33:12 1770075192

I was going to joke "so, it's Monday, right?" but I thought my memory was playing tricks on me.

re-thc · 2026-02-02T23:04:03 1770073443

Jobs get stuck. Minutes are being consumed. The problem isn't just it being unavailable.

jmclnx · 2026-02-02T21:36:20 1770068180

With linkedin down, I wonder if this is an azure thing? IIRC github is being moved to azure, maybe the azure piece was partially enabled?

CubsFan1060 · 2026-02-02T22:02:17 1770069737

It is: https://azure.status.microsoft/en-us/status

"Impact statement: As early as 19:46 UTC on 2 February 2026, we are aware of an ongoing issue causing customers to receive error notifications when performing service management operations - such as create, delete, update, scaling, start, stop - for Virtual Machines (VMs) across multiple regions. These issues are also causing impact to services with dependencies on these service management operations - including Azure Arc Enabled Servers, Azure Batch, Azure DevOps, Azure Load Testing, and GitHub. For details on the latter, please see https://www.githubstatus.com."

focusgroup0 · 2026-02-02T23:54:26 1770076466

Will paid users be credited for the wasted Actions minutes?

jokoon · 2026-02-03T06:34:36 1770100476

Feels like acquiring GitHub was another way to hurt open source projects

direwolf20 · 2026-02-03T13:51:40 1770126700

Microsoft loves open source projects, as long as they help Microsoft make money.

DANmode · 2026-02-03T07:41:35 1770104495

How are they demonstrating that?

Or, if part of a future plan: how?

ChrisArchitect · 2026-02-02T21:54:59 1770069299

Some more earlier: https://news.ycombinator.com/item?id=46860544

WhereIsTheTruth · 2026-02-03T10:48:34 1770115714

If you are still using GitHub, you have failed

De-risk yourself from Microsoft

ares623 · 2026-02-03T10:47:21 1770115641

If you look at the history, they have as many incidents as there are days since the year started.