Cloudflare suffered a service outage on November 18, 2025. The outage was triggered by a bug in the generation logic for a Bot Management feature file, which affected many Cloudflare services.
Honestly, it annoys me how much the well has been poisoned around Rust, to the point that we’re even talking about the language here. There is so much focus on Rust that we’re not talking about how they literally couldn’t tell the difference between their software crashing in production and a DDoS attack.
They had no visibility into their runtime environment and, from my understanding of the blog post, didn’t even look into the possibility until the entire cluster went down from this bad config.
Like, even assuming they did input validation, what should the ClickHouse services do when they’re fed an invalid config? I’d argue the only sensible thing would be to refuse to start. But it seems like the crashing wasn’t being detected at all.
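To make the "refuse to start" idea concrete, here is a rough Python sketch. It is not how Cloudflare's code works; `MAX_FEATURES`, `InvalidConfig`, `validate_features` and `start` are all names I made up for illustration, and the limit value is an assumption. The point is only that a rejected config should produce a distinct exit code and a plain message, so monitoring can tell "bad config" apart from "under attack".

```python
import sys

MAX_FEATURES = 200  # hypothetical limit, chosen only for this example
EXIT_BAD_CONFIG = 78  # EX_CONFIG from sysexits.h


class InvalidConfig(Exception):
    """Raised when a feature file fails validation (hypothetical type)."""


def validate_features(features, limit=MAX_FEATURES):
    """Reject a feature list that is too long or contains duplicates."""
    if len(features) > limit:
        raise InvalidConfig(f"{len(features)} features exceeds limit of {limit}")
    if len(set(features)) != len(features):
        raise InvalidConfig("duplicate feature names")
    return features


def start(features):
    """Return a process exit code: 0 if the config is accepted, 78 if not."""
    try:
        validate_features(features)
    except InvalidConfig as err:
        # State the reason on stderr so the failure is attributable to config.
        print(f"refusing to start: invalid config: {err}", file=sys.stderr)
        return EXIT_BAD_CONFIG
    return 0
```

A real entry point would call something like `sys.exit(start(loaded_features))`. A service that is already running would more likely keep its last known good config than exit, but either way the rejection needs to be loud and clearly labelled.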