Saved February 14, 2026
The article discusses the inevitability of outages and the hidden dependencies in business architectures that rely on cloud services. It emphasizes the need for robust backup plans and testing strategies, like brownouts and Chaos Monkey, to prepare for potential failures. The author argues that businesses must recognize and address these risks to avoid being blindsided by downtime.
Cloudflare recently experienced a significant outage lasting about three hours, a reminder that no service can guarantee perfect uptime. Over the past six years, Cloudflare maintained an impressive uptime of 99.99995%, but even a record like that does not rule out failure. Businesses that rely heavily on such services can face unexpected downtime when a dependency fails, often without prior warning. The article stresses the dangers of "cascading" dependencies, where multiple layers of reliance on hyperscalers can lead to widespread failures, as seen in a notable fintech incident that caused severe financial losses for many families.
The author argues for the necessity of backup plans in any business architecture. This doesn't mean building an entirely separate infrastructure; simple strategies can suffice. For instance, a hospital can keep crucial patient records in an offline format. The piece highlights two methods to prepare for outages: intentional brownouts, which help organizations test their error handling regularly, and the Chaos Monkey approach from Netflix, where random failures are introduced to stress-test systems. The author suggests that internet service providers should be mandated to have scheduled outages to ensure businesses develop robust backup plans, preventing catastrophic failures when unexpected outages occur.
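The two testing methods above can be illustrated with a minimal sketch. It is not taken from the article or from Netflix's Chaos Monkey tooling; it only assumes that a dependency call can be wrapped so that failures are injected on purpose, either during a scheduled brownout window or at random. All names here (`with_fault_injection`, `InjectedOutage`, `fetch_from_cloud`, `read_offline_copy`) are hypothetical.

```python
import random
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class InjectedOutage(Exception):
    """Raised in place of the real call while a failure is being simulated."""


def with_fault_injection(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    *,
    brownout_active: Callable[[], bool],
    failure_rate: float = 0.0,
    rng: Optional[random.Random] = None,
) -> Callable[[], T]:
    """Wrap a dependency call so that the backup path is exercised regularly.

    brownout_active: returns True during a scheduled, intentional brownout.
    failure_rate: probability in [0, 1] of a random, chaos-style failure.
    """
    rng = rng or random.Random()

    def call() -> T:
        try:
            # Simulate an outage before touching the real dependency.
            if brownout_active() or rng.random() < failure_rate:
                raise InjectedOutage("simulated dependency outage")
            return primary()
        except Exception:
            # Real and injected failures both take the same backup path,
            # so the path that matters in a real outage is the one tested.
            return fallback()

    return call


# Hypothetical dependency and backup, echoing the hospital example above.
def fetch_from_cloud() -> str:
    return "record from cloud"


def read_offline_copy() -> str:
    return "record from offline copy"


if __name__ == "__main__":
    lookup = with_fault_injection(
        fetch_from_cloud,
        read_offline_copy,
        brownout_active=lambda: True,  # pretend a brownout is scheduled now
    )
    print(lookup())
```

In this sketch a brownout is just a predicate supplied by the caller, so the schedule (for example, ten minutes every month) stays a policy decision outside the wrapper, while `failure_rate` covers the random failures of the Chaos Monkey style.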
The article underscores the tendency of both individuals and organizations to underestimate risks associated with outages. Even smaller businesses, which may not need high reliability, can inadvertently create dependencies that impact critical services like hospitals and public utilities. The takeaway is clear: organizations need to acknowledge that failures are inevitable and prepare accordingly, not just for data but for their entire infrastructure.