Links
Cloudflare experienced significant network failures in November and December 2025, prompting the company to launch its "Code Orange: Fail Small" initiative. The plan aims to improve network resilience by implementing controlled rollouts for configuration changes, enhancing failure handling, and streamlining emergency response processes.
This article discusses recent cloud outages and their impact on businesses, emphasizing the importance of resilience in online services. It advocates a multi-vendor strategy to improve reliability and performance, so that platforms can handle unexpected disruptions without downtime.
This article discusses how the interconnectedness of cloud services creates vulnerabilities that can lead to significant outages, impacting many companies unexpectedly. It emphasizes the need for businesses to build resilience and prepare for potential failures rather than relying solely on regulatory measures.
AWS has introduced a feature that allows customers to make DNS changes within 60 minutes during service disruptions in its US East region. This response comes after repeated outages in the area, addressing the need for greater reliability, especially for businesses in regulated industries. However, the 60-minute recovery time still leaves room for significant service interruptions.