6 min read | Saved February 14, 2026
Do you care about this?
This article discusses an interview with Mai-Lan Tomsen Bukovec, VP of Data and Analytics at AWS, focusing on the engineering behind Amazon S3. Key topics include S3's scale, strong consistency, durability measures, and the use of formal methods to ensure system correctness.
If you do, here's more
Amazon S3 is a monumental system, capable of handling hundreds of millions of transactions per second and storing over 500 trillion objects. The engineering behind this scale is impressive; when S3 launched in 2006, it used eventual consistency, but the team managed a significant upgrade to strong consistency without compromising availability or costs. They achieved this through a replicated journal and a unique cache coherency protocol that tolerates failures while maintaining performance.
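The article doesn't spell out S3's actual cache coherency protocol, but the core idea behind serving strongly consistent reads from a cache can be sketched in a few lines: a cached entry is only trusted if it is at least as new as what an authoritative, ordered journal of updates says is current. Everything below (the `Journal` and `ConsistentCache` classes) is an illustrative toy, not S3's real design.

```python
# Illustrative sketch (NOT S3's actual implementation): a cache that stays
# strongly consistent by comparing sequence numbers against a replicated,
# totally ordered journal of metadata updates.

class Journal:
    """Toy stand-in for a replicated journal assigning a total order to writes."""
    def __init__(self):
        self.seq = 0
        self.latest = {}  # key -> (seq, value)

    def append(self, key, value):
        self.seq += 1
        self.latest[key] = (self.seq, value)


class ConsistentCache:
    """Serves a cached value only when the journal proves it is fresh."""
    def __init__(self, journal):
        self.journal = journal
        self.cache = {}  # key -> (seq, value)

    def read(self, key):
        journal_seq, journal_value = self.journal.latest.get(key, (0, None))
        cached = self.cache.get(key)
        if cached is not None and cached[0] >= journal_seq:
            return cached[1]  # provably not stale
        # Cache miss or stale entry: read through and repopulate.
        self.cache[key] = (journal_seq, journal_value)
        return journal_value


journal = Journal()
cache = ConsistentCache(journal)
journal.append("obj", "v1")
print(cache.read("obj"))   # v1 (populates the cache)
journal.append("obj", "v2")
print(cache.read("obj"))   # v2 (stale entry detected, never served)
```

The point of the sketch is that strong consistency doesn't force you to abandon caching: a cheap freshness check against the journal lets the cache absorb most reads while never returning a stale answer.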
The article highlights the shift to Rust for performance-critical components, reflecting a growing engineering focus on optimizing speed and reducing latency. S3 boasts an extraordinary durability claim of 11 nines (99.999999999%), backed by continuous auditing and automatic repair systems, which assume that server failures are a constant reality. Formal methods play a critical role in ensuring code correctness, particularly in areas like consistency and cross-region replication, where automated proofs verify that updates don't regress the system's reliability.
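It's worth doing the back-of-the-envelope arithmetic on that durability claim. Assuming 11 nines is an annual per-object durability figure (a common reading, though the article doesn't define it precisely), the expected number of objects lost per year across 500 trillion objects works out to roughly 5,000:

```python
# Back-of-the-envelope arithmetic for 11-nines durability.
# Assumption: 99.999999999% is annual per-object durability, so the chance
# of losing any single object in a year is at most 1e-11.

durability = 0.99999999999
annual_loss_prob = 1 - durability        # ~1e-11
objects = 500e12                         # 500 trillion objects
expected_losses_per_year = objects * annual_loss_prob
print(round(expected_losses_per_year))   # ~5000 objects in expectation
```

That tiny-sounding probability still leaves a nonzero expected loss at S3's scale, which is exactly why the continuous auditing and automatic repair described above matter: durability is actively maintained, not passively assumed.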
Correlated failures pose a significant risk, so S3's designers guard against them by replicating data across multiple availability zones and employing quorum-based algorithms. Interestingly, S3 operates with around 200 microservices, considerably fewer than systems like Uber, suggesting that effective service design can vary widely. The article also introduces S3 Vectors, a new data structure aimed at efficient searching in high-dimensional spaces, which employs precomputed neighborhoods to enhance query performance.
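The article doesn't detail which quorum scheme S3 uses, but the standard intersection argument is easy to demonstrate: with N replicas, writing to W of them and reading from R of them such that R + W > N guarantees every read quorum overlaps the latest write quorum, so at least one replica in any read set holds the newest value. A minimal sketch under those assumptions:

```python
# Minimal quorum read/write sketch (illustrative, not S3's protocol).
# With R + W > N, any read quorum must intersect any write quorum.

import random

N, W, R = 5, 3, 3  # R + W = 6 > 5, so quorums always intersect
replicas = [{"version": 0, "value": None} for _ in range(N)]

def write(value, version):
    # The write lands on any W replicas (the rest may be slow or down).
    for i in random.sample(range(N), W):
        replicas[i] = {"version": version, "value": value}

def read():
    # Read any R replicas and keep the highest-versioned answer.
    sampled = [replicas[i] for i in random.sample(range(N), R)]
    return max(sampled, key=lambda r: r["version"])["value"]

write("v1", version=1)
write("v2", version=2)
print(read())  # v2 -- every read quorum sees the latest write
```

Because the read quorum is guaranteed to contain at least one replica from the most recent write quorum, picking the highest-versioned response always returns the latest value, even though no single replica is required to be up for every operation.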