Saved February 14, 2026
Do you care about this?
Stripe's new data movement system performs fast, large-scale database migrations without downtime while the databases keep serving millions of queries per second. The process runs through phases such as data import, replication, and validation, and it keeps a safe rollback path open during the migration. This approach is crucial for maintaining transaction integrity and customer satisfaction.
If you do, here's more
At QCon San Francisco 2025, Jimmy Morzaria from Stripe detailed their Zero-Downtime Data Movement Platform, designed for seamless database migrations at petabyte scale. This system can handle 5 million queries per second across more than 2,000 MongoDB shards, maintaining an impressive 99.9995% reliability for $1.4 trillion in annual transactions. The migration process is structured in six phases, focusing on data consistency, minimal impact on live queries, and support for various shard sizes.
The migration begins with registering target shards and key ranges, which sets the stage for data transfer. The bulk data import phase uses an optimized service that boosts performance tenfold by reordering inserts to align with MongoDB's storage engine. After the primary dataset has been transferred, an asynchronous replication service keeps source and target shards synchronized, which allows a rollback if issues arise. Validation checks confirm data integrity before the critical traffic switch, which uses a "versioned gating" method to move traffic from the source to the target database in a window that ranges from milliseconds to two seconds.
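The "versioned gating" idea can be illustrated with a small in-memory sketch. This is not Stripe's implementation, and the summary does not describe its internals. Everything here is an assumption made for illustration: the `Shard`, `RoutingTable`, and `Proxy` classes, the integer version token, and the rule that a shard rejects any request whose version does not match the one it currently accepts. The sketch only shows how bumping a version can cut traffic over without losing a write.

```python
from dataclasses import dataclass, field
from typing import Optional


class StaleVersionError(Exception):
    """Raised when a request carries a routing version the shard no longer accepts."""


@dataclass
class Shard:
    # Hypothetical stand-in for a database shard; data lives in a dict.
    name: str
    accepted_version: Optional[int]  # None: shard accepts no traffic for this key range
    data: dict = field(default_factory=dict)

    def write(self, version: int, key: str, value: object) -> None:
        # The gate: only requests tagged with the accepted version get through.
        if version != self.accepted_version:
            raise StaleVersionError(f"{self.name} rejected version {version}")
        self.data[key] = value


@dataclass
class RoutingTable:
    # Hypothetical source of truth for which shard owns the key range.
    version: int
    owner: Shard


class Proxy:
    """Caches the route and tags every request with the cached version."""

    def __init__(self, table: RoutingTable) -> None:
        self.table = table
        self.refresh()

    def refresh(self) -> None:
        self.cached_version = self.table.version
        self.cached_owner = self.table.owner

    def write(self, key: str, value: object) -> None:
        try:
            self.cached_owner.write(self.cached_version, key, value)
        except StaleVersionError:
            # The route changed underneath us: reload it and retry once.
            self.refresh()
            self.cached_owner.write(self.cached_version, key, value)


def switch_traffic(table: RoutingTable, source: Shard, target: Shard) -> None:
    new_version = table.version + 1
    source.accepted_version = None    # 1. gate the source so stale requests fail fast
    target.data.update(source.data)   # 2. stand-in for replication catching up
    target.accepted_version = new_version
    table.version = new_version       # 3. publish the new route
    table.owner = target


# Usage: a write before the switch lands on the source, a write after it
# is rejected by the source, retried, and lands on the target.
source = Shard("source", accepted_version=1)
target = Shard("target", accepted_version=None)
table = RoutingTable(version=1, owner=source)
proxy = Proxy(table)
proxy.write("charge_1", "ok")
switch_traffic(table, source, target)
proxy.write("charge_2", "ok")
```

In this sketch the proxy never needs to be told about the switch in advance. Its stale version is rejected by the gated source, and the retry picks up the new route, so the cutover window is bounded by how long steps 1 through 3 take.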
Finally, after the migration is complete, the deregistration step cleans up metadata and infrastructure. Stripe developed this platform internally to meet specific security and performance needs, as managed services couldn't provide the desired control. As their shards grew to tens of terabytes, managing data movement became essential. Morzaria pointed out that transaction abandonment rates after payment denials reach 40%, highlighting the necessity of zero-downtime migrations for maintaining customer trust and operational efficiency.