5 min read
|
Saved February 14, 2026
|
Do you care about this?
This article details Salesforce's successful migration of its Marketing Cloud caching infrastructure from Memcached to Redis. The team achieved this transition without downtime, handling 1.5 million cache events per second while ensuring application performance and data integrity. It highlights the challenges faced and strategies employed during the migration process.
If you do, here's more
Salesforce's Marketing Cloud Caching team, led by Paladi Sandhya Madhuri and others, migrated its caching infrastructure from Memcached to Redis Cluster while sustaining 1.5 million cache events per second, with no downtime. The existing Memcached system lacked native replication and user authentication, so when a node failed, latency rose and load on the backing databases increased. Redis Cluster addressed both gaps with primary-replica replication and built-in security features, and the move also aligned with broader modernization efforts within Salesforce's infrastructure.
The team's zero-downtime strategy required meticulous planning and execution under live production traffic. They implemented a Dynamic Cache Router to facilitate traffic shifts between Memcached and Redis, used double-writes during warm-up to ensure key availability, and applied service grouping to prevent inconsistencies. These strategies helped maintain stable cache hit rates and predictable latency throughout the migration. To address differences in Time-to-Live (TTL) semantics and key-handling behaviors between the two systems, the team developed a compatibility layer that ensured Redis functioned seamlessly with existing applications.
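The routing and double-write ideas above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Salesforce's implementation: the class and method names (`InMemoryBackend`, `DynamicCacheRouter`, `start_warmup`, `cut_over`) are hypothetical, and plain dictionaries stand in for real Memcached and Redis clients. Routing is decided per service group, so every service in a group reads from the same backend.

```python
from typing import Dict, Optional, Set


class InMemoryBackend:
    """Hypothetical stand-in for a Memcached or Redis client."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    def set(self, key: str, value: str) -> None:
        self.store[key] = value


class DynamicCacheRouter:
    """Routes cache traffic per service group (illustrative sketch only)."""

    def __init__(self, old: InMemoryBackend, new: InMemoryBackend) -> None:
        self.old = old
        self.new = new
        self.double_write_groups: Set[str] = set()  # groups in warm-up
        self.cut_over_groups: Set[str] = set()      # groups reading from new

    def start_warmup(self, group: str) -> None:
        # From now on, writes for this group land in both backends.
        self.double_write_groups.add(group)

    def cut_over(self, group: str) -> None:
        # Reads for the whole group move together to the new backend.
        self.cut_over_groups.add(group)

    def _read_backend(self, group: str) -> InMemoryBackend:
        return self.new if group in self.cut_over_groups else self.old

    def get(self, group: str, key: str) -> Optional[str]:
        return self._read_backend(group).get(key)

    def set(self, group: str, key: str, value: str) -> None:
        primary = self._read_backend(group)
        primary.set(key, value)
        if group in self.double_write_groups:
            # Mirror the write so the other backend stays warm.
            secondary = self.new if primary is self.old else self.old
            secondary.set(key, value)
```

In this sketch, a key written before warm-up begins exists only in the old backend and would miss after cut-over, which is why the double-write window needs to cover the keys' working set before reads are shifted.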
Handling hot keys was another challenge: in Redis Cluster's single-threaded shard model, each key lives on one shard, so a key that receives excessive traffic can bottleneck that shard. The team had previously built hot-key detection tools for Memcached and extended this capability to Redis. They implemented a probabilistic model to track key access frequencies and refactored application code to use hybrid caching patterns. This approach, which included local in-memory caches and read replicas, relieved pressure on Redis shards during peak traffic and kept performance stable across the migration.
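The article does not say which probabilistic model the team used. As one plausible illustration, the sketch below uses a Count-Min Sketch, a standard structure for approximate frequency counting in fixed memory. The class names, the table dimensions, and the threshold are all assumptions made for this example.

```python
import hashlib
from typing import Iterator, List, Set, Tuple


class CountMinSketch:
    """Approximate per-key counters; estimates never undercount."""

    def __init__(self, width: int = 1024, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.table: List[List[int]] = [[0] * width for _ in range(depth)]

    def _cells(self, key: str) -> Iterator[Tuple[int, int]]:
        # One independent-looking hash per row, derived by prefixing the row.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{key}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, key: str) -> int:
        """Record one access and return the updated estimate."""
        for row, col in self._cells(key):
            self.table[row][col] += 1
        return self.estimate(key)

    def estimate(self, key: str) -> int:
        # The minimum across rows limits the effect of hash collisions.
        return min(self.table[row][col] for row, col in self._cells(key))


class HotKeyDetector:
    """Flags keys whose estimated access count reaches a threshold."""

    def __init__(self, threshold: int) -> None:
        self.sketch = CountMinSketch()
        self.threshold = threshold  # hypothetical tuning value
        self.hot: Set[str] = set()

    def record(self, key: str) -> bool:
        if self.sketch.add(key) >= self.threshold:
            self.hot.add(key)
        return key in self.hot
```

A caller could consult `record()` on each cache read and, for keys it flags, serve from a local in-memory cache or a read replica instead of the primary shard. A production detector would also decay or reset counts per time window, which this sketch omits.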