Saved February 14, 2026
This article details the process of migrating K8ssandra clusters from old EKS clusters to new ones without downtime or data loss. It outlines the challenges faced and the custom workflow developed to ensure a smooth transition, including handling data replication and cluster configuration.
Migrating stateful workloads across Kubernetes clusters, particularly distributed databases like Cassandra, presents significant challenges, especially when aiming for zero downtime and no data loss. The authors describe their experience moving K8ssandra clusters from outdated EKS clusters to new ones. While the K8ssandra operator facilitates management of Cassandra data centers, it lacks a built-in method for migrating entire clusters, so the team had to build a custom migration workflow.
Initially, the team deployed a new K8ssandra operator on the new EKS clusters and scaled down the old operators to prevent conflicts. They ran into problems when connecting the new and old clusters, chiefly because the superuser role was absent from the new data center's system_auth keyspace. To resolve this, they modified the keyspace on an old node and ran a repair operation to synchronize data. Rebuilding data onto the new nodes was complicated by the volume of data, so they used an internal tool built on Temporal workflows to manage the rebuild process safely and efficiently.
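The two steps above can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: it assumes that "modifying the keyspace" means extending the system_auth replication settings to cover the new data center, and it replaces the Temporal-based internal tool with a plain batching helper. The data center names, pod names, and replication factors are hypothetical.

```python
from typing import Dict, List, Tuple


def alter_keyspace_cql(keyspace: str, dc_replication: Dict[str, int]) -> str:
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    dc_replication maps data center name -> replication factor.
    """
    if not dc_replication:
        raise ValueError("at least one data center is required")
    options = ", ".join(
        f"'{dc}': {rf}" for dc, rf in sorted(dc_replication.items())
    )
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


def rebuild_batches(
    pods: List[str], source_dc: str, batch_size: int = 1
) -> List[List[Tuple[str, str]]]:
    """Pair each new node with its rebuild command and group them in batches.

    Running one small batch at a time limits how many nodes stream data
    from the source data center concurrently.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # `nodetool rebuild -- <dc>` streams data from the named source data center.
    steps = [(pod, f"nodetool rebuild -- {source_dc}") for pod in pods]
    return [steps[i:i + batch_size] for i in range(0, len(steps), batch_size)]


if __name__ == "__main__":
    # Hypothetical names: "dc-old" is the existing data center, "dc-new" the new one.
    print(alter_keyspace_cql("system_auth", {"dc-old": 3, "dc-new": 3}))
    for batch in rebuild_batches(["new-0", "new-1", "new-2"], "dc-old", batch_size=2):
        print(batch)
```

After a statement like this is applied, a repair of system_auth (as the article describes) brings the role data to the new replicas. The batching helper only plans the order of work; a real orchestrator would also need to wait for each rebuild to finish and handle retries.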
Once the first new data center was operational, adding a second data center was straightforward and followed K8ssandra's documentation. After confirming that traffic had fully transitioned to the new infrastructure, the team updated connection strings and removed unnecessary configurations. They also had to update replication settings to exclude the old data centers before finally removing them from the cluster. The migration underscored the need for careful planning and automation when dealing with stateful systems, in contrast to the more straightforward approach typically used for stateless workloads.
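The replication update before decommissioning could look roughly like the sketch below. It is an assumption-laden illustration rather than the team's actual procedure: the keyspace names, data center names, and replication factors are hypothetical, and the helper only generates CQL text for review instead of executing anything against a cluster.

```python
from typing import Dict, Iterable, List


def without_datacenters(
    replication: Dict[str, int], retired: Iterable[str]
) -> Dict[str, int]:
    """Return the replication map with the retired data centers removed."""
    retired_set = set(retired)
    remaining = {dc: rf for dc, rf in replication.items() if dc not in retired_set}
    if not remaining:
        # Refuse to produce a keyspace definition with no replicas at all.
        raise ValueError("removing these data centers would leave no replicas")
    return remaining


def exclusion_statements(
    keyspaces: Dict[str, Dict[str, int]], retired: Iterable[str]
) -> List[str]:
    """Build one ALTER KEYSPACE statement per keyspace, excluding retired DCs."""
    retired_list = list(retired)
    statements = []
    for name in sorted(keyspaces):
        remaining = without_datacenters(keyspaces[name], retired_list)
        options = ", ".join(f"'{dc}': {rf}" for dc, rf in sorted(remaining.items()))
        statements.append(
            f"ALTER KEYSPACE {name} WITH replication = "
            f"{{'class': 'NetworkTopologyStrategy', {options}}};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical layout: one old data center and two new ones.
    current = {
        "system_auth": {"dc-old": 3, "dc-new-1": 3, "dc-new-2": 3},
        "app_data": {"dc-old": 3, "dc-new-1": 3, "dc-new-2": 3},
    }
    for statement in exclusion_statements(current, ["dc-old"]):
        print(statement)
```

Generating the statements up front makes it easy to review every keyspace before any old data center is removed, which matches the ordering the article describes: replication first, removal last.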