Saved February 14, 2026
Do you care about this?
Kubernetes v1.35 adds a new alpha feature called Restart All Containers, which restarts every container in a Pod in place instead of deleting and recreating the Pod. This is particularly useful for complex applications with inter-container dependencies, and it helps reduce wasted resources in AI/ML workloads.
If you do, here's more
Kubernetes v1.35 introduces Restart All Containers, an alpha feature that restarts all containers within a Pod in place. Rather than deleting and recreating the Pod, which can be slow and resource-intensive, the feature resets the Pod's state where it is already running. It is particularly useful for AI and machine learning workloads, where recovering from failures across thousands of nodes can waste substantial resources, potentially more than $100,000 per month.
The new action, RestartAllContainers, is triggered when a container exits in a way that matches a configured restart rule, such as a specific exit code. The result is a fast in-place restart of every container in the Pod that preserves key resources like the Pod's UID, IP address, and attached volumes. Because the Pod is never rescheduled, the overhead of scheduling and recreating it disappears, and recovery is smoother in scenarios where multiple containers depend on one another. For instance, if the environment prepared by an init container becomes corrupted, restarting only the main application isn't sufficient; the entire initialization process must be re-executed.
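To make the trigger concrete, here is a minimal sketch of a Pod manifest that attaches a RestartAllContainers rule to the main container. It assumes the rule is expressed through the container-level `restartPolicyRules` field with an `exitCodes` matcher, as in the container restart rules API; check the field names against the v1.35 API reference before relying on them. The Pod name, container names, images, and exit code 88 are hypothetical placeholders.

```python
import json

# Hypothetical exit code the application uses to request a full in-place restart.
RESTART_ALL_EXIT_CODE = 88


def build_pod_manifest(name: str) -> dict:
    """Build a Pod manifest whose main container can trigger RestartAllContainers."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            # Init containers are expected to run again on an in-place restart.
            "initContainers": [
                {"name": "setup-environment", "image": "example.com/setup:1.0"},
            ],
            "containers": [
                {
                    "name": "main-application",
                    "image": "example.com/app:1.0",
                    # A container that declares rules sets its own restartPolicy explicitly.
                    "restartPolicy": "Never",
                    "restartPolicyRules": [
                        {
                            "action": "RestartAllContainers",
                            "exitCodes": {
                                "operator": "In",
                                "values": [RESTART_ALL_EXIT_CODE],
                            },
                        }
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_pod_manifest("ml-worker-pod"), indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` would create the Pod on a cluster where the feature is enabled. Any other exit code falls through to the container's own restart policy.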
Two use cases show how this feature streamlines operations in complex environments. For machine learning jobs, it speeds up recovery by recreating only the “bad” Pods while triggering in-place restarts for the healthy ones, which reduces recovery times from minutes to seconds. It can also rerun init containers when a failure corrupts shared resources, so the application starts again from a clean state. To use this feature, enable the RestartAllContainersOnContainerExits feature gate on a cluster running Kubernetes v1.35 or later; existing applications can then adopt it with minimal changes.
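Feature gates are commonly passed to Kubernetes components such as kube-apiserver and the kubelet through the `--feature-gates` flag. How you set that flag depends on how your cluster is provisioned, so the helper below is only a sketch that renders the flag value. The function name is hypothetical.

```python
def feature_gates_flag(gates: dict[str, bool]) -> str:
    """Render a --feature-gates flag from a mapping of gate name to enabled state."""
    pairs = ",".join(
        f"{name}={str(enabled).lower()}" for name, enabled in sorted(gates.items())
    )
    return f"--feature-gates={pairs}"


if __name__ == "__main__":
    print(feature_gates_flag({"RestartAllContainersOnContainerExits": True}))
    # prints: --feature-gates=RestartAllContainersOnContainerExits=true
```

Since the feature is alpha, it stays off unless the gate is set explicitly.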