6 min read
Saved February 14, 2026
Do you care about this?
Grab adopted Docker image lazy loading to cut container startup times. Using eStargz and SOCI, the team reduced image pull times, which led to faster scaling and a better user experience on its data platforms.
If you do, here's more
Grab tackled slow container startup by implementing Docker image lazy loading with eStargz and Seekable OCI (SOCI). The original problem stemmed from large container images for services like Airflow and Spark Connect, which took several minutes to download on fresh nodes. Testing showed significant improvements in image pull times, particularly on nodes with no cached image, which shortened overall startup. SOCI also kept application startup quick, whereas eStargz led to longer application startup delays.
In production, SOCI lazy loading improved startup times by 30-40% for Airflow and Spark Connect. This enhancement allowed Grab to manage traffic spikes more effectively and improved auto-scaling performance. The P95 startup time metric combined both image download and application startup times, providing a comprehensive view of system performance. Fine-tuning the SOCI configuration based on AWS recommendations proved essential, cutting image download times on fresh nodes from 60 seconds to 24 seconds, a 60% reduction.
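As a rough check on the arithmetic, the short Python sketch below recomputes the 60% figure from the two download times quoted above and shows one way a combined P95 startup metric could be calculated. The function names and the sample lists are hypothetical, and the nearest-rank percentile is an assumption; the article does not say how Grab computes its P95.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100 * (before - after) / before


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    # ceil(0.95 * n), done in integer arithmetic to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def startup_times(download_seconds, app_start_seconds):
    """Total startup per container: image download plus application start."""
    return [d + a for d, a in zip(download_seconds, app_start_seconds)]


# Figures quoted in the summary: 60 s before tuning, 24 s after.
print(percent_reduction(60, 24))

# Made-up samples for illustration only (not Grab's measurements).
downloads = [24, 26, 25, 30, 60]
app_starts = [10, 12, 11, 10, 15]
print(p95(startup_times(downloads, app_starts)))
```

Summing the two phases before taking the percentile matters: the P95 of the totals is generally not the sum of the per-phase P95 values.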
The article explains how Docker lazy loading works: a remote snapshotter replaces the traditional flow, in which all image data must be present before the container starts. With a remote snapshotter, only the data the container actually needs is fetched, on demand. eStargz and SOCI are the two main formats that enable this. eStargz compresses files individually so that each one can be accessed at random, which is crucial for efficient loading. Overall, Grab's approach not only optimized startup times but also provided valuable insights into container management.
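To make the on-demand idea concrete, here is a toy Python model of per-file compression with random access. It is a sketch under stated assumptions, not the eStargz format or any snapshotter API: `build_blob`, `LazyLayer`, and the in-memory "remote" blob are all hypothetical stand-ins, and a byte slice plays the role of a ranged download from a registry.

```python
import gzip


def build_blob(files):
    """Compress each file separately and record its (offset, size) in the blob."""
    blob = bytearray()
    toc = {}  # table of contents: file name -> (offset, compressed size)
    for name, data in files.items():
        member = gzip.compress(data)
        toc[name] = (len(blob), len(member))
        blob += member
    return bytes(blob), toc


class LazyLayer:
    """Toy lazily loaded layer: a file is fetched only when first read."""

    def __init__(self, blob, toc):
        self._remote = blob  # stands in for a blob stored in a registry
        self._toc = toc
        self._cache = {}
        self.bytes_fetched = 0

    def read(self, name):
        if name not in self._cache:
            offset, size = self._toc[name]
            # A slice stands in for a ranged fetch of just this file's bytes.
            chunk = self._remote[offset:offset + size]
            self.bytes_fetched += size
            self._cache[name] = gzip.decompress(chunk)
        return self._cache[name]


# Illustrative file names and contents only.
files = {"bin/app": b"x" * 1000, "lib/big.so": bytes(range(256)) * 400}
blob, toc = build_blob(files)
layer = LazyLayer(blob, toc)
layer.read("bin/app")  # only this file's compressed bytes are fetched
print(layer.bytes_fetched, "of", len(blob), "bytes fetched")
```

Because each file is compressed on its own, it can be decompressed without touching its neighbours. A single gzip stream over the whole layer would not allow that, which is why random access depends on the per-file layout.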