9 min read | Saved February 14, 2026
Do you care about this?
This article outlines various strategies to optimize Apache Spark performance, focusing on issues like straggler tasks, data skew, and resource allocation. It emphasizes the importance of strategic repartitioning, dynamic resource scaling, and adaptive query execution to enhance job efficiency and reduce bottlenecks.
If you do, here's more
Apache Spark optimizations often focus on tackling performance bottlenecks such as straggler tasks caused by data skew. In one case, a single task took 45 minutes to complete while the other 4,799 finished in under two minutes. The solution was strategic repartitioning: the upstream process was changed to write 3,200 smaller files instead of 400 larger ones. The resulting finer-grained tasks fit comfortably within executor memory, effectively eliminating disk spills. A similar fix for another problematic file involved repartitioning it into 800 smaller files, which improved read parallelism and sped up subsequent joins.
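The sizing logic behind that kind of repartitioning can be sketched as simple arithmetic; the 128 MB per-partition target and the total input size below are illustrative assumptions, not figures from the article:

```python
import math

def file_count_for(total_bytes: int, target_partition_bytes: int) -> int:
    """Smallest file count such that each file stays at or below the target size."""
    return max(1, math.ceil(total_bytes / target_partition_bytes))

# Hypothetical input: ~400 GB total, targeting ~128 MB per file so each
# task's partition fits comfortably in executor memory
print(file_count_for(400 * 1024**3, 128 * 1024**2))  # 3200
```

The point of the calculation is that the file count follows from a per-partition memory budget, rather than being picked arbitrarily.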
Configuring Spark to handle the larger number of smaller tasks required careful tuning. Dynamic resource scaling let the Spark driver request executors as they were needed. Key settings included enabling dynamic allocation and adjusting the number of cores per executor to balance parallelism against memory usage. For heavy workloads, this setup provided up to 3,000 concurrent task slots. Maintaining high parallelism throughout the job was critical: the shuffle-partition count had to stay high so the number of tasks did not collapse during shuffle stages.
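A setup along these lines can be sketched in `spark-defaults.conf` form. The executor limit, core count, and shuffle-partition value below are illustrative assumptions chosen so that 600 executors × 5 cores yields the roughly 3,000 task slots mentioned above; they are not the article's actual settings:

```properties
# Let the driver grow and shrink the executor pool on demand
spark.dynamicAllocation.enabled                  true
spark.dynamicAllocation.shuffleTracking.enabled  true
spark.dynamicAllocation.minExecutors             2
spark.dynamicAllocation.maxExecutors             600

# 5 cores per executor: 600 executors x 5 cores = 3,000 concurrent task slots
spark.executor.cores                             5

# Keep shuffle stages as fine-grained as the input (e.g. 3,200 tasks)
spark.sql.shuffle.partitions                     3200
```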
Understanding resource utilization is essential for optimization. Each Spark application relies on balancing CPU, memory, and I/O. Performance tuning involves maximizing parallelism and minimizing data transfer. This includes configuring executors for optimal resource allocation and utilizing Adaptive Query Execution (AQE) to address issues like skewed joins. By detecting skewed partitions and splitting them for more even processing, AQE helps prevent bottlenecks that slow down the entire job. The process of fine-tuning Spark involves analyzing performance metrics and adjusting configurations accordingly.
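AQE's skew handling is driven by a handful of settings; a minimal sketch, assuming Spark 3.x where adaptive execution is available (the factor and threshold shown are Spark's documented defaults, included only for illustration):

```properties
# Re-optimize query plans at runtime using shuffle statistics
spark.sql.adaptive.enabled                                  true

# Detect skewed shuffle partitions and split them into smaller ones
spark.sql.adaptive.skewJoin.enabled                         true
spark.sql.adaptive.skewJoin.skewedPartitionFactor           5
spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes 256m
```

With these enabled, a partition is treated as skewed when it is both several times larger than the median partition and above the byte threshold, and Spark splits it so the join work is spread more evenly across tasks.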