Saved February 14, 2026
This article explains how Apache Hudi's Non-Blocking Concurrency Control (NBCC) improves write throughput in data lakehouses by letting concurrent writers append data without conflicting with one another. It contrasts NBCC with Optimistic Concurrency Control (OCC), whose abort-and-retry behavior becomes costly in high-frequency streaming scenarios, and it covers how to configure NBCC in your data pipelines.
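For the configuration step mentioned in the summary, the following is a minimal sketch rather than a verified recipe. It collects, as a plain Python dictionary, the writer options that NBCC is commonly described as needing in Hudi 1.x: a merge-on-read table, the non-blocking concurrency mode, a simple bucket index, and lazy cleaning of failed writes. The option names are given as best understood and should be checked against the configuration reference for your Hudi version; the spelling of the table-type option in particular differs between engines. The helper `missing_nbcc_prerequisites` is hypothetical and exists only for illustration.

```python
# Assumed option set for NBCC; verify every key and value against the
# configuration reference of the Hudi version you run.
NBCC_WRITER_OPTIONS = {
    "hoodie.table.type": "MERGE_ON_READ",
    "hoodie.write.concurrency.mode": "NON_BLOCKING_CONCURRENCY_CONTROL",
    "hoodie.index.type": "BUCKET",
    "hoodie.index.bucket.engine": "SIMPLE",
    "hoodie.cleaner.policy.failed.writes": "LAZY",
}
# A lock provider is also needed for multi-writer setups. It is left out
# here on purpose because the choice is deployment-specific.


def missing_nbcc_prerequisites(options):
    """Hypothetical helper: list keys whose values differ from the sketch above."""
    return sorted(
        key for key, value in NBCC_WRITER_OPTIONS.items()
        if options.get(key) != value
    )


if __name__ == "__main__":
    candidate = {"hoodie.index.type": "BUCKET"}
    for key in missing_nbcc_prerequisites(candidate):
        print("missing or different:", key)
```

Keeping the options in one dictionary makes it easy to pass the same settings to every writer that shares the table, which matters because all concurrent writers need to agree on the concurrency mode.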
Apache Hudi's Non-Blocking Concurrency Control (NBCC) addresses the performance problems that Optimistic Concurrency Control (OCC) causes in data lakehouse environments. OCC assumes that conflicts between concurrent transactions are rare, so each writer proceeds without coordination and checks for conflicts only at commit time. That assumption often fails in modern setups where streaming ingestion and maintenance jobs run simultaneously. When frequent streaming writes overlap with long-running batch jobs, conflicts become likely, and a writer that loses the conflict check must discard its work and retry. The resulting retry cycle drains compute resources and significantly reduces write throughput.
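To make the retry cycle concrete, here is a toy model of commit-time validation. It is not Hudi code. It assumes a streaming writer that commits at a fixed interval while cycling over file groups, and a batch job that aborts whenever another commit touched one of its file groups while it was running. The names `Commit`, `streaming_commits`, and `run_batch_with_occ` are made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    completed_at: int        # tick at which the commit became visible
    file_groups: frozenset   # file groups the commit wrote to


def streaming_commits(interval, horizon, num_groups):
    """Streaming writer: one commit every `interval` ticks, cycling over file groups."""
    return [
        Commit(t, frozenset({(t // interval) % num_groups}))
        for t in range(interval, horizon + 1, interval)
    ]


def run_batch_with_occ(start, duration, groups, others, max_attempts):
    """Return (attempts, wasted_ticks, succeeded) for a batch job validated at commit time."""
    wasted = 0
    for attempt in range(1, max_attempts + 1):
        end = start + duration
        # Conflict: another commit landed on one of our file groups while we ran.
        conflict = any(
            start < c.completed_at <= end and c.file_groups & groups
            for c in others
        )
        if not conflict:
            return attempt, wasted, True
        wasted += duration   # the whole attempt is thrown away
        start = end          # retry immediately
    return max_attempts, wasted, False


if __name__ == "__main__":
    commits = streaming_commits(interval=5, horizon=200, num_groups=4)
    print(run_batch_with_occ(0, 30, frozenset({0}), commits, max_attempts=3))
    print(run_batch_with_occ(0, 10, frozenset({0}), commits, max_attempts=3))
```

In this model the streaming writer returns to file group 0 every 20 ticks, so a 30-tick batch job on that group can never finish without a conflict, while a 10-tick job succeeds on its first attempt. The longer the transaction, the more work each retry discards.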
NBCC lets concurrent writers append updates to log files without blocking one another. Each writer produces its own log file, and Hudi orders the writes by their completion time when it processes them. Because no two writers modify the same file, there is no write-write conflict to detect and no work to throw away. The lock in NBCC is held for a short, fixed duration, whereas under OCC the lock duration grows with transaction size. The result is minimal wasted compute and higher throughput.
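The idea of per-writer log files ordered by completion time can be sketched in a few lines. This is a simplified illustration, not Hudi's implementation: the classes `LogFile` and `FileGroup` are hypothetical, and the merge rule used here (the write that completed last wins) is an assumption standing in for Hudi's configurable record merging.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogFile:
    writer: str
    start_time: int
    completion_time: Optional[int] = None   # None while the write is in flight
    records: dict = field(default_factory=dict)


class FileGroup:
    """Toy file group: every writer appends to its own log file."""

    def __init__(self):
        self.log_files = []

    def open_log(self, writer, start_time):
        log = LogFile(writer, start_time)
        self.log_files.append(log)
        return log

    def snapshot(self):
        """Merge completed logs in completion-time order (assumed rule: last completed wins)."""
        merged = {}
        completed = [log for log in self.log_files if log.completion_time is not None]
        for log in sorted(completed, key=lambda log: log.completion_time):
            merged.update(log.records)
        return merged


if __name__ == "__main__":
    group = FileGroup()
    batch = group.open_log("batch", start_time=1)
    stream = group.open_log("stream", start_time=3)

    # Both writers update the same key concurrently; neither blocks the other.
    batch.records["k1"] = "from-batch"
    stream.records["k1"] = "from-stream"

    stream.completion_time = 5
    print(group.snapshot())   # only the streaming write is visible so far

    batch.completion_time = 10
    print(group.snapshot())   # the batch write completed later, so it is applied last
```

Neither writer ever aborts in this sketch. Ordering is decided at read or merge time from the completion timestamps, which is why those timestamps need to be trustworthy across writers.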
Two mechanisms underpin NBCC. First, records are organized into file groups, and updates to a given record are always routed to the same file group. Second, Hudi uses TrueTime-like timestamp generation to handle clock skew between distributed writers, which keeps all timestamps monotonically increasing and preserves the correct order of operations. Hudi also provides an LSM-tree structure for managing commit history and flexible record-merging options, both of which make NBCC more effective in high-concurrency environments.
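The timestamp scheme can be illustrated with a small simulation. This is a sketch of the general TrueTime-style "wait out the uncertainty" technique, not Hudi's actual time generator. It assumes that writers share a lock, that their clocks differ by less than a known bound, and that a writer holds the lock for that bound after reading its clock. `SimulatedTime` and `make_generator` are hypothetical names, and time is simulated so the example runs deterministically.

```python
import threading


class SimulatedTime:
    """Stand-in for real time so the example is deterministic."""

    def __init__(self, start_ms=0):
        self.now_ms = start_ms

    def sleep(self, ms):
        self.now_ms += ms


def make_generator(shared_lock, true_time, clock_offset_ms, max_skew_ms):
    """Build a timestamp function for one writer whose clock is off by `clock_offset_ms`."""

    def next_timestamp():
        with shared_lock:
            # Read this writer's local, possibly skewed, clock.
            ts = true_time.now_ms + clock_offset_ms
            # Hold the lock for the skew bound, so the next writer's clock
            # has moved past `ts` by the time it can read it.
            true_time.sleep(max_skew_ms)
            return ts

    return next_timestamp


if __name__ == "__main__":
    lock = threading.Lock()

    clock = SimulatedTime(start_ms=1000)
    writer_a = make_generator(lock, clock, clock_offset_ms=0, max_skew_ms=200)
    writer_b = make_generator(lock, clock, clock_offset_ms=-150, max_skew_ms=200)
    print([writer_a(), writer_b(), writer_a()])   # increasing despite B's slow clock

    clock = SimulatedTime(start_ms=1000)
    no_wait_a = make_generator(lock, clock, clock_offset_ms=0, max_skew_ms=0)
    no_wait_b = make_generator(lock, clock, clock_offset_ms=-150, max_skew_ms=0)
    print([no_wait_a(), no_wait_b()])             # B's timestamp goes backwards
```

The wait is what keeps the lock duration fixed: it depends only on the skew bound, not on how much data a writer is committing.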