Links
This article explains the asynchronous instant time generation feature introduced in Apache Hudi 1.1 for Flink writers, which lets writers request new instants without blocking. It improves throughput because writers keep processing instead of waiting for previous transactions to complete. The article also outlines how the feature interacts with Hudi’s file slicing and timeline management.
This article explains Hudi's advanced indexing features, focusing on record and secondary indexes for efficient query processing. It also covers expression indexes for transformed queries and the async indexing process that allows background index building without disrupting operations.
This article discusses how Apache Hudi's Non-Blocking Concurrency Control (NBCC) improves write throughput in data lakehouses by allowing concurrent writers to append data without conflicts. It contrasts NBCC with Optimistic Concurrency Control (OCC), whose conflict-triggered retries are inefficient in high-frequency streaming scenarios. The piece also explains how to configure NBCC in your data pipelines.
Apache Hudi 1.1 introduces a pluggable table format framework that supports multiple storage formats, enhancing flexibility in data management. The release also includes indexing improvements, faster clustering, and a new storage-based lock provider for better concurrency. These updates aim to make Hudi tables more efficient and easier to operate.