5 min read
|
Saved February 14, 2026
|
Do you care about this?
This article discusses the recent release of Apache Iceberg V3, highlighting its key features like deletion vectors and row lineage. It evaluates how well different engines support V3, noting that while some engines are ready, others, including popular ones like Athena and Trino, are not yet compatible.
If you do, here's more
Apache Iceberg V3 introduces significant features for the lakehouse ecosystem, including efficient row-level deletes, row lineage, a new semi-structured data type called VARIANT, geospatial data types, and foundational table-level encryption. While these capabilities are promising, the readiness of V3 varies by engine. Some, like Apache Spark and Flink, have robust support for V3, while others, including Amazon Athena and OSS Trino, lag behind, complicating broader adoption.
Deletion vectors in V3 are well-supported, making deletes more manageable, especially for workloads that rely heavily on them. Row lineage enhances incremental processing and debugging across data pipelines. VARIANT, a highly anticipated feature, allows for native semi-structured data handling, but its support is still partial. Geospatial types are also introduced, though they're mostly available through extensions. Encryption features exist but lack widespread implementation.
The current landscape shows that while Spark and Flink users can adopt V3 with confidence, organizations relying on engines like Athena face significant barriers due to lack of support. Many other systems, including Snowflake and various Python libraries, also don't fully support V3 yet. For teams considering V3, it's essential to assess the compatibility of their primary engines and downstream readers before committing to the new format.
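The compatibility check described above can be sketched as a small readiness gate. This is only an illustration: the `V3Support` levels, the `SUPPORT` table, and the `blockers` and `can_upgrade` helpers are hypothetical names, and the statuses simply restate this summary's claims (Snowflake is marked partial as a reading of "don't fully support"). Verify each engine against its current documentation before relying on it.

```python
from enum import Enum


class V3Support(Enum):
    """Hypothetical readiness levels for Iceberg V3 tables."""
    READY = "ready"
    PARTIAL = "partial"
    NONE = "none"


# Assumption: statuses restate the summary above, not any official matrix.
SUPPORT = {
    "spark": V3Support.READY,
    "flink": V3Support.READY,
    "athena": V3Support.NONE,
    "trino": V3Support.NONE,
    "snowflake": V3Support.PARTIAL,
}


def blockers(engines, support=SUPPORT):
    """Return the engines (lowercased, sorted) that are not fully V3-ready.

    Engines missing from the table are treated as blockers, since an
    unknown reader is a risk until someone has checked it.
    """
    return sorted(
        name.lower()
        for name in engines
        if support.get(name.lower(), V3Support.NONE) is not V3Support.READY
    )


def can_upgrade(engines, support=SUPPORT):
    """True only when every engine touching the table is V3-ready."""
    return not blockers(engines, support)


if __name__ == "__main__":
    # Writers plus every downstream reader of the table.
    stack = ["Spark", "Athena", "duckdb"]
    print("Blocking engines:", blockers(stack))
    print("Safe to move to V3:", can_upgrade(stack))
```

Listing every downstream reader, not just the primary writer, is the point of the exercise: a single engine that cannot read V3 is enough to block the upgrade for that table.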