Parquet, a columnar file format, is used by a new streaming data ingestion agent written in Rust. The agent uses Arrow Flight RPC and performs concurrent S3 multipart uploads into Iceberg. It features zero-copy memory management, high throughput, and a deduplication mechanism, while preserving data ordering and durability. The goal is scalable, high-performance ingestion into S3 that does not require compaction.
parquet ✓
s3 ✓
+ rust
streaming ✓
ingestion ✓
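
To illustrate the concurrent S3 multipart upload mentioned in the summary, here is a minimal Rust sketch of how a buffered file could be split into numbered parts. It is not the agent's actual code: `PartPlan` and `plan_parts` are hypothetical names introduced for this example, and no S3 client is involved. The only facts it relies on are the S3 limits that every part except the last must be at least 5 MiB and that one upload may contain at most 10,000 parts.

```rust
use std::ops::Range;

/// S3 requires every part except the last to be at least 5 MiB.
const MIN_PART_SIZE: usize = 5 * 1024 * 1024;
/// S3 allows at most 10,000 parts per multipart upload.
const MAX_PARTS: usize = 10_000;

/// Hypothetical description of one part: its 1-based part number
/// and the byte range of the source buffer it covers.
#[derive(Debug, Clone, PartialEq)]
struct PartPlan {
    part_number: u32,
    range: Range<usize>,
}

/// Split `total_len` bytes into consecutive parts of `part_size` bytes.
/// Only the final part may be shorter than `part_size`.
fn plan_parts(total_len: usize, part_size: usize) -> Result<Vec<PartPlan>, String> {
    if total_len == 0 {
        return Err("nothing to upload".to_string());
    }
    if part_size < MIN_PART_SIZE {
        return Err(format!("part size {part_size} is below the 5 MiB minimum"));
    }
    // Ceiling division: number of parts needed to cover the buffer.
    let count = (total_len + part_size - 1) / part_size;
    if count > MAX_PARTS {
        return Err(format!("{count} parts exceeds the limit of {MAX_PARTS}"));
    }
    let parts = (0..count)
        .map(|i| {
            let start = i * part_size;
            let end = usize::min(start + part_size, total_len);
            PartPlan {
                part_number: (i + 1) as u32,
                range: start..end,
            }
        })
        .collect();
    Ok(parts)
}
```

Because each part carries its own part number, the parts can be uploaded in any order by concurrent tasks, and the object is assembled by part number when the upload is completed. That property is what lets concurrency coexist with ordered output. For example, `plan_parts(12 * 1024 * 1024, MIN_PART_SIZE)` yields three parts, the last covering the final 2 MiB.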