2 links tagged with all of: query-optimization + metadata + indexing
Click any tag below to further narrow down your results
Links
This article explains Hudi's advanced indexing features, focusing on record and secondary indexes for efficient query processing. It also covers expression indexes for transformed queries and the async indexing process that allows background index building without disrupting operations.
External indexes, metadata stores, catalogs, and caches can significantly enhance query performance on Apache Parquet by allowing efficient data retrieval without the need for extensive reparsing. The blog discusses how to implement these components using Apache DataFusion to optimize custom data platforms for specific use cases. It also highlights the advantages of Parquet's hierarchical data organization and its compatibility with various indexing strategies.