1 link tagged with all of: data-processing + reproducibility + compute-catalog
Links
Xorq is a batch transformation framework that integrates with multiple engines, such as DuckDB, Snowflake, and DataFusion, to support reproducible builds and efficient data processing. It features a YAML-based multi-engine manifest and a compute catalog, and it supports scikit-learn for machine learning pipelines. Xorq focuses on deterministic batch execution, which makes compute artifacts easy to share and serve across teams.