Links
This article explains Spark Declarative Pipelines (SDP), a framework for building data pipelines in Apache Spark. It covers the key concepts of flows, datasets, and pipelines, and shows how to implement them in Python and SQL. The guide also includes installation instructions and explains how to use the command-line interface.
Kedro is an open-source Python framework for building production-ready data science and data engineering pipelines. It applies software engineering best practices to make pipelines reproducible, maintainable, and modular, and it provides features such as a project template, a data catalog, and flexible deployment options. The framework supports collaboration among team members with varying levels of software engineering experience and is maintained by a growing community of contributors.