2 links tagged with all of: data-engineering + data-lakehouse
Links
This article explains how Apache Hudi manages schema evolution in data lakehouses, allowing for seamless changes in data structures without disrupting pipelines. It covers practical implementation using PySpark and highlights the benefits of agility, backward compatibility, and pipeline reliability.
Medallion Architecture organizes data into three distinct layers (Bronze, Silver, and Gold), improving data quality and usability as the data moves from one layer to the next. Originating from Databricks' Lakehouse vision, this design pattern emphasizes integrating structured and unstructured data to support effective decision-making.