Welcome!
RSS Feed
This blog covers the Iceberg Data Lakehouse (using your data lake as your data warehouse with Apache Iceberg) and the Agentic Lakehouse (lakehouses optimized for working with AI agents). You'll find answers to questions such as: What is Apache Iceberg? What is a data lakehouse? Where can I find more resources?
This blog is not affiliated with the Apache Software Foundation or the Apache Iceberg project, whose official page is iceberg.apache.org.
Join the Data Lakehouse Hub Slack Community: Join Now!
Subscribe to our calendar of Data Lakehouse events: Subscribe!
Recent Posts
Data Engineering Best Practices: The Complete Checklist
Published: at 06:00 PM
Best practices documents are easy to write and hard to use. They list principles without context, advice without prioritization, and rules without explaining...
Pipeline Observability: Know When Things Break
Published: at 05:00 PM
An analyst messages you on Slack: "The revenue numbers look wrong. Is the pipeline broken?" You check the orchestrator: all green. You check the target tabl...
Testing Data Pipelines: What to Validate and When
Published: at 04:00 PM
Ask an application developer how they test their code and they'll describe unit tests, integration tests, CI/CD pipelines, and coverage metrics. Ask a data e...
Partition and Organize Data for Performance
Published: at 03:00 PM
A table with 500 million rows takes 45 seconds to query. After partitioning it by date, the same query, filtering on a single day, returns in 2 seconds. Th...