Medallion Architecture
Summary
What it is
A layered data quality pattern — Bronze (raw), Silver (cleansed), Gold (business-ready) — with each layer stored on object storage.
Where it fits
Medallion is one of the most widely adopted data quality patterns within lakehouses. It organizes S3 data into progressive quality tiers, giving each tier a clear contract and making it safe for different consumers to read at different quality levels.
Misconceptions / Traps
- Three layers is a convention, not a rule. Some organizations use two layers; others add more. The pattern is about progressive refinement, not a fixed number of tiers.
- Medallion does not solve the small files problem — it can worsen it. Each layer transformation may produce many small output files, especially with streaming Silver→Gold pipelines.
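To make the small files trap concrete, the rough estimate below counts the files one streaming hop might write per day. It is a back-of-the-envelope sketch, not a model of any particular engine: it assumes every micro-batch writes exactly one file per output partition, and the function name, trigger interval, partition count, and daily volume are all hypothetical.

```python
def estimate_small_files(trigger_seconds, output_partitions, daily_bytes):
    """Estimate daily file count and average file size for one streaming hop.

    Assumption (not from any engine's documentation): every micro-batch
    writes exactly one file per output partition.
    """
    batches_per_day = 86_400 // trigger_seconds
    files_per_day = batches_per_day * output_partitions
    avg_file_bytes = daily_bytes / files_per_day
    return files_per_day, avg_file_bytes


# Hypothetical Silver-to-Gold stream: 60-second trigger, 8 partitions, 10 GiB/day.
files, avg_bytes = estimate_small_files(
    trigger_seconds=60,
    output_partitions=8,
    daily_bytes=10 * 1024**3,
)
print(files, round(avg_bytes / 1024**2, 2))  # 11520 0.89
```

Under these assumptions a single hop writes 11,520 files a day averaging under 1 MiB each, and every additional streaming layer repeats the effect.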
Key Connections
- is_a Lakehouse Architecture: a specialization of the lakehouse pattern
- constrained_by Legacy Ingestion Bottlenecks, Small Files Problem
- AWS S3 used_by Medallion Architecture: each layer resides on S3
- Apache Spark, Apache Flink used_by Medallion Architecture: compute engines for tier transformations
- scoped_to Lakehouse, Data Lake
Definition
What it is
A layered data quality pattern that organizes data into three tiers — Bronze (raw), Silver (cleansed/conformed), Gold (aggregated/business-ready) — with each layer stored on object storage.
Why it exists
Raw data arriving in S3 is messy, inconsistent, and not query-ready. The Medallion pattern provides a structured progression from raw ingestion to business-quality data, with clear contracts at each tier.
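The progression from raw to business-quality data can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation. The record fields, the `to_silver` and `to_gold` helpers, and the cleansing rules are hypothetical, and in-memory lists stand in for tables on object storage.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical Bronze records: everything is a string, casing is
# inconsistent, and the batch contains a duplicate and a bad row.
bronze = [
    {"order_id": "1", "country": "us", "amount": "10.50", "ts": "2024-01-01T10:00:00"},
    {"order_id": "1", "country": "us", "amount": "10.50", "ts": "2024-01-01T10:00:00"},
    {"order_id": "2", "country": "DE ", "amount": "7", "ts": "2024-01-01T11:30:00"},
    {"order_id": "3", "country": "de", "amount": "n/a", "ts": "2024-01-02T09:15:00"},
]


def to_silver(rows):
    """Cleanse and conform: cast types, normalize values, drop bad rows and duplicates."""
    seen, silver = set(), []
    for row in rows:
        try:
            record = {
                "order_id": int(row["order_id"]),
                "country": row["country"].strip().upper(),
                "amount": float(row["amount"]),
                "ts": datetime.fromisoformat(row["ts"]),
            }
        except (KeyError, ValueError):
            continue  # a real pipeline would likely quarantine these rows instead
        if record["order_id"] in seen:
            continue  # keep only the first occurrence of each order
        seen.add(record["order_id"])
        silver.append(record)
    return silver


def to_gold(rows):
    """Aggregate to a business-ready shape: revenue per country per day."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["country"], row["ts"].date().isoformat())] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Each function acts as the contract for its tier: Silver guarantees typed, de-duplicated rows, and Gold guarantees one revenue figure per country and day. Bronze stays untouched, so a change to the cleansing rules can be replayed from the raw data.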
Primary use cases
Data lake quality management, incremental data refinement, separation of raw ingestion from analytics-ready data.
Relationships
Outbound Relationships
constrained_by
Inbound Relationships
Resources
Databricks' official glossary definition of the bronze/silver/gold data quality layering pattern they popularized.
Microsoft's Azure Databricks documentation explaining the medallion lakehouse architecture and its multi-hop data refinement approach.
Microsoft Fabric's official guidance on implementing medallion architecture in OneLake, showing cross-platform adoption of the pattern.