Delta Lake Protocol
Summary
What it is
The specification for ACID transaction logs over Parquet files on object storage. Defines how writes, deletes, and schema changes are recorded in a JSON-based commit log stored alongside data files.
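The way a reader derives table state from the commit log can be sketched as a simple replay. This is a minimal illustration, assuming commits are newline-delimited JSON files named by zero-padded version number under a `_delta_log/`-style directory, and that `add` and `remove` actions carry a `path`. The helper names `replay_log` and `write_commit` and the sample file names are hypothetical; checkpoints, deletion vectors, and most action fields are ignored.

```python
import json
import tempfile
from pathlib import Path


def replay_log(log_dir, up_to_version=None):
    """Return the set of data file paths active after replaying commits in order."""
    active = set()
    # Zero-padded file names sort lexicographically in version order.
    for commit in sorted(Path(log_dir).glob("*.json")):
        version = int(commit.stem)
        if up_to_version is not None and version > up_to_version:
            break
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                active.add(action["add"]["path"])
            elif "remove" in action:
                active.discard(action["remove"]["path"])
    return active


def write_commit(log_dir, version, actions):
    """Write one commit file (demo helper only; makes no atomicity guarantee)."""
    path = Path(log_dir) / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(a) for a in actions) + "\n")


# Made-up sample log: commit 0 adds two files, commit 1 replaces one of them.
log_dir = tempfile.mkdtemp()
write_commit(log_dir, 0, [{"add": {"path": "part-0.parquet"}},
                          {"add": {"path": "part-1.parquet"}}])
write_commit(log_dir, 1, [{"remove": {"path": "part-0.parquet"}},
                          {"add": {"path": "part-2.parquet"}}])

latest = replay_log(log_dir)
as_of_v0 = replay_log(log_dir, up_to_version=0)
```

Stopping the replay at an earlier version, as `as_of_v0` does, is the same idea that underlies reading a table as of a past commit.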
Where it fits
The Delta protocol is what makes Delta Lake tables transactional. The commit log serializes changes so concurrent readers and writers see consistent state — even on S3, where atomic rename is unavailable.
Misconceptions / Traps
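- The Delta protocol requires the storage layer to create commit files atomically (put-if-absent or atomic rename, as HDFS and Azure ADLS Gen2 provide) or an external coordination mechanism (such as a DynamoDB-backed log store). On S3, multi-cluster writes are unsafe without such a log store.
- Protocol versions and table features (reader/writer) must be managed carefully. Upgrading a table to a newer protocol version, or enabling a feature that requires one, may make older readers and writers unable to open the table.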
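The protocol-version trap can be illustrated with the kind of check a client is expected to make before reading. This is a sketch, assuming the table's `protocol` action exposes `minReaderVersion` and, for feature-based tables, a `readerFeatures` list. The constants `SUPPORTED_READER_VERSION` and `SUPPORTED_READER_FEATURES` and the function `can_read` are hypothetical and describe an imaginary client, not any real library.

```python
# Capabilities of an imaginary client (assumptions for illustration only).
SUPPORTED_READER_VERSION = 3
SUPPORTED_READER_FEATURES = {"columnMapping"}


def can_read(protocol):
    """Return True if this hypothetical client may read a table with the given protocol action."""
    if protocol["minReaderVersion"] > SUPPORTED_READER_VERSION:
        return False
    # Every reader feature the table lists must be understood by the client.
    required = set(protocol.get("readerFeatures", []))
    return required <= SUPPORTED_READER_FEATURES


# Made-up protocol actions: a legacy table and one upgraded to use a newer feature.
old_table = {"minReaderVersion": 1, "minWriterVersion": 2}
upgraded = {
    "minReaderVersion": 3,
    "minWriterVersion": 7,
    "readerFeatures": ["deletionVectors"],
    "writerFeatures": ["deletionVectors"],
}

old_ok = can_read(old_table)
upgraded_ok = can_read(upgraded)
```

Here the client can still read the legacy table but must refuse the upgraded one, because it does not implement a reader feature the table requires.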
Key Connections
- enables Lakehouse Architecture: the spec that makes Delta Lake ACID possible
- solves Schema Evolution: schema enforcement in the transaction log
- Delta Lake depends_on Delta Lake Protocol
- scoped_to Table Formats, Lakehouse
Definition
What it is
A specification for ACID transaction logs over Parquet files on object storage. Defines how writes, deletes, and schema changes are recorded in a JSON-based transaction log stored alongside data files.
Why it exists
To bring database-like reliability to data lakes. The Delta protocol ensures that concurrent readers and writers see consistent table state, even on object storage that lacks atomic rename or strong consistency guarantees, by serializing changes through a commit log.
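Serializing changes through the log amounts to writers racing to create the next numbered commit file, with exactly one winner per version. The sketch below assumes storage that offers atomic put-if-absent and emulates it with Python's exclusive-create file mode on a local directory. The names `try_commit` and `commit_with_retry` are hypothetical. A real writer would also check whether the winning commit conflicts logically with its own changes before retrying, and would make sure readers never observe a partially written commit.

```python
import json
import tempfile
from pathlib import Path


def try_commit(log_dir, version, actions):
    """Attempt to create commit `version`; return False if it already exists."""
    path = Path(log_dir) / f"{version:020d}.json"
    try:
        # Mode "x" fails if the file exists, standing in for put-if-absent.
        with open(path, "x") as f:
            for action in actions:
                f.write(json.dumps(action) + "\n")
        return True
    except FileExistsError:
        return False


def commit_with_retry(log_dir, actions, max_attempts=5):
    """Find the next free version and commit to it, retrying on lost races."""
    for _ in range(max_attempts):
        existing = [int(p.stem) for p in Path(log_dir).glob("*.json")]
        version = max(existing, default=-1) + 1
        if try_commit(log_dir, version, actions):
            return version
    raise RuntimeError("could not commit after retries")


log_dir = tempfile.mkdtemp()
v0 = commit_with_retry(log_dir, [{"add": {"path": "a.parquet"}}])
v1 = commit_with_retry(log_dir, [{"add": {"path": "b.parquet"}}])
# A second writer that still believes version 1 is free loses the race.
lost_race = try_commit(log_dir, 1, [{"add": {"path": "c.parquet"}}])
```

On storage without such a primitive, an external coordinator has to play the role that exclusive create plays here.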
Primary use cases
ACID-compliant data lake tables, unified streaming and batch processing on S3, and audit trails via transaction log history.
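The audit-trail use case follows from the log itself: commits may carry a `commitInfo` action describing what happened. As a rough sketch, assuming each commit is a list of JSON lines and that `commitInfo` includes `operation` and `timestamp` fields (its contents are writer-defined, so these fields are an assumption), a history view could be built like this. The function `history` and the sample commits are hypothetical.

```python
import json


def history(commits):
    """Yield (version, operation, timestamp) for each commit that carries commitInfo."""
    for version, lines in sorted(commits.items()):
        for line in lines:
            action = json.loads(line)
            if "commitInfo" in action:
                info = action["commitInfo"]
                yield version, info.get("operation"), info.get("timestamp")


# Made-up commits keyed by version number.
commits = {
    0: ['{"commitInfo": {"operation": "WRITE", "timestamp": 1700000000000}}',
        '{"add": {"path": "part-0.parquet"}}'],
    1: ['{"commitInfo": {"operation": "DELETE", "timestamp": 1700000600000}}',
        '{"remove": {"path": "part-0.parquet"}}'],
}

audit = list(history(commits))
```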
Relationships
Outbound Relationships
scoped_to, enables, solves
Inbound Relationships
depends_on
Resources
The formal Delta Lake transaction log protocol specification defining the JSON action format, commit rules, checkpointing, schema enforcement, and time travel semantics.
Official Delta Lake documentation covering usage with Spark, API reference, table utilities, and operational guidance.
Canonical open-source repository for the Delta Lake project, containing the reference Spark-based implementation.
Delta Kernel in Rust provides a standalone, engine-agnostic implementation of the Delta protocol, important for the multi-engine ecosystem.