Topic

Object Storage

Summary

What it is

The storage paradigm of flat-namespace, HTTP-accessible binary objects with metadata. Data is addressed by bucket and key, not by filesystem path.
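
As a rough illustration of bucket-and-key addressing, the sketch below models a flat namespace as a single in-memory mapping. It is not a real storage client: the `s3://bucket/key` URI form is a convention borrowed from common tooling, and `split_object_uri`, `put_object`, `get_object`, and `example-bucket` are hypothetical names introduced only for this example.

```python
# Hypothetical in-memory model of a flat namespace; not a real storage client.

def split_object_uri(uri):
    """Split 's3://bucket/key' into (bucket, key).

    Everything after the first '/' is kept as one opaque key string;
    the slashes inside it have no structural meaning.
    """
    scheme, sep, rest = uri.partition("://")
    if sep == "" or scheme != "s3":
        raise ValueError(f"not an s3:// URI: {uri!r}")
    bucket, _, key = rest.partition("/")
    if not bucket or not key:
        raise ValueError(f"URI needs both a bucket and a key: {uri!r}")
    return bucket, key


# The whole store is one mapping from (bucket, key) to (bytes, metadata).
store = {}


def put_object(uri, body, metadata=None):
    """Store the object bytes and a copy of its metadata under (bucket, key)."""
    store[split_object_uri(uri)] = (body, dict(metadata or {}))


def get_object(uri):
    """Return (body, metadata); raises KeyError if the exact key is absent."""
    return store[split_object_uri(uri)]


put_object(
    "s3://example-bucket/logs/2024/01/app.log",
    b"hello",
    {"content-type": "text/plain"},
)
body, metadata = get_object("s3://example-bucket/logs/2024/01/app.log")
```

Nothing in this model creates an entry for `logs/` or `logs/2024/`; only the full key exists, which is the sense in which the namespace is flat.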

Where it fits

Object storage is the foundational layer beneath everything in this index. The S3 API is the dominant interface; all technologies, table formats, and architectures in the map operate on top of object storage.

Misconceptions / Traps

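  • Object storage has no native directory hierarchy. Prefixes and delimiters simulate folders, but a LIST call walks keys in sorted order and returns them in pages (up to 1,000 keys per request in S3), so listing a large prefix takes many sequential requests, unlike ls on a filesystem.
  • Durability (eleven nines, 99.999999999%, the figure S3 is designed for) is not the same as availability or performance. Data is very unlikely to be lost, but requests can still be slow, fail, or be throttled.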
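
The folder illusion from the first point can be sketched with a small in-memory model. This is a simplified imitation of prefix-and-delimiter listing, not a real client: `list_entries`, `pages`, and the sample keys are hypothetical, and real services page with continuation tokens rather than list slices.

```python
# Simplified model of prefix/delimiter listing over a flat set of keys.
# Hypothetical helper names; real services return pages via continuation tokens.

def list_entries(keys, prefix="", delimiter=""):
    """Return sorted (kind, name) entries under `prefix`.

    With a delimiter, keys that contain it after the prefix are rolled up
    into a single ("prefix", ...) entry, which is what makes them look
    like folders.
    """
    entries = []
    seen = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut >= 0:
            rolled = prefix + rest[:cut + len(delimiter)]
            if rolled not in seen:
                seen.add(rolled)
                entries.append(("prefix", rolled))
        else:
            entries.append(("object", key))
    return entries


def pages(entries, page_size=1000):
    """Yield entries in fixed-size pages, one page per simulated request."""
    for start in range(0, len(entries), page_size):
        yield entries[start:start + page_size]


keys = [
    "logs/2024/01/a.log",
    "logs/2024/01/b.log",
    "logs/2024/02/c.log",
    "logs/readme.txt",
    "data/x.parquet",
]

folder_view = list_entries(keys, prefix="logs/", delimiter="/")
flat_view = list_entries(keys, prefix="logs/")
request_count = len(list(pages(flat_view, page_size=3)))
```

With the delimiter, the three keys under `logs/2024/` collapse into one rolled-up entry; without it, every matching key comes back individually and the number of requests grows with the number of keys.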

Key Connections

  • scoped_to S3 — S3 is the dominant object storage API
  • Lakehouse scoped_to Object Storage — lakehouses are built on object storage
  • AWS S3, MinIO, Ceph, Apache Ozone scoped_to Object Storage — concrete implementations
  • Separation of Storage and Compute scoped_to Object Storage — the pattern that decouples compute from data

Definition

What it is

The storage paradigm of flat-namespace, HTTP-accessible binary objects with metadata. Data is addressed by bucket and key, not by filesystem path.

Why it exists

Traditional filesystems and block storage are difficult to scale to billions of objects across distributed infrastructure. Object storage trades POSIX semantics, such as in-place updates and real directories, for horizontal scalability, durability, and HTTP accessibility.

Relationships

Outbound Relationships

scoped_to

Resources