Guide 3

Why Iceberg Exists (and What It Replaces)

Problem Framing

Before Iceberg, querying data on S3 typically meant pointing a Hive Metastore at a directory of Parquet files and hoping for the best. There were no table-level transactions, many schema changes required rewriting data, partition layouts were user-visible and fragile, and concurrent reads and writes could produce inconsistent results. Iceberg replaces this stack of workarounds with a formal table specification.

Relevant Nodes

  • Topics: Table Formats, Lakehouse, S3
  • Technologies: Apache Iceberg, Delta Lake, Apache Hudi, Apache Spark, Trino, DuckDB, Apache Flink
  • Standards: Iceberg Table Spec, Apache Parquet, S3 API
  • Architectures: Lakehouse Architecture
  • Pain Points: Schema Evolution, Small Files Problem, Partition Pruning Complexity, Metadata Overhead at Scale, Lack of Atomic Rename

Decision Path

  1. Understand what Iceberg replaces:

    • Hive-style partitioning → Iceberg's hidden partitioning. Queries filter on the source column (for example, a timestamp) rather than a derived partition column; the table format applies the partition transform and prunes files transparently.
    • Schema rigidity → Iceberg's column-ID-based schema evolution. Add, drop, rename, and reorder columns as metadata-only operations. No data rewrite required.
    • No transactions → Iceberg's snapshot isolation. Writers produce new snapshots; readers see consistent table state. Concurrent access is safe.
    • Directory listing for file discovery → Iceberg's manifest files. Query planners read manifests instead of listing S3 prefixes, which eliminates the object-listing bottleneck. (The first sketch after this list shows these behaviors in Spark SQL.)
  2. Decide if Iceberg is right for your workload:

    • Yes if you need multi-engine access (Spark, Trino, Flink, DuckDB all reading the same tables).
    • Yes if schema evolution is frequent and you cannot afford data rewrites.
    • Yes if you want a vendor-neutral table format with the broadest ecosystem support.
    • Consider alternatives if you are deeply invested in Databricks (Delta Lake has tighter integration) or need CDC-first ingestion patterns (Hudi specializes here).
  3. Understand Iceberg's S3 constraints:

    • Iceberg metadata is stored as files on S3. Metadata operations (commit, planning) are subject to S3 latency.
    • Atomic commits on S3 require a catalog (Hive Metastore, Nessie, AWS Glue, or a REST catalog) to coordinate the swap of the table's current metadata pointer, since S3 offers no atomic rename; a minimal catalog configuration sketch appears after this list.
    • Metadata grows with every commit. Snapshot expiration and orphan file cleanup are operational necessities.
  4. Plan for metadata maintenance from day one (the standard Spark maintenance procedures are sketched after this list):

    • Expire old snapshots regularly (expireSnapshots)
    • Remove orphan files that are no longer referenced
    • Compact manifests when manifest lists grow large
    • Monitor metadata file counts and planning times
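
The first item maps to a handful of SQL statements. The sketch below assumes a PySpark session (`spark`) already wired to an Iceberg catalog named `lake` (the next sketch shows one way to configure such a session); the database, table, and column names are illustrative, not prescribed.

    # Assumes `spark` is a SparkSession with the Iceberg Spark runtime on the
    # classpath and a catalog named `lake`; all identifiers are illustrative.

    # Hidden partitioning: the table is partitioned by a transform of `ts`.
    spark.sql("""
        CREATE TABLE lake.db.events (
            id      BIGINT,
            ts      TIMESTAMP,
            payload STRING)
        USING iceberg
        PARTITIONED BY (days(ts))
    """)

    # Queries filter on the source column; Iceberg applies the transform and
    # prunes files without the query referencing any partition column.
    spark.sql("""
        SELECT count(*) FROM lake.db.events
        WHERE ts >= TIMESTAMP '2024-06-01 00:00:00'
    """).show()

    # Schema evolution: metadata-only operations, no data files rewritten.
    spark.sql("ALTER TABLE lake.db.events ADD COLUMN country STRING")
    spark.sql("ALTER TABLE lake.db.events RENAME COLUMN payload TO body")

    # Snapshot isolation: every commit produces a new snapshot, and readers
    # can inspect or pin snapshots via the metadata tables and time travel.
    spark.sql("SELECT snapshot_id, committed_at FROM lake.db.events.snapshots").show()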
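
As item 3 notes, the atomicity of a commit comes from the catalog, not from S3 itself. One way to create the session used above, wired to an Iceberg REST catalog, is sketched below; the catalog name `lake`, the endpoint, and the warehouse path are placeholders, and the properties differ for Hive Metastore, Nessie, or AWS Glue.

    from pyspark.sql import SparkSession

    # Placeholders throughout: the catalog name `lake`, the REST endpoint, and
    # the S3 warehouse path. The catalog service is what makes the swap of the
    # table's current metadata pointer atomic; plain S3 writes cannot.
    spark = (
        SparkSession.builder
        .appName("iceberg-catalog-example")
        .config("spark.sql.extensions",
                "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
        .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.lake.type", "rest")
        .config("spark.sql.catalog.lake.uri", "https://iceberg-catalog.example.com")
        .config("spark.sql.catalog.lake.warehouse", "s3://example-bucket/warehouse")
        .getOrCreate()
    )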
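
For item 4, Iceberg exposes its maintenance routines as Spark stored procedures. The sketch below reuses the `lake` catalog from the previous example; the table name, cutoff timestamp, and retention count are examples rather than recommendations.

    # Expire old snapshots so metadata stops growing without bound.
    spark.sql("""
        CALL lake.system.expire_snapshots(
            table       => 'db.events',
            older_than  => TIMESTAMP '2024-01-01 00:00:00',
            retain_last => 10)
    """)

    # Delete files on S3 that are no longer referenced by any snapshot.
    spark.sql("CALL lake.system.remove_orphan_files(table => 'db.events')")

    # Rewrite many small manifests into fewer, larger ones to keep planning fast.
    spark.sql("CALL lake.system.rewrite_manifests('db.events')")

    # Watch metadata growth directly through the metadata tables.
    spark.sql("SELECT count(*) AS manifest_count FROM lake.db.events.manifests").show()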

What Changed Over Time

  • Iceberg started at Netflix (2018) to solve table management problems at scale on S3.
  • Graduated to Apache Top-Level Project (2020), signaling broad industry adoption.
  • Multi-engine support expanded — from Spark-only to Spark, Trino, Flink, DuckDB, ClickHouse, StarRocks.
  • Iceberg REST catalog emerged as a standard catalog interface, reducing lock-in to specific metadata stores.
  • Databricks began supporting Iceberg alongside Delta, effectively acknowledging Iceberg's momentum as the cross-engine standard.

Sources