Pain Point

Object Listing Performance

Summary

What it is

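The slowness and cost of listing large numbers of objects in S3's flat namespace using prefix-based scans. Each LIST request returns at most 1,000 objects, so enumerating a large prefix means following continuation tokens through many sequential paginated requests.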
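
As a minimal sketch of what that pagination looks like to a client, the loop below follows continuation tokens until the listing is exhausted. The response fields (`Contents`, `IsTruncated`, `NextContinuationToken`) follow the ListObjectsV2 response shape; `iter_keys` and `make_fake_lister` are hypothetical helpers, and the in-memory lister is only a stand-in so the example runs without AWS credentials.

```python
from typing import Callable, Iterator, Optional


def iter_keys(list_page: Callable[..., dict], prefix: str = "") -> Iterator[str]:
    """Yield every key under `prefix`, one page (up to 1,000 keys) at a time."""
    token: Optional[str] = None
    while True:
        kwargs = {"Prefix": prefix, "MaxKeys": 1000}
        if token is not None:
            kwargs["ContinuationToken"] = token
        page = list_page(**kwargs)
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            return
        # The next page cannot be requested until this token is known,
        # which is why a single prefix is listed sequentially.
        token = page["NextContinuationToken"]


def make_fake_lister(keys):
    """Hypothetical in-memory stand-in for a ListObjectsV2 call (assumption)."""
    ordered = sorted(keys)
    calls = {"count": 0}

    def list_page(Prefix="", MaxKeys=1000, ContinuationToken=None):
        calls["count"] += 1
        matching = [k for k in ordered if k.startswith(Prefix)]
        start = int(ContinuationToken) if ContinuationToken else 0
        chunk = matching[start:start + MaxKeys]
        end = start + len(chunk)
        truncated = end < len(matching)
        page = {"Contents": [{"Key": k} for k in chunk], "IsTruncated": truncated}
        if truncated:
            page["NextContinuationToken"] = str(end)
        return page

    return list_page, calls


fake_keys = [f"data/part-{i:05d}.parquet" for i in range(2500)]
list_page, calls = make_fake_lister(fake_keys)
found = list(iter_keys(list_page, prefix="data/"))
print(len(found), "keys in", calls["count"], "requests")
```

Against a real bucket, `list_page` could be something like `functools.partial(s3.list_objects_v2, Bucket="my-bucket")` with a boto3 client, where the bucket name is a placeholder. boto3 also provides `get_paginator("list_objects_v2")`, which wraps the same loop.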

Where it fits

Object listing is a hidden bottleneck in many S3 operations. Partition discovery, garbage collection, and table snapshots all start with listing, and at millions of objects LIST calls can dominate job startup time. Originates from: **S3 API**.
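
A rough back-of-the-envelope estimate shows why listing dominates at scale. The sketch below assumes 1,000 objects per request and a purely illustrative 100 ms per request; real latency varies widely, and `estimate_sequential_listing` is a hypothetical helper, not part of any SDK.

```python
import math


def estimate_sequential_listing(num_objects, page_size=1000, seconds_per_request=0.1):
    """Return (request_count, total_seconds) for listing one prefix sequentially.

    seconds_per_request is an assumed figure for illustration only.
    """
    requests = math.ceil(num_objects / page_size)
    return requests, requests * seconds_per_request


for n in (100_000, 10_000_000, 1_000_000_000):
    requests, seconds = estimate_sequential_listing(n)
    print(f"{n:>13,} objects -> {requests:>9,} LIST requests, ~{seconds / 60:.1f} min")
```

Under these assumptions the request count grows linearly with the object count, so the only ways to shorten startup are to list fewer objects, list several prefixes in parallel, or avoid listing altogether.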

Misconceptions / Traps

S3 prefixes are not directories. A prefix scan does not benefit from directory-like structure — it is a linear scan filtered server-side.
S3 Inventory (a scheduled, offline listing report) is often better than real-time LIST for large-scale enumeration, but Inventory reports are generated on a daily or weekly schedule and can lag the live bucket by 24-48 hours.
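
To make the first trap concrete, the sketch below mimics how `Prefix` and `Delimiter` behave as plain string operations over a flat, sorted key space. `list_with_prefix` is a hypothetical in-memory model written for illustration, not an S3 client call, and the keys are made up.

```python
def list_with_prefix(keys, prefix="", delimiter=None):
    """Mimic Prefix/Delimiter grouping over a flat key space (illustrative model)."""
    contents, common_prefixes = [], []
    for key in sorted(keys):
        # Prefix matching is a string comparison, not a directory lookup.
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Keys sharing the text up to the next delimiter roll up into one entry.
            rolled = prefix + rest.split(delimiter, 1)[0] + delimiter
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)
        else:
            contents.append(key)
    return contents, common_prefixes


keys = [
    "logs/2024/01/app.log",
    "logs/2024/02/app.log",
    "logs/2024-archive/old.log",
    "logs/2025/01/app.log",
]

# Without a trailing slash, the prefix also matches the sibling "2024-archive".
contents, _ = list_with_prefix(keys, prefix="logs/2024")
print(contents)

# A delimiter only groups the results; it does not create real directories.
_, groups = list_with_prefix(keys, prefix="logs/", delimiter="/")
print(groups)
```

The practical consequence is that a prefix such as `logs/2024` should usually end with the delimiter (`logs/2024/`) when the intent is to list a single "folder".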

Key Connections

AWS S3 constrained_by Object Listing Performance — inherent API limitation
DuckDB, Trino constrained_by Object Listing Performance — query engines pay the listing cost
Table formats reduce listing dependency by maintaining manifests of data files, but the metadata files themselves may still need to be listed (for example, a transaction log directory)
scoped_to S3, Object Storage
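
The manifest idea can be sketched with a toy object store that counts requests. Everything here is hypothetical: `CountingStore`, the key layout, and the single JSON manifest are simplifications for illustration, and real table formats use their own metadata layouts.

```python
import json


class CountingStore:
    """Hypothetical in-memory object store that counts LIST and GET requests."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.list_requests = 0
        self.get_requests = 0

    def list_keys(self, prefix, page_size=1000):
        matching = sorted(k for k in self.objects if k.startswith(prefix))
        # One request per page of results, and at least one even if nothing matches.
        self.list_requests += max(1, -(-len(matching) // page_size))
        return matching

    def get(self, key):
        self.get_requests += 1
        return self.objects[key]


data_keys = [f"table/data/part-{i:05d}.parquet" for i in range(5000)]
objects = {k: b"" for k in data_keys}
# Assumed layout: one manifest object that names every data file.
objects["table/metadata/manifest.json"] = json.dumps({"files": data_keys}).encode()
store = CountingStore(objects)

by_listing = store.list_keys("table/data/")
by_manifest = json.loads(store.get("table/metadata/manifest.json"))["files"]

print("LIST requests:", store.list_requests, "GET requests:", store.get_requests)
print("same file set:", by_listing == by_manifest)
```

The reader still has to learn the manifest's key from somewhere. If that pointer comes from listing a metadata prefix rather than from a catalog, part of the listing cost returns, which is the caveat noted above.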

Definition

What it is

The slowness and cost of listing large numbers of objects in S3's flat namespace using prefix-based scans.

Relationships

Outbound Relationships

scoped_to

S3, Object Storage

Inbound Relationships

constrained_by

AWS S3, DuckDB, Trino

Resources

Docs (High)

docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.ht...

Official AWS API reference for ListObjectsV2, documenting the 1,000-object-per-request limit and pagination mechanisms that constrain listing performance.

Docs (High)

docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-per...

AWS's official performance design patterns covering S3 Inventory as an alternative to listing, prefix parallelization, and caching strategies for large-scale object enumeration.

Blog (Medium)

xuanwo.io/2025/02-why-s3-list-objects-taking-120s-to-respond...

Deep engineering investigation into why S3 ListObjects can take 120+ seconds, revealing how delete markers and versioning cause severe performance degradation.