LLMS3.com
LLM Capability

Specific functions performed by models, scoped to operations on S3-stored data.

6 nodes

Embedding Generation

LLM Capability

Converting unstructured content stored in S3 (documents, images, logs) into vector representations for similarity search.

7 connections 2 resources
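
As a rough illustration of turning object contents into vectors, here is a minimal sketch. It makes two assumptions: the `objects` dict stands in for text that has already been read from S3, and `toy_embed` is a hypothetical hashed bag-of-words stand-in for a real embedding model, used only so the example runs without a model or network access.

```python
import hashlib
import math
from typing import Dict, List


def toy_embed(text: str, dim: int = 64) -> List[float]:
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Hash each token to a stable bucket index in [0, dim).
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalize so that cosine similarity reduces to a dot product.
    return [v / norm for v in vec] if norm > 0 else vec


def embed_objects(objects: Dict[str, str], dim: int = 64) -> Dict[str, List[float]]:
    """Map each object key to a vector. `objects` stands in for bodies read from S3."""
    return {key: toy_embed(body, dim) for key, body in objects.items()}


# Illustrative keys and contents only.
objects = {
    "logs/app-1.txt": "timeout connecting to database",
    "docs/guide.txt": "how to configure bucket lifecycle rules",
}
vectors = embed_objects(objects)
```

In a real pipeline the stand-in would be replaced by a call to an actual embedding model, and the resulting vectors would typically be written to a vector index keyed by the object key.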

Semantic Search

LLM Capability

Querying S3-derived vector embeddings to find content by meaning rather than exact keyword match.

6 connections 3 resources
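
Searching by meaning usually comes down to ranking stored vectors by similarity to a query vector. The sketch below assumes a tiny in-memory `index` of hand-written three-dimensional vectors (hypothetical values, keyed by object key) and uses brute-force cosine similarity; a production system would more likely use an approximate nearest-neighbor index.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec: List[float], index: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k object keys whose vectors are most similar to the query vector."""
    scored = [(key, cosine(query_vec, vec)) for key, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings for three S3 objects.
index = {
    "docs/billing.pdf": [0.9, 0.1, 0.0],
    "docs/onboarding.md": [0.1, 0.9, 0.1],
    "logs/errors.txt": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]  # would come from embedding the user's question
results = search(query, index, k=2)
```

With these values, `docs/billing.pdf` ranks first because its vector points in nearly the same direction as the query.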

Metadata Extraction

LLM Capability

Using LLMs to extract structured metadata (entities, categories, summaries, key-value pairs) from unstructured objects stored in S3.

6 connections 3 resources
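
One way to picture metadata extraction is as a prompt plus strict validation of the model's reply. In the sketch below, `complete` is a hypothetical callable that sends a prompt to whatever LLM is in use and returns its text; `fake_complete` returns a canned reply so the example runs offline. The field names in `REQUIRED_FIELDS` are illustrative assumptions, not a fixed standard.

```python
import json
from typing import Callable, Dict

# Illustrative field set; choose fields that fit your own catalog.
REQUIRED_FIELDS = ("title", "category", "summary")


def build_prompt(key: str, text: str) -> str:
    """Ask for a JSON object containing exactly the required fields."""
    fields = ", ".join(REQUIRED_FIELDS)
    return (
        f"Object key: {key}\n"
        f"Content:\n{text}\n\n"
        f"Return a JSON object with the fields: {fields}."
    )


def extract_metadata(key: str, text: str, complete: Callable[[str], str]) -> Dict[str, str]:
    """Call the model, then validate the reply before trusting it."""
    raw = complete(build_prompt(key, text))
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("model reply is not a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"model reply is missing fields: {missing}")
    return {field: str(data[field]) for field in REQUIRED_FIELDS}


def fake_complete(prompt: str) -> str:
    """Canned reply standing in for a real LLM call."""
    return json.dumps(
        {"title": "Q3 report", "category": "finance", "summary": "Quarterly results."}
    )


metadata = extract_metadata("reports/q3.txt", "Quarterly results text...", fake_complete)
```

Validating the reply matters because a model may return prose, partial JSON, or unexpected fields; rejecting those early keeps bad records out of the catalog.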

Schema Inference

LLM Capability

Using LLMs to infer or suggest schemas from semi-structured data (JSON, CSV, nested formats) stored in S3.

7 connections 3 resources
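
To make the idea concrete without calling a model, the sketch below uses a deterministic baseline rather than an LLM: it scans sampled JSON Lines records and collects the types observed per field. Such a summary could be passed to an LLM as context or used to sanity-check a schema the LLM suggests. The sample records and the type names are assumptions chosen for illustration.

```python
import json
from typing import Any, Dict, List, Set


def type_name(value: Any) -> str:
    """Map a parsed JSON value to a simple type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "unknown"


def infer_schema(lines: List[str]) -> Dict[str, List[str]]:
    """Collect the set of observed types for each top-level field."""
    seen: Dict[str, Set[str]] = {}
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            seen.setdefault(field, set()).add(type_name(value))
    return {field: sorted(types) for field, types in sorted(seen.items())}


# Sample lines standing in for a JSON Lines object read from S3.
sample = [
    '{"id": 1, "name": "a", "score": 0.5}',
    '{"id": 2, "name": null, "tags": ["x"]}',
]
schema = infer_schema(sample)
# schema == {"id": ["integer"], "name": ["null", "string"],
#            "score": ["number"], "tags": ["array"]}
```

Fields that appear with more than one type, or only in some records, are the cases where a suggested schema most needs human review.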

Data Classification

LLM Capability

Using LLMs to categorize, tag, or label S3-stored objects by analyzing their content, for example by topic, sensitivity level, or compliance category.

6 connections 2 resources
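
A common pattern for classification is to constrain the model to a fixed label set and fall back to a neutral label when the reply is not one of them. In this sketch, `complete` is a hypothetical LLM callable, `fake_complete` is a keyword-based stand-in so the code runs offline, and the label names are assumptions. `to_tag_set` builds a dict in the `TagSet` shape used by S3 object tagging; the actual upload call is left out.

```python
from typing import Callable, Dict, List, Tuple

# Illustrative label set; real deployments define their own categories.
SENSITIVITY_LABELS: Tuple[str, ...] = ("public", "internal", "confidential")


def classify(
    text: str,
    complete: Callable[[str], str],
    labels: Tuple[str, ...] = SENSITIVITY_LABELS,
    fallback: str = "unclassified",
) -> str:
    """Ask the model for exactly one label; reject anything outside the allowed set."""
    prompt = (
        f"Classify the document into one of: {', '.join(labels)}.\n"
        f"Reply with the label only.\n\nDocument:\n{text}"
    )
    answer = complete(prompt).strip().lower()
    return answer if answer in labels else fallback


def to_tag_set(label: str) -> Dict[str, List[Dict[str, str]]]:
    """Build a tagging payload in the TagSet shape used by S3 object tagging."""
    return {"TagSet": [{"Key": "sensitivity", "Value": label}]}


def fake_complete(prompt: str) -> str:
    """Keyword-based stand-in for a real LLM call."""
    return " Confidential\n" if "ssn" in prompt.lower() else "banana"


label = classify("Employee SSN and salary details", fake_complete)
tagging = to_tag_set(label)
```

The fallback label keeps unexpected model output from silently becoming a tag value, which matters when tags drive access or retention policy.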

Natural Language Querying

LLM Capability

Using LLMs to translate natural language questions into executable queries (SQL, API calls) over S3-backed datasets.

7 connections 3 resources
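
The translation step can be sketched as: give the model the table schema, take back SQL, check that it is read-only, then execute it. Below, an in-memory SQLite table stands in for an S3-backed dataset and its query engine, `complete` is a hypothetical LLM callable, and `fake_complete` returns a canned query. The `is_read_only` guard is a deliberately naive assumption, not a full SQL parser.

```python
import sqlite3
from typing import Callable, List, Tuple


def is_read_only(sql: str) -> bool:
    """Naive guard: accept a single statement that starts with SELECT."""
    stripped = sql.strip().rstrip(";").strip()
    return stripped.lower().startswith("select") and ";" not in stripped


def answer(
    question: str,
    schema: str,
    complete: Callable[[str], str],
    conn: sqlite3.Connection,
) -> List[Tuple]:
    """Translate a question to SQL with the model, validate it, and run it."""
    prompt = f"Schema:\n{schema}\n\nWrite one SQL SELECT statement answering: {question}"
    sql = complete(prompt)
    if not is_read_only(sql):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()


# In-memory table standing in for an S3-backed dataset (illustrative rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (region TEXT, bytes INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("us-east-1", 100), ("eu-west-1", 250), ("us-east-1", 50)],
)


def fake_complete(prompt: str) -> str:
    """Canned reply standing in for a real LLM call."""
    return "SELECT region, SUM(bytes) FROM events GROUP BY region ORDER BY region;"


rows = answer(
    "How many bytes per region?",
    "events(region TEXT, bytes INTEGER)",
    fake_complete,
    conn,
)
# rows == [("eu-west-1", 250), ("us-east-1", 150)]
```

Checking generated SQL before running it is the important design choice here; in practice the query would also run under credentials that only permit reads.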
LLMS3.com — The S3 & Object Storage Ecosystem Index