// AI NATIVE STACK

CRASH COURSE · AI-NATIVE · beginner · 9 min read · RELEASE.2025

MinIO — the S3-compatible object store you run yourself.

TL;DR — MinIO is a high-performance, S3-compatible object store you deploy on your own hardware or cloud VMs. Every tool that speaks S3 — PyTorch checkpoints, model registries, data lakes, JuiceFS, Alluxio — works with MinIO out of the box. Single binary, distributed mode for production, Kubernetes operator for cloud-native deployment.

What it is

MinIO is an open-source, S3-compatible object storage server written in Go. It is one of the most widely deployed self-hosted object stores, used as the storage backend for AI/ML pipelines, data lakes, and model checkpoints wherever you can't or don't want to use a cloud provider's object storage. In the AI Native landscape it sits in AI Native Infra › Storage.

Why it exists

The AI stack assumes S3 everywhere: checkpointing, dataset storage, model artifacts, experiment tracking. But relying on AWS S3 itself ties you to one provider and its egress pricing. MinIO gives you a drop-in S3 endpoint on your own infra (on-prem GPU clusters, bare metal, edge, or a second cloud) with the same API, the same tooling, no egress fees on hardware you own, and full control.

[Figure: training jobs, model registry, and JuiceFS / Alluxio all connect to MinIO (S3-compatible), which stores data erasure-coded on NVMe / HDD.]

Fig 1 — Everything that speaks S3 talks to MinIO; MinIO erasure-codes across local disks.

How it works

MinIO runs as a single binary. In distributed mode it spans multiple nodes and drives and protects data with erasure coding: each object is split into data and parity shards, and the configurable parity level sets how many drives or nodes can fail without data loss. It implements the core S3 API (PutObject, GetObject, multipart uploads, versioning, lifecycle rules, bucket notifications), so most S3 SDKs and CLIs work once they are pointed at the MinIO endpoint.
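The capacity cost of parity comes down to simple arithmetic. The sketch below is not MinIO code: it assumes a single erasure set of equal-sized drives, a Reed-Solomon-style layout where any `parity` shards can be lost and the object is still readable, and a parity setting of at most half the drives. The function names are made up for illustration.

```python
def erasure_profile(drives: int, parity: int) -> dict:
    """Summarise one erasure set (illustrative arithmetic, not MinIO's internals)."""
    if drives < 2:
        raise ValueError("an erasure set needs at least 2 drives")
    if not 0 <= parity <= drives // 2:
        raise ValueError("parity must be between 0 and half the drive count")
    data = drives - parity
    return {
        "data_shards": data,
        "parity_shards": parity,
        # Fraction of raw capacity left for user data.
        "usable_fraction": data / drives,
        # Drives that can fail while objects stay readable.
        "tolerated_drive_losses": parity,
    }


def usable_capacity_tib(raw_tib: float, drives: int, parity: int) -> float:
    """Usable capacity for a given raw capacity, under the same assumptions."""
    return raw_tib * erasure_profile(drives, parity)["usable_fraction"]


if __name__ == "__main__":
    # Hypothetical layout: 16 drives with 4 parity shards per object.
    print(erasure_profile(16, 4))
    print(usable_capacity_tib(100.0, 16, 4))
```

Raising parity buys tolerance for more simultaneous drive failures at the price of usable capacity, which is the trade-off to weigh when sizing a cluster.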

Key features

  • S3 API compatibility: drop-in replacement for most workloads; works with the aws s3 CLI, Boto3, and ML frameworks that read from S3.
  • High performance: designed for NVMe; MinIO's published benchmarks report 300+ GiB/s aggregate read throughput on a multi-node cluster of commodity NVMe servers.
  • Erasure coding: data protection without RAID; configurable parity trades usable capacity for durability.
  • Kubernetes Operator: declarative tenants, automatic TLS, scaling, upgrades.
  • Encryption & IAM: server-side encryption, bucket policies, OpenID Connect integration.
  • Replication: site-to-site replication for DR and multi-site deployments.
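Bucket policies use the same JSON policy format as S3. As a rough illustration of the "Encryption & IAM" item, the sketch below builds a policy that allows anonymous read-only downloads from one bucket. The bucket name `datasets` and the helper name are hypothetical, and a policy like this only makes sense for data you intend to be public.

```python
import json


def read_only_policy(bucket: str) -> str:
    """Return an S3-style policy document allowing anonymous GetObject on a bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": ["*"]},  # anyone, including unauthenticated clients
                "Action": ["s3:GetObject"],   # download only; no list, write, or delete
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # "datasets" is a placeholder bucket name.
    print(read_only_policy("datasets"))
```

The resulting string can be applied with any S3 client, for example Boto3's `put_bucket_policy(Bucket=..., Policy=...)`.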

Quick start

Run a single-node instance for dev, or use the Kubernetes Operator for production:

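# single-node dev
minio server /data --console-address ":9001"
# access at http://localhost:9000  (API)  http://localhost:9001  (console)
# default dev credentials are minioadmin / minioadmin unless
# MINIO_ROOT_USER and MINIO_ROOT_PASSWORD are set

# Kubernetes: install the operator first; creating a Tenant is a separate step
# (append ?ref=<tag> to the URL to pin a release)
kubectl apply -k github.com/minio/operator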

Point your training script's S3 endpoint to MinIO's address and use standard S3 credentials — done.
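In Python that usually means passing `endpoint_url` to Boto3. The following is a minimal sketch, assuming the single-node dev instance from above on `localhost:9000` with its default `minioadmin` credentials. The helper names, bucket, and object key are placeholders, and `boto3` must be installed separately.

```python
def minio_client_kwargs(
    endpoint: str = "http://localhost:9000",
    access_key: str = "minioadmin",  # dev default; use real credentials elsewhere
    secret_key: str = "minioadmin",
) -> dict:
    """Keyword arguments for boto3.client("s3", ...) aimed at a MinIO endpoint."""
    return {
        "endpoint_url": endpoint,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": "us-east-1",  # assumed placeholder region
    }


def upload_checkpoint(path: str, bucket: str, key: str) -> None:
    """Upload a local file (e.g. a model checkpoint) to MinIO via the S3 API."""
    import boto3  # third-party: pip install boto3
    from botocore.config import Config

    s3 = boto3.client(
        "s3",
        # Path-style URLs (http://host:9000/bucket/key) avoid per-bucket DNS names.
        config=Config(s3={"addressing_style": "path"}),
        **minio_client_kwargs(),
    )
    s3.upload_file(path, bucket, key)


# Example call (hypothetical file, bucket, and key; the bucket must already exist):
# upload_checkpoint("model.pt", "checkpoints", "run-001/model.pt")
```

Only the endpoint and credentials differ from code written against AWS S3; the upload call itself is unchanged.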

When to use, when to skip

Use it when you need S3-compatible storage on your own infrastructure — on-prem GPU clusters, bare metal, air-gapped environments, or anywhere you want to avoid cloud egress costs. It's the default choice for self-hosted object storage in the AI stack.

Skip it if you're fully on a cloud provider and happy with their S3/GCS/Blob — running your own object store adds ops overhead. Also unnecessary if your data volumes are small enough to fit on local disk.

Heads up: MinIO is AGPLv3 licensed. The enterprise console and some advanced features require a commercial license. Check licensing if you're embedding it in a commercial product.

vs / alongside

Tool           Role                               Note
MinIO          S3-compatible object store         The storage backend
AWS S3 / GCS   Cloud object storage               No ops, vendor lock-in
JuiceFS        POSIX FS on top of object storage  Uses MinIO as backend
CubeFS         Distributed FS with S3 + POSIX     Different architecture

Verified against MinIO docs (min.io), May 2026.

© cvam — written in plaintext, served warm