// AI NATIVE STACK

CRASH COURSE · AI-NATIVE · intermediate · 10 min read · CNCF

JuiceFS — a POSIX filesystem built on object storage for AI workloads.

storage ai-native juicefs kubernetes distributed-storage

TL;DR — JuiceFS is a POSIX-compatible distributed filesystem that stores data in any object store (S3, GCS, MinIO) and metadata in a transactional database (Redis, PostgreSQL, TiKV). Aggressive multi-level caching gives AI training jobs close to local-NVMe random-read performance on cache hits, while keeping the economics and capacity of cloud object storage. CNCF Sandbox project with a Kubernetes CSI driver.

What it is

JuiceFS is an open-source, cloud-native distributed filesystem designed for large-scale data workloads. It separates metadata (stored in engines like Redis, PostgreSQL, or TiKV) from data (stored in object storage like S3 or MinIO). It's a CNCF Sandbox project. In the AI Native landscape it lives in AI Native Infra › Storage.

Why it exists

AI training often reads millions or even billions of small files (images, tokenized shards) with random access patterns. Object storage is cheap but has high per-request latency for this pattern. Local NVMe is fast but not shared across nodes. JuiceFS bridges the gap: data lives in object storage (cheap, effectively unlimited capacity), while a local cache on each node's SSD serves hot reads at close to local-disk speed. The POSIX interface means frameworks like PyTorch read it like a local directory.

[Diagram: training pods → POSIX mount → JuiceFS client with local SSD cache → metadata (Redis) + data (S3/MinIO)]

Fig 1 — Pods mount JuiceFS like local storage; the client caches hot data on SSD and stores cold data in object storage.

How it works

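The JuiceFS client runs as a FUSE mount; on Kubernetes, the CSI driver runs the same client for you. On reads, it checks the local SSD cache first, and cache hits skip the round trip to object storage. On a cache miss, it fetches the data from object storage and populates the cache. Metadata operations (ls, stat, open) go to the metadata engine, a fast transactional store, so they never touch object storage. Writes are buffered in the client and uploaded to object storage when the file is flushed or closed; with the --writeback mount option they are staged on the local cache disk and uploaded asynchronously instead.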
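
The read path above is a classic read-through cache. Here is a toy shell sketch of that idea, using two local directories as stand-ins for the bucket and the SSD cache. It is an illustration only, not how the JuiceFS client is implemented; OBJECT_STORE, CACHE_DIR and read_block are made-up names.

```shell
#!/bin/sh
# Toy read-through cache. NOT JuiceFS internals.
# OBJECT_STORE and CACHE_DIR are hypothetical local directories standing in
# for the remote bucket and the node-local SSD cache.
OBJECT_STORE=${OBJECT_STORE:-/tmp/demo-object-store}
CACHE_DIR=${CACHE_DIR:-/tmp/demo-cache}

# read_block KEY
# Prints the block's content on stdout and reports hit/miss on stderr.
read_block() {
  key=$1
  if [ -f "$CACHE_DIR/$key" ]; then
    echo "cache hit: $key" >&2
  else
    echo "cache miss: $key" >&2
    mkdir -p "$CACHE_DIR"
    # Fetch from the slow tier, then keep a copy for the next read.
    cp "$OBJECT_STORE/$key" "$CACHE_DIR/$key" || return 1
  fi
  cat "$CACHE_DIR/$key"
}
```

The first read of a key pays the slow-tier cost and fills the cache; every later read of the same key is served locally. That is why the first epoch over a cold cache is usually the slowest one.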

Key features

  • POSIX compatible — works with PyTorch DataLoader, HuggingFace datasets, any tool expecting a filesystem.
  • Multi-level caching — kernel page cache → local SSD → object storage. Hot data stays on NVMe.
  • Elastic capacity — data lives in object storage, so capacity scales to petabytes without provisioning.
  • Kubernetes CSI driver — mount JuiceFS as a PV in pods, with cache lifecycle managed per node.
  • Multiple metadata engines — Redis (fastest), PostgreSQL, MySQL, TiKV (scalable), SQLite (single-node).
  • Close-to-open consistency — once a file is written and closed, any client that opens it afterwards sees the new data; within a single mount point, reads see completed writes immediately.
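
Switching metadata engines mostly comes down to the URL you pass to juicefs format and juicefs mount. The snippet below sketches the usual URL shapes. Treat every host, user, password and database name as a placeholder, and double-check the exact syntax against the docs for your JuiceFS version.

```shell
#!/bin/sh
# Metadata URLs by engine. All hosts, users, passwords and names are placeholders.
META_REDIS='redis://localhost:6379/1'
META_POSTGRES='postgres://user:password@db-host:5432/juicefs'
META_MYSQL='mysql://user:password@(db-host:3306)/juicefs'   # note the parentheses
META_TIKV='tikv://pd-host:2379/mydata'                      # PD address, then a key prefix
META_SQLITE='sqlite3://mydata.db'                           # local file, single node only

# Pick one, then use the same URL for format and mount, e.g.:
#   juicefs format --storage s3 --bucket https://my-bucket.s3.amazonaws.com "$META_POSTGRES" mydata
META_URL=$META_REDIS
echo "using metadata engine: ${META_URL%%://*}"
```

The data side (bucket, storage type) is recorded in the metadata engine at format time, so later commands only need the metadata URL.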

Quick start

Format a filesystem, then mount it:

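# format: metadata in Redis (database 1), data in S3
# S3 credentials come from --access-key/--secret-key or from the environment
juicefs format \
  --storage s3 \
  --bucket https://my-bucket.s3.amazonaws.com \
  redis://localhost:6379/1 \
  mydata

# mount in the background (-d), caching blocks under /ssd/jfs-cache
juicefs mount -d --cache-dir /ssd/jfs-cache redis://localhost:6379/1 /mnt/jfs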
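
A common follow-up is to cap the cache size and pre-fill the cache before training starts. The wrapper below is a sketch under assumptions: mount_and_warm is a made-up helper name, the 512000 MiB budget and the dataset path are placeholders, and you should confirm the flags against juicefs mount --help for your version.

```shell
#!/bin/sh
# Sketch: mount with an explicit cache budget, then warm the cache.
# META_URL and MOUNT_POINT default to the values used in the quick start.
META_URL=${META_URL:-redis://localhost:6379/1}
MOUNT_POINT=${MOUNT_POINT:-/mnt/jfs}

# mount_and_warm DATASET_PATH   (hypothetical helper)
mount_and_warm() {
  dataset=$1   # a path inside the mount, e.g. /mnt/jfs/train (placeholder)

  # -d runs the client in the background; a bare --cache-size number is MiB.
  juicefs mount -d \
    --cache-dir /ssd/jfs-cache \
    --cache-size 512000 \
    "$META_URL" "$MOUNT_POINT" || return 1

  # Pull the dataset's blocks into the local cache before the first epoch.
  juicefs warmup "$dataset"
}
```

Call it as mount_and_warm /mnt/jfs/train. Warming up front moves the cold-cache penalty out of the first training epoch, at the cost of reading the whole dataset once.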

On Kubernetes, install the CSI driver via Helm, then create a Secret with the volume credentials and a PV/PVC that references it. Pods mount the PVC like any other volume.
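
For the Kubernetes path, a static-provisioning setup could look roughly like the sketch below. It writes a manifest file for you to review rather than applying anything. The resource names, the namespace, the in-cluster Redis address and the 10Pi capacity are placeholders; the Secret keys and the csi.juicefs.com driver name follow the CSI driver docs as of this writing, so verify them against the version you install.

```shell
#!/bin/sh
# Install the CSI driver first (chart published by Juicedata):
#   helm repo add juicefs https://juicedata.github.io/charts/
#   helm install juicefs-csi-driver juicefs/juicefs-csi-driver -n kube-system

# Write a Secret + PV + PVC for an existing, already formatted volume.
cat > juicefs-static.yaml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
  namespace: default
type: Opaque
stringData:
  name: mydata
  metaurl: redis://redis.default.svc:6379/1   # placeholder in-cluster Redis
  storage: s3
  bucket: https://my-bucket.s3.amazonaws.com
  access-key: REPLACE_ME
  secret-key: REPLACE_ME
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: juicefs-pv
  labels:
    juicefs-name: mydata
spec:
  capacity:
    storage: 10Pi          # required by Kubernetes; placeholder value
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: csi.juicefs.com
    volumeHandle: juicefs-pv
    fsType: juicefs
    nodePublishSecretRef:
      name: juicefs-secret
      namespace: default
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: ""
  resources:
    requests:
      storage: 10Pi
  selector:
    matchLabels:
      juicefs-name: mydata
EOF

# Fill in the credentials, then:
#   kubectl apply -f juicefs-static.yaml
```

Training pods then reference juicefs-pvc under volumes like any other claim. ReadWriteMany is what lets every worker in a distributed job mount the same dataset.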

When to use, when to skip

Use it when training jobs need fast, shared access to large datasets stored in object storage, especially multi-node distributed training where every GPU worker needs the same data. Once the working set is cached, reads no longer pay object-storage latency.

Skip it for pure streaming workloads (video, large sequential reads) where object storage throughput is already sufficient, or when your dataset fits on a single node's local disk. It is also overkill for serving/inference where data access is minimal.

heads up: Cache sizing matters. If your working set exceeds local SSD capacity, reads that miss the cache fall back to object storage latency. Plan cache disks to hold at least the data each node reads in one epoch.
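
The sizing check can be done with back-of-the-envelope arithmetic. The helper below is a sketch: cache_size_mib is a made-up name, the 20% headroom is an arbitrary assumption, and the 10% reserve mirrors what I understand to be the default --free-space-ratio of 0.1.

```shell
#!/bin/sh
# cache_size_mib HOT_SET_GIB DISK_GIB [HEADROOM_PERCENT] [FREE_SPACE_PERCENT]
# Prints a cache budget in MiB, or fails if the disk cannot hold the hot set.
# Hypothetical helper, integer arithmetic only.
cache_size_mib() {
  hot_gib=$1
  disk_gib=$2
  headroom_pct=${3:-20}   # assumed safety margin on top of the hot set
  free_pct=${4:-10}       # share of the disk the cache leaves free

  need_mib=$(( hot_gib * 1024 * (100 + headroom_pct) / 100 ))
  usable_mib=$(( disk_gib * 1024 * (100 - free_pct) / 100 ))

  if [ "$need_mib" -gt "$usable_mib" ]; then
    echo "hot set needs ${need_mib} MiB, only ${usable_mib} MiB usable" >&2
    return 1
  fi
  echo "$need_mib"
}
```

For example, cache_size_mib 500 1000 prints 614400 (a 500 GiB hot set plus 20% headroom on a 1000 GiB disk), which could be passed as the --cache-size value. A 900 GiB hot set on the same disk fails the check, which is the signal to add cache disks or shard the dataset across nodes.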

vs / alongside

Tool | Approach | Note
JuiceFS | POSIX FS on object storage, SSD caching | Best for random-read AI workloads
Alluxio | Data orchestration / caching layer | Java-based, broader Hadoop ecosystem
MinIO | S3-compatible object storage | The data store, not a filesystem
CubeFS | Distributed FS with S3 and POSIX | CNCF, more ops overhead

Verified against JuiceFS docs (juicefs.com), May 2026.

© cvam — written in plaintext, served warm