TL;DR: Apache YuniKorn is a scheduler that takes over from the default Kubernetes scheduler and is built for multi-tenant batch and service workloads. Its defining features are hierarchical queues with elastic quotas and resource fairness, plus gang scheduling and preemption. Its Big Data heritage (Spark, Flink) makes it a strong fit wherever data and ML workloads share one cluster.
What it is
Apache YuniKorn is a universal, Kubernetes-native resource scheduler for big-data and ML workloads. Unlike add-ons that sit beside the default scheduler, YuniKorn is designed to be the scheduler — handling both long-running services and batch jobs through one policy engine. In the AI Native landscape it's in AI Native Infra › Orchestration and Scheduling.
Why it exists
The default scheduler has no concept of queues, tenants, or fairness: it places pods, full stop. Organizations running Spark, Flink, and ML training on shared clusters need org-chart-shaped resource control (department → team → project), with guaranteed and burstable quota at each level. YuniKorn brings to Kubernetes the YARN-style hierarchical-queue model that Big Data teams expect.
How it works
YuniKorn is application-aware: it groups pods into apps and schedules apps against a queue hierarchy. Each queue has a guaranteed (minimum) and a max (cap) resource setting; capacity that a queue is not using flows elastically to its sibling queues and is reclaimed via preemption when the owning queue needs it back. YuniKorn can run as the primary scheduler for all pods.
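The guaranteed/max behaviour described above can be sketched as a toy model. This is an illustration only, not YuniKorn's actual algorithm: the `Queue` class, `borrowable`, and `reclaim_plan` are hypothetical names, resources are reduced to a single integer (say, vcores), and only one level of sibling queues is considered.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    guaranteed: int  # share the queue is always entitled to
    maximum: int     # hard cap, even when the rest of the cluster is idle
    used: int = 0


def borrowable(queue, siblings, parent_capacity):
    """Extra resources `queue` could take right now without preempting anyone."""
    idle = parent_capacity - queue.used - sum(s.used for s in siblings)
    headroom = queue.maximum - queue.used
    return max(0, min(idle, headroom))


def reclaim_plan(owner, siblings, demand, parent_capacity):
    """Amounts to preempt from siblings so `owner` can grow toward its guarantee.

    Only usage above a sibling's own guarantee is eligible, and idle capacity
    is used before anything is preempted.
    """
    idle = parent_capacity - owner.used - sum(s.used for s in siblings)
    entitled = max(0, owner.guaranteed - owner.used)
    shortfall = min(demand, entitled) - idle
    plan = {}
    # Take from the sibling that is furthest above its guarantee first.
    for s in sorted(siblings, key=lambda q: q.used - q.guaranteed, reverse=True):
        if shortfall <= 0:
            break
        over = s.used - s.guaranteed
        if over > 0:
            take = min(over, shortfall)
            plan[s.name] = take
            shortfall -= take
    return plan


# A parent queue with 100 vcores and two children.
training = Queue("training", guaranteed=60, maximum=100, used=20)
etl = Queue("etl", guaranteed=40, maximum=90, used=40)

extra = borrowable(etl, [training], parent_capacity=100)
print(extra)  # 40: etl may burst into capacity that training is not using
etl.used += extra

plan = reclaim_plan(training, [etl], demand=40, parent_capacity=100)
print(plan)   # {'etl': 40}: training reclaims its guarantee via preemption
```

The point of the sketch is the asymmetry: borrowing is limited by idle capacity and the borrower's max, while reclaiming is limited by the owner's guarantee and only touches usage that sits above a sibling's own guarantee.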
Fig 1 — Hierarchical queues: guaranteed + elastic quota cascading down the org tree.
Key features
- Hierarchical queues — nested queues with guaranteed/max quotas and elastic borrowing.
- Resource fairness — fair sharing across tenants and queues.
- Job ordering — FIFO or FAIR ordering within queues.
- Gang scheduling — schedule an app only when its minimum members fit; reduces fragmentation and can trigger proactive scale-up.
- Priority & preemption — higher-priority work reclaims resources.
- Universal — one scheduler for batch and services; pluggable node-sorting policies (e.g. bin-pack vs spread).
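Two of the features above, gang scheduling and node-sorting policies, can be illustrated together in a short sketch. It is a simplification under stated assumptions: `sort_nodes` and `place_gang` are hypothetical helpers, the policy names `"binpack"` and `"spread"` are labels chosen here rather than YuniKorn configuration values, and resources are a single integer per node. YuniKorn's real gang scheduling reserves capacity with placeholder pods, which this sketch does not model.

```python
def sort_nodes(free, policy):
    """Order node names by free capacity.

    "binpack" tries the fullest nodes first (least free capacity),
    "spread" tries the emptiest nodes first (most free capacity).
    """
    return sorted(free, key=lambda node: free[node], reverse=(policy == "spread"))


def place_gang(asks, free, policy="binpack"):
    """Place every member of a gang, or none of them.

    `asks` is a list of per-member resource requests and `free` maps
    node name -> free capacity. Returns {member_index: node} on success
    and None if any member cannot be placed. `free` is never modified.
    """
    trial = dict(free)  # work on a copy so a failed attempt leaves no trace
    placement = {}
    for index, ask in enumerate(asks):
        for node in sort_nodes(trial, policy):
            if trial[node] >= ask:
                trial[node] -= ask
                placement[index] = node
                break
        else:
            return None  # one member does not fit, so the whole gang waits
    return placement


nodes = {"n1": 10, "n2": 4}

print(place_gang([4, 4, 4], nodes, "binpack"))  # {0: 'n2', 1: 'n1', 2: 'n1'}
print(place_gang([4, 4, 4], nodes, "spread"))   # {0: 'n1', 1: 'n1', 2: 'n2'}
print(place_gang([4, 4, 4, 4], nodes))          # None: 16 requested, 14 free
```

The all-or-nothing return value is what avoids the fragmentation case where half of a training job holds resources while waiting for the other half.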
Quick start
Install via Helm; it can take over scheduling for the cluster (or just for pods you target):
helm repo add yunikorn https://apache.github.io/yunikorn-release
helm install yunikorn yunikorn/yunikorn -n yunikorn --create-namespace
kubectl get pods -n yunikorn # scheduler + admission controller
Queues and placement rules live in YuniKorn's configuration (a ConfigMap). An app lands in a queue either through pod metadata, e.g. yunikorn.apache.org/queue: root.team-ml.training, or through a placement rule.
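To make the dotted queue name concrete, the sketch below resolves a path such as root.team-ml.training against a nested queue definition. The dictionary mirrors the partitions/queues/resources layout of YuniKorn's queue configuration as I understand it; treat the field names and the resource values as assumptions to check against the official docs. `find_queue` and `is_leaf` are hypothetical helpers, not part of any YuniKorn API.

```python
# Assumed shape of the queue configuration, written as a Python dict.
CONFIG = {
    "partitions": [{
        "name": "default",
        "queues": [{
            "name": "root",
            "queues": [{
                "name": "team-ml",
                "resources": {
                    "guaranteed": {"vcore": "50", "memory": "100G"},
                    "max": {"vcore": "100", "memory": "200G"},
                },
                "queues": [{"name": "training"}, {"name": "serving"}],
            }],
        }],
    }],
}


def find_queue(config, path, partition="default"):
    """Return the queue dict for a dotted path, or None if it does not exist."""
    part = next((p for p in config["partitions"] if p["name"] == partition), None)
    if part is None:
        return None
    level = part.get("queues", [])
    node = None
    for name in path.split("."):
        node = next((q for q in level if q["name"] == name), None)
        if node is None:
            return None
        level = node.get("queues", [])  # descend one level in the hierarchy
    return node


def is_leaf(queue):
    """A queue with no child queues; apps are submitted to leaf queues."""
    return not queue.get("queues")


target = find_queue(CONFIG, "root.team-ml.training")
print(target is not None and is_leaf(target))       # True
print(is_leaf(find_queue(CONFIG, "root.team-ml")))  # False: a parent queue
print(find_queue(CONFIG, "root.team-data.etl"))     # None: no such queue
```

Each segment of the path names one level of the tree, which is why quota set on team-ml bounds everything submitted to training and serving beneath it.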
When to use, when to skip
Use it when you run mixed Big-Data + ML on shared Kubernetes and want YARN-style hierarchical fairness as the cluster's primary scheduler, especially in Spark/Flink-heavy shops (it is a documented scheduler option for Spark on EMR on EKS). One scheduler covering both batch and services is its big selling point.
Skip it if you only need gang scheduling for training (Volcano is more AI/HPC-focused) or just job queueing/quota layered on the default scheduler (Kueue is lighter and doesn't replace the scheduler).
vs the alternatives
| Tool | Best for | Trade-off |
|---|---|---|
| YuniKorn | Hierarchical queues, batch+service in one scheduler, Big Data | Replaces the scheduler; queue-first model |
| Volcano | AI/HPC gang scheduling + rich placement | More AI-specific, less queue hierarchy |
| Kueue | Quota/queueing on the default scheduler | Controls job admission only; pod placement stays with the default scheduler |
References
- Apache YuniKorn — project site + docs.
- Gang scheduling — the model in detail.
- apache/yunikorn-core — source.
Extra reads
- Batch scheduling compared — YuniKorn vs Volcano vs Kueue.
- Spark on K8s with YuniKorn — gang scheduling in practice.
- YuniKorn for EMR on EKS — a production deployment.
Verified against the official Apache YuniKorn docs (yunikorn.apache.org), May 2026.