TL;DR: Dynamic Resource Allocation (DRA) is the new way Kubernetes allocates specialized hardware, GA since v1.34. Instead of an opaque count (nvidia.com/gpu: 1), a pod files a ResourceClaim describing the device it needs (by model, memory, topology), and the scheduler matches it against the ResourceSlices that vendor drivers publish. It brings fractional sharing, prioritized alternatives, and device health to core Kubernetes.
What it is
DRA is a framework in core Kubernetes for requesting and allocating specialized resources (GPUs, FPGAs, NICs) with rich constraints. It graduated to GA in Kubernetes v1.34 (Aug 2025), with the stable resource.k8s.io/v1 API enabled by default. In the AI Native landscape it's the future-facing piece of AI Native Infra › Accelerator and SuperPod: the model the others are converging on.
Why it exists
The old device-plugin model exposes hardware as a countable integer, with no vocabulary for "an H100 with ≥40 GB, NVLink-connected to its peer." You can't express attributes, can't share fractionally in a first-class way, and can't ask for alternatives. DRA replaces counting with claiming: declarative, attribute-aware requests that the scheduler resolves against the devices vendor drivers advertise.
The objects
| Object | Role |
|---|---|
| DeviceClass | A category of devices + how to select attributes (e.g. "NVIDIA GPU"). Cluster-scoped, admin-defined. |
| ResourceClaim | A pod's request for device(s) matching constraints. The thing that gets allocated. |
| ResourceClaimTemplate | Stamps out a per-pod ResourceClaim (like a PVC template). |
| ResourceSlice | Driver-published inventory of the devices available in a pool, which is what the scheduler matches against. |
Fig 1 — A claim describes the need; the scheduler matches it against driver-published slices.
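To make the objects concrete, here is a minimal sketch of a DeviceClass, a ResourceClaim, and a ResourceSlice in the resource.k8s.io/v1 API. The driver name `gpu.example.com`, the class name `example-gpu`, the node name, and the `model` attribute and `memory` capacity are all hypothetical; a real driver defines its own names, and it publishes ResourceSlices itself rather than having you write them by hand.

```yaml
# Admin-defined category: every device published by the (hypothetical) driver.
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: example-gpu
spec:
  selectors:
  - cel:
      expression: device.driver == "gpu.example.com"
---
# A request for one device from that class with at least 40Gi of memory.
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: big-gpu
  namespace: default
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: example-gpu
        selectors:
        - cel:
            expression: device.capacity["gpu.example.com"].memory.compareTo(quantity("40Gi")) >= 0
---
# Inventory as a driver might publish it (shown only to illustrate the shape).
apiVersion: resource.k8s.io/v1
kind: ResourceSlice
metadata:
  name: node-1-gpu-example
spec:
  driver: gpu.example.com
  nodeName: node-1
  pool:
    name: node-1
    generation: 1
    resourceSliceCount: 1
  devices:
  - name: gpu-0
    attributes:
      model:
        string: large        # hypothetical attribute
    capacity:
      memory:
        value: 80Gi          # hypothetical capacity
```

Unqualified attribute and capacity names in a slice are scoped to the driver's domain, which is why the claim's CEL expression reads them through `device.capacity["gpu.example.com"]`.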
What v1.34 brought
- Consumable Capacity (alpha in v1.34, behind the DRAConsumableCapacity feature gate): first-class fractional sharing, e.g. allocating 10 GiB of a 40 GiB GPU safely across pods/namespaces.
- Prioritized devices — list acceptable alternatives in order (one H100, else two mid GPUs); the scheduler tries them in turn.
- Device health status — a device's health surfaces in Pod status (for DRA and device-plugin devices), so failures are diagnosable.
- Vendor drivers — NVIDIA donated its DRA GPU driver and Google its TPU driver to the community.
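Two of the items above change what a claim can say. The sketch below shows a prioritized request using `firstAvailable` (one large device, else two medium ones) and then a capacity-sized request for a slice of a shared device. The class name `example-gpu`, the driver domain `gpu.example.com`, and the `model` values are hypothetical. The capacity request assumes the DRAConsumableCapacity feature gate is enabled and that the driver marks the device as shareable; treat its field names as an assumption to check against the API reference for your cluster version.

```yaml
# Prioritized alternatives: the scheduler tries subrequests in order.
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: gpu-with-fallback
spec:
  spec:
    devices:
      requests:
      - name: gpu
        firstAvailable:
        - name: one-large
          deviceClassName: example-gpu
          selectors:
          - cel:
              expression: device.attributes["gpu.example.com"].model == "large"
        - name: two-medium
          deviceClassName: example-gpu
          count: 2
          selectors:
          - cel:
              expression: device.attributes["gpu.example.com"].model == "medium"
---
# Fractional sharing: ask for 10Gi of a device's memory capacity
# (assumes the consumable-capacity feature is enabled; field names unverified).
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: gpu-slice-10g
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: example-gpu
          capacity:
            requests:
              memory: 10Gi
```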
Quick start
On v1.34+ the API is on by default; you install a vendor DRA driver, then a pod references a claim built from a DeviceClass:
# pod references a ResourceClaimTemplate; the driver + scheduler resolve it
spec:
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: single-gpu
The ResourceClaimTemplate selects a DeviceClass (e.g. NVIDIA GPUs) and adds constraints; the driver publishes ResourceSlices the scheduler matches.
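A fuller version of the quick start might look like the following sketch: the `single-gpu` template plus a Pod whose container consumes the claim. The DeviceClass name `example-gpu`, the Pod name, and the image are hypothetical; the DeviceClass is assumed to be installed by the vendor driver or a cluster admin.

```yaml
# Template from which a per-pod ResourceClaim is generated.
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: example-gpu   # hypothetical class name
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo
spec:
  containers:
  - name: app
    image: registry.example.com/cuda-app:latest   # hypothetical image
    resources:
      claims:
      - name: gpu            # must match an entry in spec.resourceClaims
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: single-gpu
```

The pod-level `resourceClaims` entry only declares the claim; a container gets access to the device by listing the same name under `resources.claims`.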
When to use, when to skip
Use it on modern clusters (v1.34+) where you need attribute-aware allocation, fractional sharing, topology, or multi-vendor accelerators — it's the strategic direction and what new tooling targets. NVIDIA/Google driver donations signal broad consensus.
Hold off if you're on older Kubernetes, your platform/cloud hasn't shipped DRA drivers yet, or the simple device plugin already meets your needs. Migration is gradual — the device plugin still works.
DRA vs the old way
| Model | Request looks like | Note |
|---|---|---|
| DRA | ResourceClaim with attributes/constraints | Rich, fractional, GA v1.34 |
| Device Plugin | Opaque count: nvidia.com/gpu: 1 | Simple, no attributes |
| HAMi | Resource fields for slices | Software sharing; integrates with DRA |
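For contrast with the DRA row, the device-plugin request in the table is just an extended-resource count on the container. A minimal sketch, with a hypothetical Pod name and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-count-demo
spec:
  containers:
  - name: app
    image: registry.example.com/cuda-app:latest   # hypothetical image
    resources:
      limits:
        nvidia.com/gpu: 1    # opaque count: no model, memory, or topology
```

Nothing in this manifest can say which GPU is acceptable, which is the gap a ResourceClaim's selectors fill.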
References
- DRA concepts — official docs.
- DRA graduates to GA (v1.34) — the GA announcement.
- Allocate devices with DRA — hands-on task.
Extra reads
- DRA on GKE — managed-cloud view.
- Introduction to DRA — concepts walkthrough.
- DRA in v1.33 — the run-up to GA.
Verified against kubernetes.io DRA docs, May 2026. GA as of v1.34 (resource.k8s.io/v1).