// AI NATIVE STACK

CRASH COURSE · AI-NATIVE · advanced · 10 min read · k8s plugin

SR-IOV — hardware-virtualized NICs that give pods bare-metal network speed.

network ai-native sr-iov rdma kubernetes

TL;DR — SR-IOV (Single Root I/O Virtualization) splits one physical NIC into multiple Virtual Functions (VFs), each passthrough-attached to a pod. The pod gets near-bare-metal network performance — no software switch overhead — which is critical for RDMA-based GPU collective communication. On Kubernetes, the SR-IOV Network Operator + device plugin + CNI automate VF creation, discovery, and pod attachment.

What it is

SR-IOV is a PCI-SIG specification for PCI Express devices that lets a single physical network adapter present itself as multiple virtual adapters (VFs). Each VF can be passed directly into a container or VM, bypassing the host's software switching path (bridges, veth pairs, overlays). The SR-IOV Network Operator for Kubernetes automates the entire lifecycle: enabling SR-IOV on NICs, creating VFs, advertising them as schedulable resources, and attaching them to pods via Multus. In the AI Native landscape it's in AI Native Infra › Network.

Why it exists

GPU training with NCCL all-reduce needs maximum network bandwidth and minimum latency between nodes. Standard pod networking (veth pairs, bridge, overlay) adds microseconds of latency and caps throughput. SR-IOV VFs bypass all of that: the NIC DMAs packets directly into buffers owned by the pod. Combined with RDMA, this is how GPU clusters achieve 200–400 Gbps inter-node communication for distributed training.

[Figure: pod A and pod B each attach to their own VF (VF 0, VF 1) on one physical NIC (PF), labeled ConnectX-7 / E810, 200/400 Gbps.]

Fig 1 — Each pod gets a VF passthrough-attached; traffic bypasses the host kernel stack.

How it works

The SR-IOV Network Operator runs a config daemon on each node and: (1) configures the PF (Physical Function) to create VFs, (2) runs the SR-IOV device plugin to advertise VFs as extended resources (e.g., openshift.io/sriov_rdma, using the operator's default resource prefix), and (3) provides the SR-IOV CNI plugin that Multus invokes to move a VF into the pod's network namespace. Pods request VFs the way they request GPUs, as a resource limit, and get a hardware-backed interface of their own.
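
To make step (3) more concrete, here is a rough sketch of the kind of NetworkAttachmentDefinition that Multus reads when a pod asks for a secondary network. The operator normally generates this object for you, so treat it as an illustration rather than something to apply by hand. The name sriov-rdma-net and the resource openshift.io/sriov_rdma mirror the quick start; the namespace, cniVersion, and IPAM subnet are assumptions chosen for the example.

```yaml
# Illustrative only: roughly what the operator renders for Multus.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-rdma-net
  namespace: default                  # assumed pod namespace
  annotations:
    # Ties this network to the VF pool advertised by the device plugin,
    # so the allocated VF is the one handed to the SR-IOV CNI.
    k8s.v1.cni.cncf.io/resourceName: openshift.io/sriov_rdma
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "sriov-rdma-net",
      "type": "sriov",
      "ipam": {
        "type": "host-local",
        "subnet": "192.168.100.0/24"
      }
    }
```

The resourceName annotation is the link between scheduling and plumbing: the device plugin allocates a specific VF to the pod, and the CNI plugin of type sriov moves that same VF into the pod's network namespace.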

Key features

  • Near-bare-metal performance — VF passthrough bypasses veth, bridge, overlay. Direct DMA.
  • RDMA capable — VFs support RoCE/InfiniBand for GPU collective ops.
  • Operator-managed — SR-IOV Network Operator automates VF lifecycle, driver binding, and config.
  • Schedulable resource — Kubernetes scheduler places pods on nodes with available VFs.
  • Works with Multus — secondary interface alongside primary CNI.

Quick start

Install the SR-IOV Network Operator, define a policy, and request VFs in pods:

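# install the operator; OLM and the Helm chart are the documented routes, so confirm this manifest URL exists for your release before relying on it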
kubectl apply -f https://github.com/k8snetworkplumbingwg/sriov-network-operator/releases/latest/download/sriov-network-operator.yaml
# SriovNetworkNodePolicy: create 8 VFs on ens3f0
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: rdma-policy
  namespace: sriov-network-operator   # assumed operator namespace; adjust to your install
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8
  nicSelector:
    pfNames: ["ens3f0"]
  deviceType: netdevice
  isRdma: true
  resourceName: sriov_rdma            # advertised to pods as openshift.io/sriov_rdma
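# pod spec: attach the secondary network and request one VF
apiVersion: v1
kind: Pod
metadata:
  name: rdma-worker                   # hypothetical name
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-rdma-net
spec:
  containers:
  - name: worker                      # hypothetical name
    image: registry.example.com/trainer:latest   # placeholder image
    resources:
      limits:
        openshift.io/sriov_rdma: "1"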
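
The pod annotation refers to a network called sriov-rdma-net, which has to exist before the pod can attach to it. With the operator, one way to define it is a SriovNetwork object, from which the operator renders the matching NetworkAttachmentDefinition. A minimal sketch, assuming the operator runs in the sriov-network-operator namespace and the pods run in default; the subnet and the host-local IPAM choice are placeholders, not recommendations:

```yaml
# SriovNetwork: defines the sriov-rdma-net attachment used by the pod.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriov-rdma-net
  namespace: sriov-network-operator   # assumed operator namespace
spec:
  resourceName: sriov_rdma            # must match the node policy's resourceName
  networkNamespace: default           # namespace where the pods will run (assumed)
  ipam: |
    {
      "type": "host-local",
      "subnet": "192.168.100.0/24"
    }
```

Note that host-local hands out addresses per node, so multi-node clusters typically swap in a cluster-wide IPAM plugin such as whereabouts to avoid duplicate addresses.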

When to use, when to skip

Use it on bare-metal or VM-passthrough GPU clusters where NCCL/RDMA performance is critical, such as distributed training at scale. SR-IOV + RDMA is a common network stack for multi-node GPU training on Kubernetes.

Skip it for inference-only clusters, single-node training, or cloud-managed Kubernetes where the provider handles GPU networking. It is also not an option if your NIC doesn't support SR-IOV.

Heads up: SR-IOV requires hardware support, meaning an SR-IOV-capable NIC with SR-IOV and the IOMMU enabled in BIOS/firmware. VFs are finite, so plan your VF count per PF to match the number of pods you'll schedule per node.

vs / alongside

Tool            | Role                        | Note
SR-IOV          | Hardware NIC virtualization | VF passthrough for bare-metal speed
macvlan/ipvlan  | Software sub-interfaces     | Simpler, lower performance
Multus          | Multi-NIC orchestrator      | Uses SR-IOV CNI as a secondary plugin
RDMA            | Zero-copy protocol          | Runs over SR-IOV VFs

Verified against SR-IOV Network Operator docs (github.com/k8snetworkplumbingwg), May 2026.

© cvam — written in plaintext, served warm