// AI NATIVE STACK

CRASH COURSE · AI-NATIVE · intermediate · 11 min read · CNCF Graduated

Cilium — eBPF-powered networking, security, and observability for Kubernetes.

TL;DR — Cilium is the CNCF Graduated CNI plugin that uses eBPF to provide networking, network policy, load balancing, and observability for Kubernetes — all in the kernel, no iptables. For AI workloads it matters because it handles the management-plane networking (pod-to-pod, pod-to-API, ingress) with high performance and fine-grained security, while RDMA/SR-IOV handles the data-plane GPU-to-GPU traffic.

What it is

Cilium is an open-source Kubernetes CNI (Container Network Interface) plugin built on eBPF. It can replace kube-proxy and iptables-based service handling with eBPF programs that run directly in the Linux kernel, providing networking, L3/L4/L7 security policies, transparent encryption, service mesh, and deep observability (via Hubble). It's a CNCF Graduated project and is built into several managed offerings: GKE Dataplane V2 is based on Cilium, AKS offers Azure CNI Powered by Cilium, and EKS Anywhere ships Cilium as its default CNI (standard EKS defaults to the Amazon VPC CNI). In the AI Native landscape it's in AI Native Infra › Network.

Why it exists

Traditional Kubernetes networking relies on iptables, whose rule chains are evaluated sequentially, so it becomes a bottleneck as services and pods multiply, and it gives you almost no visibility into what's happening. Cilium moves networking logic into eBPF programs in the kernel: faster packet processing, identity-based security (not just IP-based), and deep per-flow observability without sidecar proxies. On AI clusters, where thousands of pods do parameter syncs and API calls, this efficiency and visibility matter.

[Diagram: pods (TCP/HTTP traffic) → Cilium (eBPF): CNI, L3/L4/L7 policy, load balancing, encryption, Hubble observability → services / ingress, external world]

Fig 1 — Cilium provides the management-plane networking for all pod traffic via eBPF in the kernel.

How it works

Cilium installs as a DaemonSet. On each node, it attaches eBPF programs to network interfaces and socket hooks. These programs handle routing, NAT, load balancing, and policy enforcement — all without leaving the kernel. Identities are assigned to pods based on labels (not just IPs), and policies reference those identities. Hubble, the observability layer, taps the same eBPF data path to give per-flow visibility.
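To make the identity model concrete, here is a minimal sketch of a label-based policy. The labels (app: gateway, app: inference-api), the namespace, the port, and the file name are hypothetical and chosen only for illustration; the script just writes the manifest to disk so you can review it before applying it.

```shell
#!/usr/bin/env bash
# Sketch: write a CiliumNetworkPolicy that selects pods by label, not by IP.
# All names, labels, and the port below are hypothetical examples.
policy_file="allow-gateway-to-inference.yaml"

cat > "$policy_file" <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-gateway-to-inference
  namespace: default
spec:
  # The policy applies to pods carrying this label.
  endpointSelector:
    matchLabels:
      app: inference-api
  ingress:
    # Only pods with the gateway label may connect, and only on TCP 8080.
    - fromEndpoints:
        - matchLabels:
            app: gateway
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
EOF

echo "wrote $policy_file; apply with: kubectl apply -f $policy_file"
```

Because the selectors are labels, the policy keeps matching the same workloads when pods are rescheduled and their IPs change.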

Key features

  • eBPF datapath — replaces iptables and kube-proxy; faster at scale.
  • Identity-based security — network policies based on pod labels, namespaces, DNS, not just IPs.
  • L7 visibility — HTTP, gRPC, Kafka protocol-aware policies and observability.
  • Hubble — real-time network observability, flow logs, service dependency maps.
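  • Transparent encryption — WireGuard or IPsec between nodes, enabled through Helm values with no application changes (IPsec also needs a key Secret).
  • Gateway API — native support for Kubernetes Gateway API for ingress and traffic routing; egress is handled by the separate Egress Gateway feature.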
  • Multi-cluster — ClusterMesh for cross-cluster pod connectivity and service discovery.
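As an illustration of the protocol-aware policies listed above, the sketch below writes a CiliumNetworkPolicy that adds an HTTP rule on top of an L4 port rule. The labels, port, path, and file name are hypothetical; treat it as a starting point under those assumptions, not a recommended production policy.

```shell
#!/usr/bin/env bash
# Sketch: an L7 (HTTP-aware) CiliumNetworkPolicy.
# Labels, port, path, and file name are hypothetical examples.
l7_policy_file="inference-http-readonly.yaml"

cat > "$l7_policy_file" <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: inference-http-readonly
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: inference-api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: gateway
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          # HTTP rules narrow the allowed traffic to specific requests.
          rules:
            http:
              - method: "GET"
                path: "/v1/models"
EOF

echo "wrote $l7_policy_file; apply with: kubectl apply -f $l7_policy_file"
```

With a rule like this, requests on the same port that use another method or path are rejected at the HTTP layer, and Hubble can report them as L7 flows.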

Quick start

Install via Helm on a Kubernetes cluster:

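# Add the official Cilium chart repository and refresh the local index.
helm repo add cilium https://helm.cilium.io/
helm repo update

# Install with kube-proxy replacement and Hubble (relay + UI) enabled.
# If the cluster runs without kube-proxy, also set k8sServiceHost and
# k8sServicePort to the API server address.
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true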

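After install, pod networking flows through Cilium's eBPF datapath (pods that were already running before the install may need a restart to be managed by Cilium). With the Hubble CLI installed and the relay reachable, use hubble observe to watch live traffic flows.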
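A post-install check might look like the sketch below. It assumes the cilium and hubble CLIs are installed locally and that Hubble relay was enabled at install time; the function names require_clis and verify_cilium are made up for this example. The script only defines the functions, so nothing touches the cluster until you call verify_cilium.

```shell
#!/usr/bin/env bash
# Sketch: verify a Cilium install and look at recent dropped flows.
# Function names are hypothetical; the CLI commands need a live cluster.

require_clis() {
  # Return non-zero if any CLI this sketch relies on is missing from PATH.
  local cli missing=0
  for cli in "$@"; do
    if ! command -v "$cli" >/dev/null 2>&1; then
      echo "missing CLI: $cli" >&2
      missing=1
    fi
  done
  return "$missing"
}

verify_cilium() {
  require_clis cilium hubble || return 1
  # Block until the Cilium components report ready.
  cilium status --wait
  # Forward the Hubble relay port to localhost in the background.
  cilium hubble port-forward &
  local pf_pid=$!
  sleep 2
  # Show the 20 most recent dropped flows in the given namespace.
  hubble observe --namespace "${1:-default}" --verdict DROPPED --last 20
  kill "$pf_pid"
}
```

Call it as verify_cilium default (or pass another namespace). Filtering on dropped flows is a quick way to see whether a new network policy is blocking traffic you expected to pass.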

When to use, when to skip

Use it as the CNI for your AI cluster's management-plane networking — pod-to-pod, pod-to-service, API traffic, ingress. It handles the TCP/IP layer efficiently and gives you security policies and visibility. On GPU clusters, Cilium manages the control/management traffic while RDMA and SR-IOV handle the high-bandwidth GPU-to-GPU data plane.

Skip it only if your cluster already has a CNI you're happy with and you don't need the eBPF benefits. Cilium doesn't replace RDMA/InfiniBand for GPU collective traffic; that's a different network entirely.

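Heads up: Cilium's minimum Linux kernel version depends on the release. Older releases ran on 4.19, while recent releases require a newer kernel (5.10+ is a safe baseline for full features). On managed clouds it's usually the default or one-click; on bare metal, verify kernel support against the system requirements for your Cilium version.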
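On bare metal, a quick pre-flight check of the node kernel can be sketched as follows. The 5.10 threshold is an assumption taken from the recommendation above, so adjust it to your Cilium release; version_ge and kernel_base are hypothetical helper names, and the comparison relies on sort -V being available (GNU coreutils).

```shell
#!/usr/bin/env bash
# Sketch: compare the running kernel against an assumed minimum version.

# version_ge A B: succeed when dotted version A is >= B (uses sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip distro suffixes such as "-91-generic" from a kernel release string.
kernel_base() {
  printf '%s\n' "${1%%-*}"
}

min_kernel="5.10"   # assumption: set this to the minimum for your Cilium release
current="$(kernel_base "$(uname -r)")"

if version_ge "$current" "$min_kernel"; then
  echo "kernel $current meets the assumed minimum $min_kernel"
else
  echo "kernel $current is older than the assumed minimum $min_kernel"
fi
```

Run it on each node (or through your provisioning tooling) before installing. It checks only the version number, not whether the kernel was built with the required eBPF options.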

vs / alongside

Tool     Role                          Note
Cilium   eBPF CNI for K8s networking   Management plane, CNCF Graduated
Calico   CNI with iptables or eBPF     Mature alternative, different approach
Multus   Multi-NIC pods                Adds secondary interfaces alongside Cilium
RDMA     GPU-to-GPU data plane         Different layer entirely

Verified against Cilium docs (docs.cilium.io), May 2026.

© cvam — written in plaintext, served warm