Deep dive 5 of the KubeCon Mumbai 2026 series. Guna K Kambalimath and Kishen V (IBM) went under the hood of Kubernetes operators to answer a question most operator authors never consciously decide: which resources should your controller actually watch and cache — and which should it deliberately ignore? Watch too much and you get ballooning memory, goroutine leaks, log storms, and dangerous over-privilege. The talk is a masterclass in the controller-runtime cache, field/label selectors, predicates, MaxConcurrentReconciles, and shift-left security with Validating Admission Policies — all illustrated with benchmarks and a running "AI log analyzer" operator.
If deep dive 04 showed the friendly abstractions developers see, this one is what's underneath: the operators that implement those abstractions, and the very real performance and security cliffs you fall off if you build them naively. It's the most code-heavy talk in the platform cluster, and the most quietly important — these are the bugs that take down operators in production.
The analogy — every controller is a reconcile loop
The grounding analogy was the built-in Deployment controller: it watches Deployments and the Pods/ReplicaSets they own, receives events (e.g. "pod deleted"), and runs a reconcile loop — does the desired state hold (are there n replicas)? If yes, do nothing; if no, create the pods needed. Every operator is a variation on this loop. The design question the talk drills into is the input side: what feeds events into that loop, and at what cost?
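As a minimal, stdlib-only sketch of that loop (not the real Deployment controller; the function name and the replica numbers are made up for illustration), the decision reduces to comparing desired state with observed state:

```go
package main

import "fmt"

// podsToCreate is the heart of the loop: compare desired state with observed
// state and return how many pods are missing (never negative).
func podsToCreate(desired, observed int) int {
	if observed >= desired {
		return 0 // desired state holds: do nothing
	}
	return desired - observed
}

func main() {
	desired, observed := 3, 3
	for _, event := range []string{"pod deleted", "periodic resync"} {
		if event == "pod deleted" {
			observed-- // the event changes the world, then triggers the loop
		}
		missing := podsToCreate(desired, observed)
		fmt.Printf("event=%q observed=%d create=%d\n", event, observed, missing)
		observed += missing // act to converge on the desired state
	}
}
```

Note that the event only triggers the loop; the decision is always made from current state, which is why what feeds the loop matters more than what the loop does.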
The running example — an AI log analyzer operator
The talk used a custom AILogAnalyzer operator throughout. Its anatomy maps to the three categories of resource every operator deals with:
- Primary resource (the CR): AILogAnalyzer, the custom resource the operator owns, specifying desired state (scan_interval: 15m, alert_channel, model), backed by a CRD.
- Watched secondary resources: things it observes but doesn't own, e.g. a security-backend Deployment (labelled app: security) whose logs it analyzes.
- Owned secondary resources: things it creates and controls via ownerReferences, e.g. a security-analyzer-engine Deployment it spins up.
The reconcile logic: get the pods of a watched deployment (from cache), pull the last 15 minutes of logs, call an AI endpoint, return a critical incident if any, trigger alerts, and update the CR's status on failure. Simple enough — until you ask what it costs to watch all those resources.
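The flow above could be sketched roughly as follows. This is not the speakers' code: the Deps struct, its function fields, fakeDeps, and the status strings are all hypothetical stand-ins so the steps can be read without a cluster or an AI endpoint.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const scanWindow = 15 * time.Minute // mirrors scan_interval: 15m

// Deps bundles the operator's collaborators as plain functions (all hypothetical).
type Deps struct {
	ListPods  func() []string // served from the cache, not the API server
	FetchLogs func(pod string, since time.Duration) (string, error)
	Analyze   func(logs string) (incident string, err error) // AI endpoint
	Alert     func(incident string) error
}

// reconcileOnce walks the steps and returns the status to write to the CR.
func reconcileOnce(d Deps) string {
	for _, pod := range d.ListPods() {
		logs, err := d.FetchLogs(pod, scanWindow)
		if err != nil {
			return "Failed: fetch logs: " + err.Error()
		}
		incident, err := d.Analyze(logs)
		if err != nil {
			return "Failed: analyze: " + err.Error()
		}
		if incident == "" {
			continue // nothing critical in this pod's logs
		}
		if err := d.Alert(incident); err != nil {
			return "Failed: alert: " + err.Error()
		}
	}
	return "OK"
}

// fakeDeps returns in-memory stand-ins; failAnalyze simulates an AI endpoint outage.
func fakeDeps(failAnalyze bool) Deps {
	return Deps{
		ListPods:  func() []string { return []string{"security-backend-0"} },
		FetchLogs: func(pod string, since time.Duration) (string, error) { return "login failed x50", nil },
		Analyze: func(logs string) (string, error) {
			if failAnalyze {
				return "", errors.New("endpoint unavailable")
			}
			return "possible brute-force attempt", nil
		},
		Alert: func(incident string) error { return nil },
	}
}

func main() {
	fmt.Println(reconcileOnce(fakeDeps(false)))
	fmt.Println(reconcileOnce(fakeDeps(true)))
}
```

Every call to ListPods here is free only because it reads from the cache, which is exactly the cost the rest of the talk examines.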
Zooming into the cache — and the four ways to narrow the watch
The four levers to narrow the watch:
- Field and Label Selectors: only cache objects matching specific fields (e.g. status.phase=Running) or labels (e.g. app: security).
- Namespace scoping: only watch the namespaces you care about, not the whole cluster.
- Sync Period: how often the cache does a full resync (a periodic re-reconcile of everything).
- Predicates: code-level filters that drop events before they reach the work queue.
Fig 1 — events flow API → selector → cache → predicate → work queue → reconciler. Filter early, filter cheap.
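To make the ordering in Fig 1 concrete, here is a small stdlib-only model of the pipeline. It does not use controller-runtime; the Event type and the "paused" label rule are assumptions invented for the example. The point it illustrates is that a selector decides what is cached, while a predicate only decides what is enqueued.

```go
package main

import "fmt"

// Event is a simplified stand-in for a watch event on a Pod.
type Event struct {
	Name   string
	Labels map[string]string
	Phase  string
}

// selectorMatches models a label + field selector: applied before caching.
func selectorMatches(e Event) bool {
	return e.Labels["app"] == "security" && e.Phase == "Running"
}

// predicateAllows models a predicate: applied to cached events before queueing.
// Hypothetical rule: skip objects explicitly marked as paused.
func predicateAllows(e Event) bool {
	return e.Labels["paused"] != "true"
}

// filter returns the names that were cached and the names that were enqueued.
func filter(events []Event) (cached, queued []string) {
	for _, e := range events {
		if !selectorMatches(e) {
			continue // never enters memory
		}
		cached = append(cached, e.Name)
		if predicateAllows(e) {
			queued = append(queued, e.Name)
		}
	}
	return cached, queued
}

func main() {
	events := []Event{
		{Name: "sec-1", Labels: map[string]string{"app": "security"}, Phase: "Running"},
		{Name: "sec-2", Labels: map[string]string{"app": "security", "paused": "true"}, Phase: "Running"},
		{Name: "sec-3", Labels: map[string]string{"app": "security"}, Phase: "Pending"},
		{Name: "pay-1", Labels: map[string]string{"app": "payments"}, Phase: "Running"},
	}
	cached, queued := filter(events)
	fmt.Println("cached:", cached)
	fmt.Println("queued:", queued)
}
```

In this model sec-2 still costs memory even though it is never reconciled, which is why the selector is the cheaper place to filter when the rule can be expressed there.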
"Don't cache the noise" — selectors in code
The first concrete technique: scope the manager's cache with label and field selectors so unrelated objects never enter memory at all.
// Keep only objects whose "app" label is one of the values the operator cares about.
requirement, err := labels.NewRequirement("app", selection.In,
	[]string{"IAM", "security-backend", "log-analyzer"})
if err != nil {
	panic(err) // invalid label key or values
}
labelSelector := labels.NewSelector().Add(*requirement)

mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
	Cache: cache.Options{
		DefaultLabelSelector: labelSelector,
		DefaultFieldSelector: fields.SelectorFromSet(fields.Set{
			"metadata.namespace": "iam-security",
		}),
		ByObject: map[client.Object]cache.ByObject{
			&corev1.Pod{}: { // per-resource override
				Field: fields.SelectorFromSet(fields.Set{"status.phase": "Running"}),
			},
		},
		SyncPeriod: ptr.To(20 * time.Second), // full resync interval
	},
})
And a predicate to drop events the selector can't express, before they reach the reconciler:
func ignoreSpecificLabels() predicate.Predicate {
	return predicate.NewPredicateFuncs(func(object client.Object) bool {
		objLabels := object.GetLabels() // named to avoid shadowing the labels package
		if objLabels == nil {
			return false // no labels → don't reconcile
		}
		if appLabel, ok := objLabels["app"]; ok {
			if appLabel == "IAM" || appLabel == "security-backend" {
				return false // block these
			}
		}
		return true // allow everything else
	})
}
The three payoffs the talk named:
- Zero-waste memory — unready/pending metadata never enters the cache.
- Reduction in API noise — intermediate lifecycle chatter is muted, so the operator reconciles only when it truly needs to.
- Protection from log storming — the work queue is insulated from application churn and the endless retry loops of crashing/unavailable workloads.
The benchmark — what ballooning memory actually looks like
This was the slide that justified all the fiddly selector code. Benchmarking an operator watching 500 labeled vs 500 unlabeled resources while scaling StatefulSets to 1,000:
| Metric | Without selector | With selector |
|---|---|---|
| Heap object count (GOGC=off, 1000 STS) | ~900,000+ | ~640,000 |
| Heap memory in use (GOGC=off, 1000 STS) | ~118 MB | ~79 MB |
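
As a quick check on what the table implies, the relative savings can be worked out from the two columns. The figures are the approximate values quoted above (the object count is a lower bound, so the real reduction there may be somewhat larger); the helper name is made up.

```go
package main

import "fmt"

// reductionPct returns the percentage drop from before to after.
func reductionPct(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	fmt.Printf("heap memory in use: about %.0f%% lower\n", reductionPct(118, 79))
	fmt.Printf("heap object count: about %.0f%% lower\n", reductionPct(900_000, 640_000))
}
```

That works out to roughly a third less heap memory and a bit under 30% fewer heap objects for this benchmark.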
More watches, longer queues — and the goroutine bill
Memory isn't the only cost. Each watched resource type spins up machinery, and the talk measured the goroutine growth precisely:
| What you watch | Goroutines |
|---|---|
| Baseline (For(&CustomApp{})) | 83 |
| + Watches(Deployment) | 93 (+10) |
| + Watches(StatefulSet) | 103 (+10) |
| + Watches(PersistentVolumeClaim) | 123 (+20) |
And concurrency multiplies it — MaxConcurrentReconciles adds worker goroutines per controller. The talk's point: tune concurrency by resource criticality, not uniformly. Don't use a one-size-fits-all queue; give a critical resource more concurrent reconciles and a noisy-but-unimportant one fewer.
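The trade-off can be modelled without controller-runtime as a bounded worker pool, which is loosely what a per-controller concurrency limit amounts to. The sketch below is an illustration under that assumption, not the library's implementation; drain and its parameters are invented names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// drain processes every key with at most maxConcurrent workers, loosely
// modelling one controller's work queue. It returns the number of keys
// processed and the highest number of reconciles seen running at once.
func drain(keys []string, maxConcurrent int, reconcile func(string)) (processed, peak int64) {
	queue := make(chan string)
	var running int64
	var wg sync.WaitGroup
	for i := 0; i < maxConcurrent; i++ {
		wg.Add(1)
		go func() { // one worker goroutine per concurrent reconcile
			defer wg.Done()
			for key := range queue {
				now := atomic.AddInt64(&running, 1)
				for { // record a new peak if this worker pushed it higher
					old := atomic.LoadInt64(&peak)
					if now <= old || atomic.CompareAndSwapInt64(&peak, old, now) {
						break
					}
				}
				reconcile(key)
				atomic.AddInt64(&processed, 1)
				atomic.AddInt64(&running, -1)
			}
		}()
	}
	for _, k := range keys {
		queue <- k
	}
	close(queue)
	wg.Wait()
	return processed, peak
}

func main() {
	keys := []string{"a", "b", "c", "d", "e", "f"}
	for _, workers := range []int{1, 4} { // noisy resource vs critical resource
		processed, peak := drain(keys, workers, func(string) {})
		fmt.Printf("workers=%d processed=%d within limit=%v\n",
			workers, processed, peak <= int64(workers))
	}
}
```

Raising the limit buys throughput for a critical resource at the price of extra goroutines and more parallel load on whatever the reconciler calls, so a noisy, low-value resource is a good candidate for a limit of 1.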
The goroutine leak that scales linearly with CRs
The scariest slide: a leak caused by creating a client connection inside the reconcile loop without closing it.
func (r *CustomReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var cr myapi.CustomApp
	if err := r.Get(ctx, req.NamespacedName, &cr); err != nil { ... }
	client, err := externalapi.NewClient(cr.Spec.Endpoint) // new conn each reconcile
	if err != nil {
		return ctrl.Result{}, err
	}
	// defer client.Close() ← MISSING: goroutines pile up forever
	...
}
The measured effect: goroutines climbing 83 → 239 → 395 → 551 as CRs increased — an unbounded leak. The fix (one-time client init, reused across reconciles, or a proper defer client.Close()) flattened it to 83 → 98 → 98 → 98 — a one-time initialization cost independent of CR volume.
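The shape of that leak, and of both fixes, can be reproduced with a stdlib-only model. The fakeClient below is a hypothetical stand-in for an external API client that starts one background goroutine per instance; the real client in the talk was not shown, so its internals are an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeClient stands in for an external API client that starts a background
// goroutine (for example a keep-alive loop) when it is created.
type fakeClient struct {
	stop chan struct{}
	done chan struct{}
}

var (
	mu   sync.Mutex
	live int // background goroutines currently running
)

func newFakeClient() *fakeClient {
	c := &fakeClient{stop: make(chan struct{}), done: make(chan struct{})}
	mu.Lock()
	live++
	mu.Unlock()
	go func() {
		defer close(c.done)
		<-c.stop // runs until Close is called
		mu.Lock()
		live--
		mu.Unlock()
	}()
	return c
}

// Close stops the background goroutine and waits for it to exit.
func (c *fakeClient) Close() {
	close(c.stop)
	<-c.done
}

func liveGoroutines() int {
	mu.Lock()
	defer mu.Unlock()
	return live
}

// leakyReconcile creates a client per call and never closes it.
func leakyReconcile() { _ = newFakeClient() }

// closingReconcile creates a client per call but always releases it.
func closingReconcile() {
	c := newFakeClient()
	defer c.Close()
}

// sharedReconcile reuses one long-lived client created at startup.
func sharedReconcile(shared *fakeClient) { _ = shared }

func main() {
	for i := 0; i < 100; i++ {
		leakyReconcile()
	}
	fmt.Println("after 100 leaky reconciles:", liveGoroutines())

	before := liveGoroutines()
	shared := newFakeClient()
	for i := 0; i < 100; i++ {
		closingReconcile()
		sharedReconcile(shared)
	}
	fmt.Println("added by 100 fixed reconciles:", liveGoroutines()-before)
}
```

The leaky version grows by one goroutine per reconcile, while the fixed versions add a single goroutine for the shared client regardless of how many reconciles run.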
Never leak in Reconcile. The reconcile loop runs constantly; anything you allocate without releasing leaks at the rate of your event stream. Either reuse a long-lived client or defer Close() religiously. This one mistake quietly takes down more operators than any algorithmic flaw.
Security: over-privileged controllers and shift-left enforcement
The final act connected watching to security. The failure mode: an over-privileged controller. The watch logic intends to touch only app: security-backend Deployments, but the RBAC Role grants get/watch/list/update on all Deployments. Add a reconcile-loop bug that ignores labels internally, and the controller can modify a Deployment it was never supposed to — e.g. accidentally mangling the payments service.
The fix is defense in depth, and it's shift-left — enforce at admission, not in fragile controller code. Two options:
A validating webhook
A webhook that intercepts Deployment updates, checks the target label and ownerReferences, and enforces that only the operator's service account can modify operator-owned Deployments — denying manual updates with "Strictly managed by the Operator."
A Validating Admission Policy (VAP) — no webhook server needed
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata: { name: "operator-only-deployment-updates" }
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["UPDATE"]
        resources: ["deployments"]
  validations:
    # The has()/in guards keep the expression from erroring (and, with
    # failurePolicy: Fail, rejecting the update) when a Deployment has no
    # "app" label or no ownerReferences.
    - expression: >
        !(has(object.metadata.labels) && 'app' in object.metadata.labels &&
        object.metadata.labels['app'] == 'log-analyzer' &&
        has(object.metadata.ownerReferences) &&
        object.metadata.ownerReferences.exists(owner, owner.kind == 'LogAnalyzerOperator')) ||
        request.userInfo.username == 'system:serviceaccount:log-analyzer-system:log-analyzer-operator'
      message: "This deployment is strictly managed by the Operator. Manual updates are forbidden."
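Read as plain boolean logic, the expression says: allow the update unless the Deployment is operator-owned and the requester is not the operator's service account. The Go model below mirrors that intent for a few cases; it is only a reading aid, not how the API server evaluates CEL, and the Deployment struct here is a simplified stand-in.

```go
package main

import "fmt"

const operatorSA = "system:serviceaccount:log-analyzer-system:log-analyzer-operator"

// Deployment carries only the fields the policy inspects.
type Deployment struct {
	Labels     map[string]string
	OwnerKinds []string // kind of each ownerReference
}

// operatorOwned mirrors the first half of the expression: the app label
// matches and at least one owner is of kind LogAnalyzerOperator.
func operatorOwned(d Deployment) bool {
	if d.Labels["app"] != "log-analyzer" {
		return false
	}
	for _, kind := range d.OwnerKinds {
		if kind == "LogAnalyzerOperator" {
			return true
		}
	}
	return false
}

// allowUpdate mirrors: !(operatorOwned) || requester is the operator's service account.
func allowUpdate(d Deployment, username string) bool {
	return !operatorOwned(d) || username == operatorSA
}

func main() {
	owned := Deployment{
		Labels:     map[string]string{"app": "log-analyzer"},
		OwnerKinds: []string{"LogAnalyzerOperator"},
	}
	payments := Deployment{Labels: map[string]string{"app": "payments"}}

	fmt.Println("operator updates owned:", allowUpdate(owned, operatorSA))
	fmt.Println("human updates owned:", allowUpdate(owned, "alice"))
	fmt.Println("human updates payments:", allowUpdate(payments, "alice"))
}
```

Note what the policy does not do: it protects operator-owned Deployments from everyone else, but it does not stop the operator from touching the payments Deployment. That half of the problem still needs tightly scoped RBAC.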
ValidatingAdmissionPolicy runs the CEL expression in-process in the API server: no webhook server to deploy, scale, secure, or keep available. The policy only takes effect once a ValidatingAdmissionPolicyBinding references it. For policy this simple ("only the operator's SA may update its own Deployments"), VAP is the lighter, more reliable choice. This is the same admission-control theme the Kyverno deep dive builds on next.
Key takeaways: the configurable options
- Memory optimization: cache only the resources you care about — label/field selectors, namespace scoping, per-object overrides.
- Resource-vs-latency balancing: don't use a one-size-fits-all queue; set MaxConcurrentReconciles strategically by resource criticality.
- Event handling: protect the work queue from noise with custom predicates that intercept and drop unnecessary events before they reach the reconciler.
- Shift-left security: use native Validating Admission Policies to catch and reject updates from non-owners, rather than relying on (fallible) controller code.
- Don't leak in Reconcile: never open an unclosed connection inside the loop; reuse a client or defer Close().
FAQ
What's the difference between For, Owns, and Watches?
For declares the primary resource the controller reconciles. Owns watches resources the controller created (via ownerReferences) and maps their events back to the owner. Watches observes arbitrary secondary resources with a custom mapping. Each adds cache and goroutine cost, so watch only what you actually need to react to.
Selector vs predicate — when do I use which?
Selectors (label/field) filter at the cache/API level, so unmatched objects never consume memory — use them to shrink the cache. Predicates run in-process on events that did enter the cache, dropping them before the reconciler — use them for logic the selector can't express (e.g. "only reconcile on spec changes").
How do I pick MaxConcurrentReconciles?
By criticality and downstream capacity, not a global default. A critical, fast-reconciling resource can take higher concurrency; a noisy resource that hits a rate-limited external API should stay low. More concurrency means more goroutines and more parallel load on whatever the reconciler calls.
Webhook or ValidatingAdmissionPolicy?
For simple, expressible-in-CEL rules, prefer VAP — it runs in the API server with nothing to deploy or keep available. Reach for a validating webhook when you need logic CEL can't express or external lookups. Both enforce at admission, which is far more reliable than enforcing in controller code.
Takeaways
- "What to watch" is a design decision with a memory, CPU, and security bill. Make it consciously.
- Filter early and cheap: selectors shrink the cache, predicates drop events, namespace scoping bounds the blast radius.
- Unfiltered watches balloon the heap — the benchmark gap widens with cluster size; a selector bends the curve.
- Never leak in Reconcile — an unclosed client turns into a linear goroutine leak.
- Least privilege + shift-left: scope RBAC tightly and enforce ownership at admission with VAP.
Next in the series — Deep dive 06: The Kyverno Five, which moves from operator-level admission control to cluster-wide policy-as-code.
References
- KubeCon Mumbai 2026 — Day 1 index · the rest of the series
- Kubebuilder Book · controller-runtime · cache options, predicates, For/Owns/Watches
- Validating Admission Policy · in-API-server CEL enforcement
- Deep dive 06 — The Kyverno Five · policy-as-code at cluster scale