KubeCon India 2026 (Mumbai) — Day 1 Deep Dives

05 · Beyond the Primary CR — What and What Not to Watch

Deep dive 5 of 17 · Platform engineering & app delivery

Jun 18, 2026 · conferences · 22 min read · 5000 words · advanced

Beyond the primary CR — what (and what not) to watch in Kubernetes operators.

conferences kubecon operators controller-runtime go

Deep dive 5 of the KubeCon Mumbai 2026 series. Guna K Kambalimath and Kishen V (IBM) went under the hood of Kubernetes operators to answer a question most operator authors never consciously decide: which resources should your controller actually watch and cache — and which should it deliberately ignore? Watch too much and you get ballooning memory, goroutine leaks, log storms, and dangerous over-privilege. The talk is a masterclass in the controller-runtime cache, field/label selectors, predicates, MaxConcurrentReconciles, and shift-left security with Validating Admission Policies — all illustrated with benchmarks and a running "AI log analyzer" operator.

If deep dive 04 showed the friendly abstractions developers see, this one is what's underneath: the operators that implement those abstractions, and the very real performance and security cliffs you fall off if you build them naively. It's the most code-heavy talk in the platform cluster, and the most quietly important — these are the bugs that take down operators in production.

The analogy — every controller is a reconcile loop

The grounding analogy was the built-in Deployment controller: it watches Deployments and the Pods/ReplicaSets they own, receives events (e.g. "pod deleted"), and runs a reconcile loop — does the desired state hold (are there n replicas)? If yes, do nothing; if no, create the pods needed. Every operator is a variation on this loop. The design question the talk drills into is the input side: what feeds events into that loop, and at what cost?
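As a minimal sketch of that decision step (this is not the real Deployment controller; the `State`, `Action`, and `Reconcile` names are hypothetical), the comparison of desired and observed state can be modeled as a pure function:

```go
package main

import "fmt"

// State is a hypothetical, simplified view of what a controller compares:
// how many replicas the spec asks for and how many currently exist.
type State struct {
	Desired int
	Actual  int
}

// Action describes what a single reconcile pass decides to do.
type Action struct {
	Create int // pods to create
	Delete int // pods to delete
}

// Reconcile compares desired and observed state and returns the action
// needed to converge. When the two already match it returns the zero
// Action, so running it repeatedly is harmless.
func Reconcile(s State) Action {
	switch {
	case s.Actual < s.Desired:
		return Action{Create: s.Desired - s.Actual}
	case s.Actual > s.Desired:
		return Action{Delete: s.Actual - s.Desired}
	default:
		return Action{}
	}
}

func main() {
	// A "pod deleted" event arrives: 3 desired, 2 observed.
	fmt.Println(Reconcile(State{Desired: 3, Actual: 2})) // {1 0}
	// Nothing drifted: no work to do.
	fmt.Println(Reconcile(State{Desired: 3, Actual: 3})) // {0 0}
}
```

The function never looks at which event triggered it, only at current state. That is why the interesting design question is which events are allowed to trigger it at all.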

The running example — an AI log analyzer operator

The talk used a custom AILogAnalyzer operator throughout. Its anatomy maps to the three categories of resource every operator deals with:

  • Primary resource (the CR): AILogAnalyzer — the custom resource the operator owns, specifying desired state (scan_interval: 15m, alert_channel, model), backed by a CRD.
  • Watched secondary resources: things it observes but doesn't own — e.g. a security-backend Deployment (labelled app: security) whose logs it analyzes.
  • Owned secondary resources: things it creates and controls via ownerReferences — e.g. a security-analyzer-engine Deployment it spins up.

The reconcile logic: get the pods of a watched deployment (from cache), pull the last 15 minutes of logs, call an AI endpoint, return a critical incident if any, trigger alerts, and update the CR's status on failure. Simple enough — until you ask what it costs to watch all those resources.
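The steps above can be sketched with plain interfaces. Everything here is an assumption made for illustration: `LogSource`, `Analyzer`, `Status`, and the toy implementations are hypothetical stand-ins, not the talk's actual code, and a real operator would read pods from the controller-runtime cache and call a real AI endpoint.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// LogSource is a hypothetical stand-in for "pods of the watched Deployment":
// it returns the log lines produced within the given window.
type LogSource interface {
	Logs(window time.Duration) ([]string, error)
}

// Analyzer is a hypothetical stand-in for the AI endpoint. It returns a
// description of a critical incident, or "" when nothing critical was found.
type Analyzer interface {
	Analyze(lines []string) (string, error)
}

// Status models the fields the operator would write to the CR's status.
type Status struct {
	LastIncident string
	LastError    string
}

// scanWindow mirrors the 15-minute window described in the talk.
const scanWindow = 15 * time.Minute

// errUnavailable is a sample failure used by the demo below.
var errUnavailable = errors.New("log backend unavailable")

// ReconcileOnce runs one pass: fetch logs, analyze them, alert on a critical
// incident, and report failures through the returned status.
func ReconcileOnce(src LogSource, ai Analyzer, alert func(string)) Status {
	lines, err := src.Logs(scanWindow)
	if err != nil {
		return Status{LastError: "fetch logs: " + err.Error()}
	}
	incident, err := ai.Analyze(lines)
	if err != nil {
		return Status{LastError: "analyze: " + err.Error()}
	}
	if incident != "" {
		alert(incident)
	}
	return Status{LastIncident: incident}
}

// fakeLogs and keywordAnalyzer are toy implementations for the demo.
type fakeLogs struct {
	lines []string
	err   error
}

func (f fakeLogs) Logs(time.Duration) ([]string, error) { return f.lines, f.err }

type keywordAnalyzer struct{}

func (keywordAnalyzer) Analyze(lines []string) (string, error) {
	for _, l := range lines {
		if strings.Contains(l, "CRITICAL") {
			return l, nil
		}
	}
	return "", nil
}

func main() {
	alert := func(msg string) { fmt.Println("ALERT:", msg) }
	ok := fakeLogs{lines: []string{"login ok", "CRITICAL: token leaked"}}
	fmt.Printf("%+v\n", ReconcileOnce(ok, keywordAnalyzer{}, alert))
	fmt.Printf("%+v\n", ReconcileOnce(fakeLogs{err: errUnavailable}, keywordAnalyzer{}, alert))
}
```

Each pass depends on whatever the log source returns, so the number of objects feeding that source is what drives the cost of every reconcile.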

Zooming into the cache — and the four ways to narrow the watch

The key mental model. controller-runtime keeps an in-memory cache backed by a streaming watch (a long-lived HTTP connection to the API server). Every object that matches your watch is held in that cache — consuming memory — and every change generates an event that may wake your reconciler. So "what you watch" directly sets your memory footprint and your CPU/queue load. The whole talk is about controlling that.

The four levers to narrow the watch:

  • Field and Label Selectors — only cache objects matching specific fields (e.g. status.phase=Running) or labels (e.g. app: security).
  • Namespace scoping — only watch the namespaces you care about, not the whole cluster.
  • Sync Period: how often the cache replays every cached object to the event handlers (a periodic re-reconcile of everything). It tunes load rather than shrinking the set of cached objects.
  • Predicates — code-level filters that drop events before they reach the work queue.

[Figure: Inside the controller, the filtering pipeline. Stages: Field & Label selector → Cache → Predicate (CRUD / sync) → Work queue (goroutine + chan) → Reconciler (business logic). Selectors shrink the cache · predicates drop events · queue feeds the reconciler.]

Fig 1 — events flow API → selector → cache → predicate → work queue → reconciler. Filter early, filter cheap.
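To make the ordering concrete, here is a toy model of the pipeline in plain Go. It is a sketch under stated assumptions, not controller-runtime's implementation: `Event`, `Pipeline`, and `Handle` are hypothetical names, the selector is modeled as a local function even though real selectors are applied by the API server, and the de-duplicating queue only loosely imitates a real work queue.

```go
package main

import "fmt"

// Event is a simplified change notification for one object.
type Event struct {
	Name        string
	App         string // value of the object's "app" label
	SpecChanged bool
}

// Pipeline models selector -> cache -> predicate -> work queue.
type Pipeline struct {
	Selector  func(Event) bool // decides what is cached at all
	Predicate func(Event) bool // decides which cached events are enqueued
	Cache     map[string]Event
	Queue     []string
	pending   map[string]bool // keys already waiting in the queue
}

// NewPipeline wires the two filters to an empty cache and queue.
func NewPipeline(selector, predicate func(Event) bool) *Pipeline {
	return &Pipeline{
		Selector:  selector,
		Predicate: predicate,
		Cache:     map[string]Event{},
		pending:   map[string]bool{},
	}
}

// Handle pushes one event through the stages in order.
func (p *Pipeline) Handle(e Event) {
	if !p.Selector(e) {
		return // never cached: costs no memory and wakes nothing
	}
	p.Cache[e.Name] = e
	if !p.Predicate(e) {
		return // cached, but not worth a reconcile
	}
	if !p.pending[e.Name] {
		p.pending[e.Name] = true
		p.Queue = append(p.Queue, e.Name)
	}
}

func main() {
	p := NewPipeline(
		func(e Event) bool { return e.App == "security" },
		func(e Event) bool { return e.SpecChanged },
	)
	events := []Event{
		{Name: "payments", App: "payments", SpecChanged: true},
		{Name: "backend", App: "security", SpecChanged: false},
		{Name: "backend", App: "security", SpecChanged: true},
		{Name: "backend", App: "security", SpecChanged: true},
	}
	for _, e := range events {
		p.Handle(e)
	}
	fmt.Println("cached:", len(p.Cache), "queued:", p.Queue) // cached: 1 queued: [backend]
}
```

Four events arrive, one object is cached, and one key reaches the reconciler. The earlier a stage rejects an event, the less work every later stage does.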

"Don't cache the noise" — selectors in code

The first concrete technique: scope the manager's cache with label and field selectors so unrelated objects never enter memory at all.

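// Cache only objects whose "app" label is one of these values.
requirement, err := labels.NewRequirement("app", selection.In,
    []string{"IAM", "security-backend", "log-analyzer"})
if err != nil {
    log.Fatalf("invalid label requirement: %v", err) // don't run with a broken selector
}
labelSelector := labels.NewSelector().Add(*requirement)

mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
  Cache: cache.Options{
    // Defaults applied to every cached resource type.
    DefaultLabelSelector: labelSelector,
    DefaultFieldSelector: fields.SelectorFromSet(fields.Set{
      "metadata.namespace": "iam-security",
    }),
    ByObject: map[client.Object]cache.ByObject{
      &corev1.Pod{}: {            // per-resource override
        Field: fields.SelectorFromSet(fields.Set{"status.phase": "Running"}),
      },
    },
    SyncPeriod: ptr.To(20 * time.Second), // full resync interval
  },
})
if err != nil {
    log.Fatalf("unable to create manager: %v", err)
}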

And a predicate to drop events the selector can't express, before they reach the reconciler:

// ignoreSpecificLabels drops events for unlabeled objects and for the
// "IAM" and "security-backend" apps; everything else reaches the queue.
func ignoreSpecificLabels() predicate.Predicate {
  return predicate.NewPredicateFuncs(func(object client.Object) bool {
    objLabels := object.GetLabels() // named to avoid shadowing the labels package
    if objLabels == nil {
      return false // no labels → don't reconcile
    }
    switch objLabels["app"] {
    case "IAM", "security-backend":
      return false // block these
    }
    return true // allow everything else
  })
}

The three payoffs the talk named:

  • Zero-waste memory — unready/pending metadata never enters the cache.
  • Reduction in API noise — intermediate lifecycle chatter is muted, so the operator reconciles only when it truly needs to.
  • Protection from log storming — the work queue is insulated from application churn and the endless retry loops of crashing/unavailable workloads.

The benchmark — what ballooning memory actually looks like

This was the slide that justified all the fiddly selector code. Benchmarking an operator watching 500 labeled vs 500 unlabeled resources while scaling StatefulSets to 1,000:

Metric                                     Without selector   With selector
Heap object count (GOGC=off, 1000 STS)     ~900,000+          ~640,000
Heap memory in use (GOGC=off, 1000 STS)    ~118 MB            ~79 MB

The lesson, quantified. An unfiltered watch doesn't just cost a little more: its heap grows faster and higher, and the gap widens as the cluster grows. On a large cluster, "watch everything" is how an operator slowly OOMs. A label/field selector is a one-line change that bends the curve. Note the GC setting matters too (GOGC=off vs 100 changed the absolute numbers), but the relative win from filtering held in both.
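For readers who want to take a similar measurement themselves, the sketch below shows one way to count heap objects with the Go runtime. It is not the speakers' benchmark harness: `Object`, `fillCache`, and `Measure` are hypothetical, the "cache" is a plain slice, and the absolute numbers it prints will not match the table. `debug.SetGCPercent(-1)` is used as the in-process analogue of running with GOGC=off.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// Object is a hypothetical stand-in for a cached Kubernetes object.
type Object struct {
	Name   string
	Labels map[string]string
}

// fillCache stores only the objects accepted by keep, mimicking a cache
// populated with or without a selector.
func fillCache(n int, keep func(i int) bool) []*Object {
	cache := make([]*Object, 0, n)
	for i := 0; i < n; i++ {
		if !keep(i) {
			continue // with a selector the object is never delivered or stored
		}
		cache = append(cache, &Object{
			Name:   fmt.Sprintf("sts-%d", i),
			Labels: map[string]string{"app": "security"},
		})
	}
	return cache
}

// heapObjects returns the number of allocated heap objects.
func heapObjects() int64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return int64(m.HeapObjects)
}

// Measure reports how many heap objects filling the cache added and how
// many entries were cached.
func Measure(n int, keep func(i int) bool) (added int64, cached int) {
	// Disable GC for the measurement and restore the old setting afterwards.
	defer debug.SetGCPercent(debug.SetGCPercent(-1))
	runtime.GC() // start from a settled heap
	before := heapObjects()
	cache := fillCache(n, keep)
	added = heapObjects() - before
	runtime.KeepAlive(cache)
	return added, len(cache)
}

func keepAll(int) bool        { return true }
func keepLabeled(i int) bool { return i%2 == 0 } // pretend half carry the label

func main() {
	all, allN := Measure(1000, keepAll)
	sel, selN := Measure(1000, keepLabeled)
	fmt.Printf("no selector:   %d cached, %d heap objects added\n", allN, all)
	fmt.Printf("with selector: %d cached, %d heap objects added\n", selN, sel)
}
```

In a real operator the same counters can be sampled from a running process, for example through pprof heap profiles, rather than from a synthetic loop.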

More watches, longer queues — and the goroutine bill

Memory isn't the only cost. Each watched resource type spins up machinery, and the talk measured the goroutine growth precisely:

What you watch                        Goroutines
Baseline (For(&CustomApp{}))          83
+ Watches(Deployment)                 93 (+10)
+ Watches(StatefulSet)                103 (+10)
+ Watches(PersistentVolumeClaim)      123 (+20)

And concurrency multiplies it — MaxConcurrentReconciles adds worker goroutines per controller. The talk's point: tune concurrency by resource criticality, not uniformly. Don't use a one-size-fits-all queue; give a critical resource more concurrent reconciles and a noisy-but-unimportant one fewer.
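The effect of the concurrency setting can be sketched with a plain worker pool. This models the idea only: `Simulate` is a hypothetical helper, and in controller-runtime the equivalent knob is the `MaxConcurrentReconciles` field of `controller.Options`, set per controller.

```go
package main

import (
	"fmt"
	"sync"
)

// Simulate drains n keys with `workers` concurrent reconcilers and reports
// how many keys were processed and the peak number running at once.
func Simulate(workers, n int) (processed, peak int) {
	queue := make(chan int)
	var (
		mu       sync.Mutex
		inFlight int
		wg       sync.WaitGroup
	)
	reconcile := func(int) {
		mu.Lock()
		inFlight++
		if inFlight > peak {
			peak = inFlight
		}
		mu.Unlock()

		// ... real reconcile work (and calls to downstream APIs) goes here ...

		mu.Lock()
		inFlight--
		processed++
		mu.Unlock()
	}
	// One goroutine per worker: this is the per-controller goroutine cost.
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range queue {
				reconcile(key)
			}
		}()
	}
	for key := 0; key < n; key++ {
		queue <- key
	}
	close(queue)
	wg.Wait()
	return processed, peak
}

func main() {
	for _, workers := range []int{1, 4} {
		processed, peak := Simulate(workers, 50)
		fmt.Printf("workers=%d processed=%d peak in flight=%d\n", workers, processed, peak)
	}
}
```

The peak never exceeds the worker count, so the setting is both a goroutine budget and a cap on the parallel load the reconciler places on whatever it calls.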

The goroutine leak that scales linearly with CRs

The scariest slide: a leak caused by creating a client connection inside the reconcile loop without closing it.

func (r *CustomReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
  var cr myapi.CustomApp
  if err := r.Get(ctx, req.NamespacedName, &cr); err != nil { ... }

  // BUG: a new external connection is opened on every reconcile and never released.
  client, err := externalapi.NewClient(cr.Spec.Endpoint)  // new conn each reconcile
  if err != nil { return ctrl.Result{}, err }
  // defer client.Close()  ← MISSING: goroutines pile up forever
  ...
}

The measured effect: goroutines climbing 83 → 239 → 395 → 551 as CRs increased — an unbounded leak. The fix (one-time client init, reused across reconciles, or a proper defer client.Close()) flattened it to 83 → 98 → 98 → 98 — a one-time initialization cost independent of CR volume.

The single most actionable bug to remember. Never open an unclosed connection (HTTP client, DB handle, gRPC channel) inside Reconcile. Reconcile runs on every queued event and every resync, so anything you allocate without releasing leaks at the rate of your event stream. Either reuse a long-lived client or defer Close() religiously. The failure is gradual rather than immediate, which is why it tends to surface in production instead of in tests.
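The leak and both fixes can be reproduced in miniature. The sketch below is an illustration, not the talk's code: `Client` is a hypothetical client that starts one background goroutine, and the `liveWorkers` counter stands in for what `runtime.NumGoroutine` would report in a real operator.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// liveWorkers counts background goroutines that are currently running.
var liveWorkers atomic.Int64

// Client is a hypothetical external API client that, like many real
// clients, owns a background goroutine for the lifetime of the connection.
type Client struct {
	stop chan struct{}
	done chan struct{}
}

// NewClient opens a "connection" and starts its background goroutine.
func NewClient() *Client {
	c := &Client{stop: make(chan struct{}), done: make(chan struct{})}
	liveWorkers.Add(1)
	go func() {
		defer close(c.done)
		defer liveWorkers.Add(-1) // runs before done is closed
		<-c.stop                  // a real client would do keep-alive work here
	}()
	return c
}

// Close stops the background goroutine and waits for it to exit.
// It must be called at most once per client.
func (c *Client) Close() {
	close(c.stop)
	<-c.done
}

// leakyReconcile creates a client per call and never closes it.
func leakyReconcile() {
	c := NewClient()
	_ = c // ... use the client ...
}

// closingReconcile creates a client per call but always releases it.
func closingReconcile() {
	c := NewClient()
	defer c.Close()
	// ... use the client ...
}

// Reconciler holds one long-lived client shared by every reconcile.
type Reconciler struct{ client *Client }

// Reconcile reuses the shared client instead of creating a new one.
func (r *Reconciler) Reconcile() { _ = r.client }

func main() {
	for i := 0; i < 100; i++ {
		leakyReconcile()
	}
	fmt.Println("after 100 leaky reconciles:", liveWorkers.Load()) // 100

	for i := 0; i < 100; i++ {
		closingReconcile()
	}
	fmt.Println("after 100 closing reconciles:", liveWorkers.Load()) // still 100

	r := &Reconciler{client: NewClient()}
	for i := 0; i < 100; i++ {
		r.Reconcile()
	}
	fmt.Println("after 100 shared-client reconciles:", liveWorkers.Load()) // 101
}
```

The leaky version grows by one goroutine per reconcile, while both fixes are flat: the closing version returns to its starting count and the shared client pays a single one-time cost.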

Security — over-privileged controllers and shift-left enforcement

The final act connected watching to security. The failure mode: an over-privileged controller. The watch logic intends to touch only app: security-backend Deployments, but the RBAC Role grants get/watch/list/update on all Deployments. Add a reconcile-loop bug that ignores labels internally, and the controller can modify a Deployment it was never supposed to — e.g. accidentally mangling the payments service.

The fix is defense in depth, and it's shift-left — enforce at admission, not in fragile controller code. Two options:

A validating webhook

A webhook that intercepts Deployment updates, checks the target label and ownerReferences, and enforces that only the operator's service account can modify operator-owned Deployments — denying manual updates with "Strictly managed by the Operator."
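The decision such a webhook makes can be sketched as a pure function. This is a minimal model rather than a working admission server: `UpdateRequest`, `operatorOwned`, and `Allow` are hypothetical names, there is no HTTP handling or AdmissionReview decoding, and the label value, owner kind, and service-account name are the ones used in the talk's example.

```go
package main

import "fmt"

// UpdateRequest is a hypothetical, flattened view of an admission request
// for a Deployment update.
type UpdateRequest struct {
	Username   string            // who is making the change
	Labels     map[string]string // labels on the Deployment
	OwnerKinds []string          // kinds listed in its ownerReferences
}

// operatorSA is the only identity allowed to modify operator-owned Deployments.
const operatorSA = "system:serviceaccount:log-analyzer-system:log-analyzer-operator"

// operatorOwned reports whether the Deployment carries the operator's label
// and is owned by the operator's resource kind.
func operatorOwned(r UpdateRequest) bool {
	if r.Labels["app"] != "log-analyzer" {
		return false
	}
	for _, kind := range r.OwnerKinds {
		if kind == "LogAnalyzerOperator" {
			return true
		}
	}
	return false
}

// Allow admits the update unless it targets an operator-owned Deployment
// and comes from anyone other than the operator's service account.
func Allow(r UpdateRequest) (bool, string) {
	if !operatorOwned(r) || r.Username == operatorSA {
		return true, ""
	}
	return false, "Strictly managed by the Operator."
}

func main() {
	owned := UpdateRequest{
		Username:   "alice",
		Labels:     map[string]string{"app": "log-analyzer"},
		OwnerKinds: []string{"LogAnalyzerOperator"},
	}
	fmt.Println(Allow(owned)) // false Strictly managed by the Operator.

	owned.Username = operatorSA
	fmt.Println(Allow(owned)) // true

	unrelated := UpdateRequest{Username: "alice", Labels: map[string]string{"app": "payments"}}
	fmt.Println(Allow(unrelated)) // true
}
```

Reading a missing label from a Go map simply yields the empty string, so unlabeled Deployments fall through to "allow" without special handling.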

A Validating Admission Policy (VAP) — no webhook server needed

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata: { name: "operator-only-deployment-updates" }
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["UPDATE"]
        resources: ["deployments"]
  validations:
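    # The has()/in guards keep the rule from erroring (and, with failurePolicy: Fail,
    # rejecting the update) on Deployments that have no labels or ownerReferences.
    - expression: >
        !(has(object.metadata.labels) && 'app' in object.metadata.labels &&
          object.metadata.labels['app'] == 'log-analyzer' &&
          has(object.metadata.ownerReferences) &&
          object.metadata.ownerReferences.exists(owner, owner.kind == 'LogAnalyzerOperator')) ||
        request.userInfo.username == 'system:serviceaccount:log-analyzer-system:log-analyzer-operator'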
      message: "This deployment is strictly managed by the Operator. Manual updates are forbidden."

Why VAP over a webhook? A ValidatingAdmissionPolicy runs the CEL expression in-process in the API server, so there is no webhook server to deploy, scale, secure, or keep available. The policy only takes effect once a ValidatingAdmissionPolicyBinding references it. For policy this simple ("only the operator's SA may update its own Deployments"), VAP is the lighter, more reliable choice. This is the same admission-control theme the Kyverno deep dive builds on next.

Key takeaways — the configurable options

  • Memory optimization: cache only the resources you care about — label/field selectors, namespace scoping, per-object overrides.
  • Resource-vs-latency balancing: don't use a one-size-fits-all queue; set MaxConcurrentReconciles strategically by resource criticality.
  • Event handling: protect the work queue from noise — custom predicates that intercept and drop unnecessary events before they reach the reconciler.
  • Shift-left security: use native Validating Admission Policies to catch and reject updates from non-owners, rather than relying on (fallible) controller code.
  • Don't leak in Reconcile: never open an unclosed connection inside the loop — reuse or defer Close().

FAQ

What's the difference between For, Owns, and Watches?

For declares the primary resource the controller reconciles. Owns watches resources the controller created (via ownerReferences) and maps their events back to the owner. Watches observes arbitrary secondary resources with a custom mapping. Each adds cache and goroutine cost, so watch only what you actually need to react to.
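The "maps their events back to the owner" step can be sketched in a few lines. This is a simplified model with hypothetical types (`OwnerRef`, `Object`, `OwnerKey`), not controller-runtime's handler; it assumes the common default in which only the controlling owner reference is followed.

```go
package main

import "fmt"

// OwnerRef is a trimmed-down ownerReference.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool // true for the single controlling owner
}

// Object is a trimmed-down namespaced object.
type Object struct {
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// OwnerKey returns the namespace/name of the controlling owner of the given
// kind, or false if the object is not controlled by such an owner. An event
// on the owned object is turned into a reconcile request for this key.
func OwnerKey(o Object, ownerKind string) (string, bool) {
	for _, ref := range o.Owners {
		if ref.Controller && ref.Kind == ownerKind {
			return o.Namespace + "/" + ref.Name, true
		}
	}
	return "", false
}

func main() {
	engine := Object{
		Namespace: "iam-security",
		Name:      "security-analyzer-engine",
		Owners:    []OwnerRef{{Kind: "AILogAnalyzer", Name: "prod-analyzer", Controller: true}},
	}
	fmt.Println(OwnerKey(engine, "AILogAnalyzer")) // iam-security/prod-analyzer true

	backend := Object{Namespace: "iam-security", Name: "security-backend"}
	fmt.Println(OwnerKey(backend, "AILogAnalyzer")) // "" false: needs Watches with a custom mapping
}
```

An object without a matching owner produces no request, which is why observed-but-not-owned resources need Watches and an explicit mapping function.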

Selector vs predicate — when do I use which?

Selectors (label/field) filter at the cache/API level, so unmatched objects never consume memory — use them to shrink the cache. Predicates run in-process on events that did enter the cache, dropping them before the reconciler — use them for logic the selector can't express (e.g. "only reconcile on spec changes").
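As a sketch of the "only reconcile on spec changes" case: the API server bumps `metadata.generation` when an object's spec changes (for resources that track it, such as Deployments and CRDs with a status subresource), so comparing generations on an update event is enough. The `Meta` type and `SpecChanged` function below are hypothetical; controller-runtime ships the same idea as `predicate.GenerationChangedPredicate`.

```go
package main

import "fmt"

// Meta holds the two metadata fields relevant to an update event.
type Meta struct {
	Generation      int64  // bumped when the spec changes
	ResourceVersion string // bumped on every write, including status updates
}

// SpecChanged reports whether an update event reflects a spec change.
// Status-only and label-only writes change ResourceVersion but leave
// Generation untouched, so they are filtered out.
func SpecChanged(oldObj, newObj Meta) bool {
	return oldObj.Generation != newObj.Generation
}

func main() {
	before := Meta{Generation: 3, ResourceVersion: "1001"}
	statusOnly := Meta{Generation: 3, ResourceVersion: "1002"}
	specEdit := Meta{Generation: 4, ResourceVersion: "1003"}

	fmt.Println(SpecChanged(before, statusOnly)) // false
	fmt.Println(SpecChanged(before, specEdit))   // true
}
```

A selector cannot express this rule because it compares two versions of the same object, which is exactly the kind of logic predicates exist for.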

How do I pick MaxConcurrentReconciles?

By criticality and downstream capacity, not a global default. A critical, fast-reconciling resource can take higher concurrency; a noisy resource that hits a rate-limited external API should stay low. More concurrency means more goroutines and more parallel load on whatever the reconciler calls.

Webhook or ValidatingAdmissionPolicy?

For simple, expressible-in-CEL rules, prefer VAP — it runs in the API server with nothing to deploy or keep available. Reach for a validating webhook when you need logic CEL can't express or external lookups. Both enforce at admission, which is far more reliable than enforcing in controller code.

Takeaways

  • "What to watch" is a design decision with a memory, CPU, and security bill. Make it consciously.
  • Filter early and cheap: selectors shrink the cache, predicates drop events, namespace scoping bounds the blast radius.
  • Unfiltered watches balloon the heap — the benchmark gap widens with cluster size; a selector bends the curve.
  • Never leak in Reconcile — an unclosed client turns into a linear goroutine leak.
  • Least privilege + shift-left: scope RBAC tightly and enforce ownership at admission with VAP.

Next in the series — Deep dive 06: The Kyverno Five, which moves from operator-level admission control to cluster-wide policy-as-code.

© cvam — written in plaintext, served warm