Deep dive 8 of the KubeCon Mumbai 2026 series. This talk solved a real tension: regulators demand SBOM transparency, but vendors resist publishing because an SBOM is a vulnerability map. The fix is a commitment scheme — publish a 32-byte Merkle root that proves the SBOM hasn't changed (verifiability), keep component details sealed (privacy), and reveal individual components to auditors with tiny cryptographic proofs (selective disclosure). All on SHA-256, no zero-knowledge circuits, plugged straight into the existing CNCF supply-chain stack (syft, grype, cosign, in-toto, TUF, OPA).
This is the most cryptographically dense talk in the series, but the payoff is concrete: 481-byte proofs for a 10,000-component SBOM, 2.6 million disclosures per second, 100% tamper detection. It connects directly to the supply-chain security layer Lumenore built in deep dive 01 (Syft SBOM + Cosign signing) and the policy-as-code of the Kyverno talk.
The problem — two demands that conflict
An SBOM (Software Bill of Materials) is an inventory of every component, library, and dependency in a piece of software — required by NIST SSDF, US Executive Order 14028, and the EU Cyber Resilience Act, in formats like CycloneDX and SPDX. A typical enterprise SBOM has 1,000 to 50,000 components. And here's the bind:
| Customers & regulators want | Vendors resist publishing because |
|---|---|
| Full transparency | an SBOM is a vulnerability map |
| Ability to verify | it exposes architecture decisions |
| Ongoing monitoring | it reveals supplier relationships |
Traditional approaches force a binary choice — publish everything or hide everything. The talk's whole contribution is showing that's a false dichotomy.
The real risk is targeted attack intelligence:
- Vulnerability mapping — cross-reference the SBOM against CVE databases for ready-made exploits.
- Internal architecture exposure — proprietary library names reveal design decisions.
- Supply-chain intelligence — supplier relationships become attack vectors.
- Regulatory/licensing exposure — undisclosed third-party components become legal risk.
Three properties that must coexist
The insight: with the right cryptographic structure, three seemingly contradictory properties can all hold at once.
| Property | What it guarantees |
|---|---|
| Verifiability | anyone can confirm the SBOM has not been tampered with. |
| Privacy | component details remain confidential. |
| Selective disclosure | auditors can verify specific components without seeing the rest. |
Two roots, two purposes
The scheme produces two 32-byte Merkle roots, both built from the same data with the same primitive (SHA-256):
- PUBLIC_ROOT — shared with everyone; proves the SBOM has not changed. Zero component data exposed.
- COMMITMENT_ROOT — shared with auditors only; enables selective disclosure.
Layer 1 — the internal tree (the PUBLIC_ROOT)
A Merkle tree with two domain-separated hash functions:
H_leaf(data) = SHA-256( 0x00 ‖ data )
H_node(left, right) = SHA-256( 0x01 ‖ left ‖ right )
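A minimal Python sketch of these two functions and the tree they feed may help. It assumes components are encoded as `name|version` byte strings (as in the worked example later in this post) and that an odd-sized level is padded by duplicating its last entry at every level, not only at the leaves. The function names are illustrative, not the talk's code.

```python
import hashlib


def h_leaf(data: bytes) -> bytes:
    """Leaf hash: SHA-256(0x00 || data)."""
    return hashlib.sha256(b"\x00" + data).digest()


def h_node(left: bytes, right: bytes) -> bytes:
    """Internal-node hash: SHA-256(0x01 || left || right)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold already-hashed leaves pairwise up to a single 32-byte root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:  # assumption: pad odd levels by duplicating the last entry
            level.append(level[-1])
        level = [h_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


components = [b"openssl|3.1.3", b"nginx|1.25.2", b"postgres|15.2", b"redis|7.0.11"]
public_root = merkle_root([h_leaf(c) for c in components])
print(public_root.hex())
```

Because nothing here is random, running it twice on the same component list prints the same root, and editing any single component changes it.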
Without the 0x00/0x01 domain bytes, an attacker could substitute an internal node for a leaf and produce the same root — a second-preimage attack. The prefix keeps the leaf and node hash inputs disjoint, so a node can never be passed off as a leaf. Leaves are component hashes; internal nodes combine children pairwise up to the root; an odd leaf count duplicates the last leaf. It's deterministic — the same SBOM always yields the same PUBLIC_ROOT, and any change to any component flips it.
Layer 2 — the commitment scheme (the COMMITMENT_ROOT)
For each component, three steps add a secret blinding factor:
Step 1 — Hash: h_i = H_leaf( component_i )
Step 2 — Nonce: r_i ← {0,1}^256 (random, secret)
Step 3 — Commit: C_i = SHA-256( h_i ‖ r_i )
# Build a Merkle tree over {C_0 … C_n} → COMMITMENT_ROOT
This gives three formal properties:
- Hiding (computational) — given C_i, you can't learn h_i without the nonce r_i. Rests on SHA-256 preimage resistance.
- Binding (computational) — you can't feasibly find a different (h', r') that produces the same commitment. SHA-256 collision resistance gives ~2^128 security.
- Re-randomizable — generate fresh nonces to refresh COMMITMENT_ROOT without touching the component hashes or the PUBLIC_ROOT.
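The three steps and the first two properties can be illustrated with a short Python sketch. The helper names `commit` and `opens` are hypothetical, and the `name|version` encoding is borrowed from the worked example; treat this as an illustration under those assumptions rather than the talk's implementation.

```python
import hashlib
import os


def h_leaf(data: bytes) -> bytes:
    """Step 1: leaf hash, SHA-256(0x00 || data)."""
    return hashlib.sha256(b"\x00" + data).digest()


def commit(h: bytes, r: bytes) -> bytes:
    """Step 3: commitment, SHA-256(h || r)."""
    return hashlib.sha256(h + r).digest()


def opens(c: bytes, h: bytes, r: bytes) -> bool:
    """Check that (h, r) is a valid opening of commitment c."""
    return commit(h, r) == c


h = h_leaf(b"openssl|3.1.3")
r1 = os.urandom(32)  # Step 2: secret 256-bit nonce
r2 = os.urandom(32)  # a second, independent nonce

c1 = commit(h, r1)
c2 = commit(h, r2)

# Same component, different nonces: the two commitments look unrelated.
print(c1.hex())
print(c2.hex())
# Only the matching (hash, nonce) pair opens a commitment.
print(opens(c1, h, r1), opens(c1, h, r2))
```

The nonce is what prevents a dictionary attack: without it, anyone could hash every known `name|version` string and compare the results against the published leaves.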
Worked example — sealing four components
Input: openssl:3.1.3 nginx:1.25.2 postgres:15.2 redis:7.0.11
Step 1 — leaf hashes (no randomness):
h_0 = SHA-256(0x00 ‖ "openssl|3.1.3") → a3f2…
h_1 = …nginx → 7c91…
h_2 = …postgres → 2b44…
h_3 = …redis → f801…
Merkle(h_0…h_3) → PUBLIC_ROOT = 9d3a… (shared with everyone)
Step 2 — random nonces (secret, never published):
r_0 = os.urandom(32) → 88c1…
r_1 → 3f72…
r_2 → 5a10…
r_3 → 0c6e…
Step 3 — commitments:
C_0 = SHA-256(h_0 ‖ r_0) → 4d8f…
C_1 → 91b2…
C_2 → c73a…
C_3 → 0e5f…
Merkle(C_0…C_3) → COMMITMENT_ROOT = 1f7b… (sent to auditor only)
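The same four components can be run end to end in Python. This sketch assumes the commitments C_i are used directly as leaves of the second tree (combined with H_node, without an extra H_leaf pass), which is one reading of "Merkle(C_0…C_3)". The hex values it prints will not match the illustrative prefixes above, since those are placeholders and the nonces are random.

```python
import hashlib
import os


def h_leaf(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def h_node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def commit(h: bytes, r: bytes) -> bytes:
    return hashlib.sha256(h + r).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:  # pad an odd level by duplicating its last entry
            level.append(level[-1])
        level = [h_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


components = [("openssl", "3.1.3"), ("nginx", "1.25.2"),
              ("postgres", "15.2"), ("redis", "7.0.11")]

# Step 1: deterministic leaf hashes -> PUBLIC_ROOT
leaf_hashes = [h_leaf(f"{name}|{version}".encode()) for name, version in components]
public_root = merkle_root(leaf_hashes)

# Step 2: one secret nonce per component (never published)
nonces = [os.urandom(32) for _ in components]

# Step 3: blinded commitments -> COMMITMENT_ROOT
commitments = [commit(h, r) for h, r in zip(leaf_hashes, nonces)]
commitment_root = merkle_root(commitments)

print("PUBLIC_ROOT    ", public_root.hex())
print("COMMITMENT_ROOT", commitment_root.hex())
```

PUBLIC_ROOT is stable across runs; COMMITMENT_ROOT differs on every run because the nonces are fresh each time.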
Disclosing one component — the proof
When openssl is updated to 3.1.4, the vendor sends an auditor a disclosure bundle: the component name, the new version, h_i, r_i, and an O(log n) Merkle proof_path of sibling hashes. nginx, postgres, and redis nonces are never revealed — they stay sealed. The auditor verifies in three steps:
- Re-derive the leaf hash: h_check = SHA-256(0x00 ‖ "openssl|3.1.4"); assert h_check == h_i.
- Re-derive the commitment: C_check = SHA-256(h_i ‖ r_i).
- Walk the Merkle path: fold C_check with each sibling up to the root; assert it equals COMMITMENT_ROOT.
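The three verification steps can be sketched as follows. The bundle layout is an assumption: each path entry here is a `(sibling_hash, sibling_is_left)` pair so the verifier knows the concatenation order, a detail the description above leaves open. `build_levels`, `prove`, and `verify` are hypothetical names.

```python
import hashlib
import os


def h_leaf(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def h_node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def commit(h: bytes, r: bytes) -> bytes:
    return hashlib.sha256(h + r).digest()


def build_levels(leaves):
    """Return every tree level, leaves first, each padded to even length."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])  # duplicate the last entry on odd levels
        levels.append([h_node(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Collect the O(log n) sibling hashes from leaf `index` up to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(name, version, h_i, r_i, path, commitment_root):
    # 1. Re-derive the leaf hash from the disclosed name and version.
    if h_leaf(f"{name}|{version}".encode()) != h_i:
        return False
    # 2. Re-derive the commitment.
    node = commit(h_i, r_i)
    # 3. Fold with each sibling up to the root.
    for sibling, sibling_is_left in path:
        node = h_node(sibling, node) if sibling_is_left else h_node(node, sibling)
    return node == commitment_root


# Vendor side: seal four components, then disclose only openssl.
names = [("openssl", "3.1.4"), ("nginx", "1.25.2"), ("postgres", "15.2"), ("redis", "7.0.11")]
hashes = [h_leaf(f"{n}|{v}".encode()) for n, v in names]
nonces = [os.urandom(32) for _ in names]
levels = build_levels([commit(h, r) for h, r in zip(hashes, nonces)])
commitment_root = levels[-1][0]
path = prove(levels, 0)

# Auditor side.
print(verify("openssl", "3.1.4", hashes[0], nonces[0], path, commitment_root))
```

Changing the version string, the nonce, or any sibling in `path` makes `verify` return `False`, while the other three nonces never leave the vendor.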
Re-randomization — unlinkable audits
Generate new nonces and rebuild the commitment tree:
r'_i ← os.urandom(32) # new nonce
C'_i = SHA-256( h_i ‖ r'_i ) # same h_i
# Rebuild Merkle → new COMMITMENT_ROOT
| What changes | Status |
|---|---|
| PUBLIC_ROOT | unchanged ✓ |
| COMMITMENT_ROOT | rotated ↻ |
| Component data | unchanged ✓ |
| Nonces r_0…r_n | replaced ↻ |
This breaks linkability across audit sessions — old nonces can't be replayed to correlate disclosures over time. Cost: 1.1 ms for 1K components, 11 ms for 10K — fast enough to run on any schedule.
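A short Python sketch of re-randomization, under the same assumptions as the rest of this post (commitments used directly as leaves, odd levels padded by duplication). `rerandomize` is a hypothetical helper name; the point is only that the leaf hashes, and therefore PUBLIC_ROOT, are never touched.

```python
import hashlib
import os


def h_leaf(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def h_node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def rerandomize(leaf_hashes):
    """Draw fresh nonces and rebuild only the commitment tree."""
    nonces = [os.urandom(32) for _ in leaf_hashes]
    commitments = [hashlib.sha256(h + r).digest() for h, r in zip(leaf_hashes, nonces)]
    return nonces, merkle_root(commitments)


leaf_hashes = [h_leaf(c) for c in (b"openssl|3.1.3", b"nginx|1.25.2",
                                   b"postgres|15.2", b"redis|7.0.11")]
public_root = merkle_root(leaf_hashes)

nonces_a, root_a = rerandomize(leaf_hashes)  # audit session A
nonces_b, root_b = rerandomize(leaf_hashes)  # audit session B

print("PUBLIC_ROOT unchanged:", merkle_root(leaf_hashes) == public_root)
print("COMMITMENT_ROOT rotated:", root_a != root_b)
```

A nonce disclosed in session A does not open any commitment under session B's root, which is what prevents correlating disclosures across sessions.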
Performance & stress test
- Build is O(n), ~2.5 µs/component — a 10K enterprise SBOM builds in ~22 ms, negligible in CI/CD.
- Proof ops are O(log n) and sub-millisecond at every scale; proof size is 384–625 bytes even at 100K components (fits in a single HTTP request).
- Multi-auditor: 50× more auditors adds <10% overhead, so throughput scales near-linearly with auditor count. 2.6M disclosures/second throughput.
- Security: 50,000 proofs verified at 100% pass rate; tampered component data, nonce, or proof path all rejected; selective disclosure cut data sent by 72%.
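The O(log n) proof size can be sanity-checked with simple arithmetic. This sketch counts only the sibling hashes on the Merkle path, assuming 32-byte SHA-256 digests and a tree of depth ⌈log2 n⌉. The exact wire format of a disclosure bundle is not given here, so the difference between these numbers and the quoted sizes is presumably the nonce, the leaf hash, and encoding overhead.

```python
HASH_BYTES = 32  # SHA-256 digest size


def path_siblings(n: int) -> int:
    """Number of sibling hashes on a Merkle path for n leaves: ceil(log2 n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1).bit_length()


def path_bytes(n: int) -> int:
    """Bytes of sibling hashes alone, excluding nonce and encoding."""
    return path_siblings(n) * HASH_BYTES


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} components: {path_siblings(n)} siblings, {path_bytes(n)} bytes")
# prints 10 siblings / 320 bytes, 14 siblings / 448 bytes, 17 siblings / 544 bytes
```

Growing the SBOM tenfold adds only three or four hashes to the path, which is why proof size stays in the hundreds of bytes at every scale.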
Why not zero-knowledge proofs?
The talk pre-empted the obvious question — why a commitment scheme rather than ZKPs? Because pure ZKP doesn't fit real CI/CD:
| Performance reality | Operational barriers |
|---|---|
| ~8 seconds per proof — blocks a pipeline doing hundreds of builds/day | 16 MB trusted setup + Powers of Tau ceremony most platform teams can't maintain |
| Circuit recompile (~10s) on every schema change — and SBOM schemas change often | Poseidon hash ≠ SHA-256 — incompatible with Syft, Grype, every existing SBOM tool |
| Fixed at e.g. 96 packages / 8 queries — enterprise SBOMs have 5,000+ | Circom + snarkjs + Rust toolchain — not standard DevOps; needs cryptographers |
| No incremental updates — one new package forces full regeneration | 1-bit output = debugging blackout — you can't see which query failed or why |
The pipeline — every tool was already in your stack
The scheme isn't a replacement; it's the missing cryptographic link between CNCF tools you already run. The 12-step pipeline is built from these tools:
| Tool | Role | Standard |
|---|---|---|
| syft | SBOM generation | CycloneDX 1.6 |
| grype | Vulnerability scan | CVE / NVD |
| cosign | Image signing | Sigstore |
| in-toto | Supply-chain attestation | SLSA / in-toto |
| TUF | Root distribution | TUF spec |
| OPA | Policy enforcement | Rego |
| kind | Deployment target | Kubernetes |
| merkleSBOM | Privacy commitment | SHA-256 Merkle |
Two roles deserve emphasis because they show how integrity, distribution, and policy compose:
- TUF protects merkle_roots.json (both roots) as a signed target, with a four-role key hierarchy (root → targets → snapshot → timestamp) and expiration-based freshness (365/90/30/1-day) so a compromised distribution channel can't serve stale or tampered roots. The Merkle root is only useful if consumers can trust they received the right one — TUF closes that gap without a PKI.
- OPA proves the SBOM is safe to ship, which is different from proving it's untampered. The commitment proves integrity; OPA enforces policy (SBOM completeness, no Critical CVEs, NIST compliance) at admission via Gatekeeper. A perfectly committed SBOM full of Log4Shell is still a blocked deployment.
FAQ
What exactly does the PUBLIC_ROOT prove if it hides everything?
It proves integrity over time: the SBOM behind this 32-byte root hasn't changed, and you can detect rollbacks or tampering — without learning a single component. It's a tamper-evident fingerprint anyone can monitor.
Why two roots instead of one?
They serve different audiences. PUBLIC_ROOT is for everyone (integrity monitoring, no data). COMMITMENT_ROOT is for auditors and carries the blinded commitments that enable selective disclosure. Splitting them means public monitoring never touches the auditor machinery.
Is this as strong as a zero-knowledge proof?
For this problem, it's the right tool. Commitments give hiding + binding + selective disclosure on plain SHA-256, with sub-millisecond proofs and no trusted setup. ZKP would add a trusted-setup ceremony, a non-standard hash, 8-second proofs, and a debugging blackout — for guarantees this use case doesn't need.
How does it stop an auditor from correlating my disclosures over time?
Re-randomization. New nonces produce a fresh COMMITMENT_ROOT each session while the PUBLIC_ROOT and component data stay fixed, so old disclosures can't be linked to new ones. It costs ~11 ms at 10K components.
Takeaways
- Transparency vs IP is a false binary. A commitment scheme gives verifiability, privacy, and selective disclosure at once.
- Two roots: PUBLIC_ROOT (everyone, integrity) and COMMITMENT_ROOT (auditors, disclosure) — same data, same SHA-256.
- Seal each component in an envelope (hash + nonce → commitment); open one without revealing the rest, in a 481-byte proof.
- Re-randomize to unlink audits; domain-separate leaf/node hashes to block second-preimage attacks.
- Skip ZKP for this — plain SHA-256 commitments hit 2.6M disclosures/sec with zero toolchain overhead.
- It's the missing link, not a replacement — drops into syft/grype/cosign/in-toto/TUF/OPA, with policy (OPA) and distribution (TUF) doing the rest.
Next in the series — Deep dive 09: Keycloak Federated Client Authentication, closing the security cluster with identity.