
Digital Provenance, SBOMs, And Signed AI-Generated Code

AI agents can now draft code, update dependencies, generate containers, and modify deployment files in minutes. That speed is useful only if production can answer a boring but critical set of questions: who produced this artifact, from which source, with which dependencies, under which workflow, and has it changed since?

Provenance turns AI output into evidence

Generated code is not special because it came from an AI model. It is special because it can change more code paths, faster, with less human typing. That makes supply-chain evidence more important, not less. A pull request comment saying "the agent wrote this" is not provenance. A signed attestation tying the artifact to source, commit, workflow, builder, dependencies, and digest is much closer to something a deployment gate can verify.
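To make the difference from a pull request comment concrete, here is a minimal sketch of the kind of statement a build step could emit before signing. The layout is modeled on an in-toto statement carrying a SLSA v1 provenance predicate, but treat the field names as my reading of those specs, not a validated document. The function name, the `buildType` URI, and every example value are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def build_provenance_statement(artifact_name: str, artifact_bytes: bytes,
                               repo_uri: str, commit: str,
                               workflow_path: str, builder_id: str) -> dict:
    """Tie an artifact digest to its source, commit, workflow, and builder.

    Assumption: the structure approximates in-toto Statement v1 with a
    SLSA v1 provenance predicate; verify against the specs before use.
    """
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {"name": artifact_name, "digest": {"sha256": sha256_hex(artifact_bytes)}}
        ],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                # Hypothetical build type URI, used only for illustration.
                "buildType": "https://example.com/hypothetical-build-type/v1",
                "externalParameters": {
                    "workflow": {"repository": repo_uri, "path": workflow_path}
                },
                "resolvedDependencies": [
                    {"uri": f"git+{repo_uri}", "digest": {"gitCommit": commit}}
                ],
            },
            "runDetails": {"builder": {"id": builder_id}},
        },
    }
```

The statement is still unsigned at this point. It only becomes evidence once a signing step binds it to an identity a deployment gate trusts.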

SLSA focuses on provenance for build integrity. Sigstore provides signing and transparency tooling for artifacts. CycloneDX covers SBOMs and ML-BOMs so software, models, datasets, and dependencies can be represented in a structured way. Together, they give AI-generated code a verifiable trail.

AI code verification pipeline
Agent change: Code, tests, prompts, dependencies, or deployment files are modified.
Review: Human approves behavior, risk, and architecture impact.
Build: CI produces artifact from pinned source and workflow identity.
Document: SBOM, ML-BOM, licenses, models, datasets, and dependency graph.
Sign: Artifact and attestations are signed and logged.
Verify: Admission policy checks digest, provenance, signer, and risk before deploy.
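The Verify step can be sketched as a function that returns every reason to block a deploy. This assumes the signature itself has already been verified by real tooling and only the extracted fields are checked here. The `Evidence` type, the allow-lists, and the threshold are hypothetical names and values chosen for illustration.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical allow-lists; real values would come from a policy store.
TRUSTED_BUILDERS = {"https://ci.example.com/builders/release"}
TRUSTED_SIGNERS = {"release-workflow@example.com"}
MAX_CRITICAL_VULNS = 0


@dataclass(frozen=True)
class Evidence:
    signed_digest: str    # sha256 hex taken from an already-verified attestation
    builder_id: str       # builder named in the provenance
    signer_identity: str  # identity bound to the signature
    critical_vulns: int   # risk signal reported by a scanner


def rejection_reasons(artifact: bytes, evidence: Evidence) -> list[str]:
    """Return every reason to block the deploy; an empty list means admit."""
    reasons = []
    if hashlib.sha256(artifact).hexdigest() != evidence.signed_digest:
        reasons.append("digest does not match the signed digest")
    if evidence.builder_id not in TRUSTED_BUILDERS:
        reasons.append("untrusted builder")
    if evidence.signer_identity not in TRUSTED_SIGNERS:
        reasons.append("untrusted signer")
    if evidence.critical_vulns > MAX_CRITICAL_VULNS:
        reasons.append("risk threshold exceeded")
    return reasons
```

Returning all reasons instead of failing on the first one is a deliberate choice: a rejected release should tell the team everything that is wrong in one pass.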

SBOM is the ingredient list, provenance is the receipt

An SBOM tells you what is inside. Provenance tells you how it was produced. A signature tells you whether the thing you are deploying still matches what was signed. AI-era delivery needs all three because generated changes often touch glue code, transitive dependencies, prompt templates, generated SDKs, and container layers at the same time.

For AI systems, the inventory expands. It is not enough to list npm or Python dependencies. You also need model name and version, dataset lineage, embedding model, prompt template version, eval suite, tool permissions, and runtime policy.
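As a rough illustration, the expanded inventory can be treated as a record that must be complete before it is turned into BOM components. The field names below are hypothetical. The component shape borrows the `machine-learning-model` and `data` component types that CycloneDX introduced for ML-BOMs, but it is a simplified approximation, not a schema-valid document.

```python
# Hypothetical field names covering the AI-specific inventory items.
REQUIRED_AI_FIELDS = (
    "model_name",
    "model_version",
    "dataset_lineage",
    "embedding_model",
    "prompt_template_version",
    "eval_suite",
    "tool_permissions",
    "runtime_policy",
)


def missing_ai_fields(record: dict) -> list[str]:
    """List the required inventory fields that are absent or empty."""
    return [name for name in REQUIRED_AI_FIELDS if not record.get(name)]


def to_mlbom_components(record: dict) -> list[dict]:
    """Map a complete record onto CycloneDX-style components.

    Assumption: only the model, embedding model, and datasets become
    components here; the remaining fields would need their own mapping.
    """
    components = [
        {
            "type": "machine-learning-model",
            "name": record["model_name"],
            "version": record["model_version"],
        },
        {"type": "machine-learning-model", "name": record["embedding_model"]},
    ]
    components += [{"type": "data", "name": ds} for ds in record["dataset_lineage"]]
    return components
```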

SBOMWhat packages, licenses, files, and components are present?
ML-BOMWhich models, datasets, frameworks, and training or eval metadata matter?
AttestationWho built it, from what source, using which workflow and builder?
SignatureCan production verify artifact identity before it runs?

What should be signed

The mistake is signing only the final container and calling it done. A useful AI supply chain signs the container image, SBOM, model artifact, eval report, deployment manifest, and provenance statement. The signature is not just ceremony; it allows an admission controller or deployment script to reject unknown artifacts before they touch production.
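One way to picture signing more than the container is a digest manifest that covers every artifact and is signed as a unit. The sketch below uses HMAC-SHA256 from the standard library purely as a stand-in so the example stays self-contained. It is not how Sigstore or cosign sign artifacts, and a shared secret is not a substitute for identity-based signing. All function names are hypothetical.

```python
import hashlib
import hmac
import json


def digest_manifest(artifacts: dict[str, bytes]) -> dict[str, str]:
    """Map each artifact name to its SHA-256 hex digest."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(artifacts.items())}


def sign_manifest(manifest: dict[str, str], key: bytes) -> str:
    """Sign a canonical JSON encoding of the manifest (HMAC stand-in)."""
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_artifact(name: str, data: bytes, manifest: dict[str, str],
                    signature: str, key: bytes) -> bool:
    """Accept only artifacts listed, unmodified, in a correctly signed manifest."""
    if not hmac.compare_digest(sign_manifest(manifest, key), signature):
        return False  # the manifest itself was altered or signed with another key
    return manifest.get(name) == hashlib.sha256(data).hexdigest()
```

An artifact that is missing from the manifest fails the same way a tampered one does, which is the behavior a deployment script needs in order to reject unknown artifacts.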

| Artifact | What it proves | Policy gate |
| --- | --- | --- |
| Source commit | Which repository revision introduced the change, including agent-authored files. | Require protected branch, review, and workflow identity. |
| Build provenance | Which builder, workflow, inputs, and digest produced the artifact. | Accept only trusted CI builders and expected repo paths. |
| SBOM | Which software dependencies, licenses, and components are included. | Block banned licenses, vulnerable packages, and unknown registries. |
| ML-BOM | Which models, datasets, embeddings, frameworks, and eval metadata are involved. | Block unapproved models or missing data lineage. |
| Container signature | Whether the image digest matches a trusted signature. | Deploy only signed digests, never mutable tags alone. |
| Eval attestation | Which tests, scans, and model evals passed for this exact artifact. | Require minimum coverage, security scan, and eval threshold. |
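Two of the policy gates above are simple enough to sketch directly: the SBOM gate for banned licenses and unknown registries, and the container gate that refuses mutable tags. The component dictionaries use hypothetical keys (`name`, `license`, `source_url`), not a real SBOM schema, and the banned license and registry prefixes are example policy values only.

```python
import re

# Example policy values; a real policy would be maintained elsewhere.
BANNED_LICENSES = {"AGPL-3.0-only"}
ALLOWED_REGISTRY_PREFIXES = (
    "https://registry.npmjs.org/",
    "https://files.pythonhosted.org/",
)
# An image reference pinned by digest ends in @sha256:<64 hex characters>.
DIGEST_REF = re.compile(r"@sha256:[0-9a-f]{64}$")


def sbom_violations(components: list[dict]) -> list[str]:
    """Return one message per banned license or unknown registry."""
    violations = []
    for comp in components:
        if comp.get("license") in BANNED_LICENSES:
            violations.append(f"{comp['name']}: banned license {comp['license']}")
        if not str(comp.get("source_url", "")).startswith(ALLOWED_REGISTRY_PREFIXES):
            violations.append(f"{comp['name']}: unknown registry")
    return violations


def is_digest_pinned(image_ref: str) -> bool:
    """True when the image is referenced by digest, not by a mutable tag alone."""
    return DIGEST_REF.search(image_ref) is not None
```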

AI agents need identity in the trail

When an agent makes a change, the provenance trail should not pretend a human typed every line. The commit and pull request should show the human approver, the agent or workflow identity, the tools used, and the files touched. The point is not blame. The point is replayability: if a model upgrade or agent policy causes bad output, you need to find every artifact built under that context.
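Replayability comes down to being able to query build records by the context they were produced under. A minimal sketch, assuming each build stores a record like the hypothetical `BuildRecord` below (the type, its fields, and the query function are all illustrative names):

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BuildRecord:
    artifact_digest: str
    commit: str
    human_approver: str       # who approved the change
    agent_id: str             # agent or workflow identity that authored it
    model_version: str        # model the agent ran on
    tools_used: tuple = ()    # tools the agent invoked
    files_touched: tuple = () # files changed in the commit


def artifacts_built_under(records: Iterable[BuildRecord], *,
                          agent_id: Optional[str] = None,
                          model_version: Optional[str] = None) -> list[str]:
    """Return digests of artifacts whose build context matches every given filter."""
    return [
        r.artifact_digest
        for r in records
        if (agent_id is None or r.agent_id == agent_id)
        and (model_version is None or r.model_version == model_version)
    ]
```

If a model upgrade turns out to produce bad output, a call such as `artifacts_built_under(records, model_version="m-2")` would list what needs to be re-reviewed or rebuilt.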

This is where AI governance meets DevSecOps. Agent output should become an input to the same verification fabric as human code: branch protection, tests, SBOM, signed provenance, artifact verification, and runtime monitoring.

What I would build

I would build an "AI artifact passport" for every deployable service. The passport would combine source commit, human reviewer, agent identity, prompt or task reference, SBOM, ML-BOM, SLSA provenance, cosign signature, eval results, vulnerability scan, and deployment environment. Production would reject releases without a valid passport.
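A passport is only useful if every piece of evidence in it describes the same artifact. Here is a sketch of that check, assuming a hypothetical passport layout in which `context` holds the human and agent metadata and `evidence` maps each evidence kind to the digest it attests to. The key names are mine, not an established format.

```python
# Hypothetical passport layout, following the items listed above.
REQUIRED_CONTEXT = (
    "source_commit",
    "human_reviewer",
    "agent_identity",
    "task_reference",
    "deployment_environment",
)
REQUIRED_EVIDENCE = (
    "sbom",
    "ml_bom",
    "slsa_provenance",
    "cosign_signature",
    "eval_results",
    "vulnerability_scan",
)


def passport_problems(passport: dict) -> list[str]:
    """Return every reason the passport is invalid; an empty list means valid."""
    problems = []
    context = passport.get("context", {})
    for key in REQUIRED_CONTEXT:
        if not context.get(key):
            problems.append(f"missing context: {key}")

    digest = passport.get("artifact_digest")
    if not digest:
        problems.append("missing artifact_digest")

    evidence = passport.get("evidence", {})
    for kind in REQUIRED_EVIDENCE:
        subject = evidence.get(kind)
        if subject is None:
            problems.append(f"missing evidence: {kind}")
        elif subject != digest:
            # Evidence exists but was produced for some other artifact.
            problems.append(f"{kind} describes a different artifact")
    return problems
```

The digest comparison is the important part: an SBOM or eval report from last week's build should not be able to vouch for today's image.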

The best version would be boring to use: one CI reusable workflow emits the evidence, one policy engine verifies it, and one dashboard shows what is trusted, what is expired, and what cannot be reproduced.
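The three dashboard states can be sketched as a small classifier. This assumes "cannot be reproduced" means a rebuild from the recorded source either failed or yielded a different digest, and "expired" means the signature is older than a chosen maximum age. Both definitions, and all names here, are my assumptions about how such a dashboard might decide.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class TrustState(Enum):
    TRUSTED = "trusted"
    EXPIRED = "expired"
    UNREPRODUCIBLE = "cannot be reproduced"


def classify(recorded_digest: str, rebuilt_digest: Optional[str],
             signed_at: datetime, max_age: timedelta,
             now: datetime) -> TrustState:
    """Classify an artifact for the dashboard.

    rebuilt_digest is None when no rebuild could be produced at all.
    Reproducibility is checked first because it is the stronger failure.
    """
    if rebuilt_digest is None or rebuilt_digest != recorded_digest:
        return TrustState.UNREPRODUCIBLE
    if now - signed_at > max_age:
        return TrustState.EXPIRED
    return TrustState.TRUSTED
```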

The design principle

AI-generated code does not need a separate trust universe. It needs stronger evidence inside the software supply chain we already have. If a system cannot prove where an artifact came from, what it contains, who approved it, and why production accepted it, then speed has outrun accountability.