Software Supply Chain Provenance For AI-Generated Pull Requests
A signed artifact can prove which workflow built a commit. It cannot, by itself, explain why an AI agent created that commit, whose authority started the task, which tools influenced the diff, or which human accepted the result.
Published Jun 8, 2026 · 13 min read · Agent-Generated Change
AI adds a new source stage before traditional build provenance
SLSA defines provenance as verifiable information describing where, when, and how an artifact was produced. GitHub artifact attestations can cryptographically connect an artifact to a repository, commit SHA, workflow, and build environment. Those controls are essential, but an AI-generated pull request adds another provenance question: how was the source change itself produced?
The answer should not be a vague label such as “AI-assisted.” A useful source record connects the approved task, responsible principal, agent identity, model and tool versions, retrieved context, generated files, failed attempts, tests, review decisions, and final commit. Build provenance then continues the chain from reviewed source to artifact.
AI-generated PR provenance chain
01 Intent: Approved task. Specification, owner, risk class, permitted repositories, tools, and acceptance criteria.
02 Generation: Agent record. Agent identity, model, tools, context sources, permissions, prompts, and session.
03 Change: Generated diff. Files, dependencies, migrations, tests, generated-code manifest, and commit.
04 Review: Human decision. Reviewers, findings, exceptions, approvals, and protected-branch result.
05 Build: Trusted builder. Workflow identity, source SHA, materials, parameters, isolation, and result.
06 Describe: SBOM and attestations. Components, licenses, vulnerabilities, build provenance, tests, and policies.
07 Sign: Verifiable artifact. Digest, identity-based signature, certificate, transparency entry, and registry.
08 Deploy: Runtime lineage. Environment, approval, deployed digest, policy verification, owner, and rollback.
Generated code needs a manifest, not a stigma
AI labels should help review and incident response, not create a false binary between “human code” and “AI code.” Humans use autocomplete, generated clients, migrations, templates, and agents in the same pull request. Record which files or hunks were generated or materially transformed, which system did it, and which human accepted responsibility.
The manifest should also capture dependencies and external context introduced by the agent. If generated code copied an unsafe pattern from retrieved documentation or added a compromised package, investigators need to reconstruct that influence without storing sensitive prompts forever.
Attribution: Which principal, agent, reviewer, builder, and deployer were responsible?
Integrity: Did reviewed source, built artifact, SBOM, signature, and deployed digest remain linked?
Reproducibility: Can the build and important generation context be reconstructed or compared?
Policy: Can admission verify provenance, signatures, components, reviews, and exceptions?
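To make the manifest idea concrete, here is a minimal sketch of what such a record could look like. Everything in it is an assumption for illustration: the class names, field names, tool identifiers, and the choice to keep only SHA-256 digests of retrieved context (so influence can be matched later without retaining the raw prompt text) are hypothetical, not an established schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import List


def digest_of(text: str) -> str:
    """Return a SHA-256 reference so the raw text need not be retained."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class GeneratedSpan:
    """One file region that a tool generated or materially transformed."""
    path: str
    start_line: int
    end_line: int
    tool: str          # hypothetical tool identifier and version
    accepted_by: str   # human who accepted responsibility for the span


@dataclass
class GeneratedCodeManifest:
    commit_sha: str
    spans: List[GeneratedSpan] = field(default_factory=list)
    added_dependencies: List[str] = field(default_factory=list)
    context_digests: List[str] = field(default_factory=list)

    def paths_touched_by(self, tool: str) -> List[str]:
        """Files with at least one span produced by the given tool version."""
        return sorted({s.path for s in self.spans if s.tool == tool})

    def to_json(self) -> str:
        """Serialize deterministically so the manifest can be hashed or attested."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative data only; the SHA, paths, tools, and names are placeholders.
manifest = GeneratedCodeManifest(
    commit_sha="0" * 40,
    spans=[
        GeneratedSpan("api/client.py", 1, 120, "example-agent/1.4", "alice"),
        GeneratedSpan("db/migrate_007.py", 10, 42, "example-agent/1.4", "alice"),
        GeneratedSpan("api/client.py", 200, 230, "ide-autocomplete/2.0", "bob"),
    ],
    added_dependencies=["example-http-lib==3.2.1"],
    context_digests=[digest_of("retrieved documentation page text")],
)
```

Recording spans per tool rather than a single "AI" flag on the pull request keeps the mixed case honest: the same file can carry agent output, autocomplete output, and hand-written code, each with its own accepting human.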
SBOM and provenance answer different questions
An SBOM describes components inside software. Build provenance describes how an artifact was produced. A signature binds an identity to a digest or attestation. Review records explain why the source was accepted. Deployment metadata says where the artifact runs. None replaces the others.
GitHub can generate attestations for build provenance and SBOMs. Sigstore supports identity-based signing with short-lived certificates, and Cosign verifies in-toto attestations against policies. Together, these enable a deployment gate to ask not merely “is this image signed?” but “was this exact digest built from the approved repository and commit by an allowed workflow, with required SBOM and review evidence?”
| Evidence | What it proves | What it does not prove | Verification gate |
|---|---|---|---|
| Agent task record | Why work began, who delegated it, and what the agent was allowed to do. | That the resulting code is correct or safe. | Require approved task, bounded permissions, and responsible owner. |
| Generated-code manifest | Which files or hunks were AI-generated and which tools produced them. | That every influence or copied pattern is known. | Require disclosure for high-risk files and dependency changes. |
| Pull-request review | Which humans and checks accepted the source change. | That the later artifact matches reviewed source. | Protected branches, required checks, CODEOWNERS, and exception policy. |
| SLSA build provenance | Where, when, and how an artifact was built from source. | Why the source was created or approved. | Verify builder identity, source SHA, parameters, and artifact digest. |
| SBOM attestation | Which declared components and dependencies are in the artifact. | That the component list is complete or that the components are safe. | License, vulnerability, policy, and component allowlist checks. |
| Signature and deployment record | Identity bound to artifact and where the digest was deployed. | That runtime behavior remains safe after deployment. | Admission policy, digest pinning, environment approval, runtime monitoring. |
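The gate question from this section can be sketched as a plain policy comparison. This is an illustration under stated assumptions: it presumes signatures and attestations were already verified cryptographically by a tool such as Cosign, and that the claims were extracted into the hypothetical `ArtifactEvidence` structure below. The type names, field names, repository, and workflow path are all made up for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class ArtifactEvidence:
    """Claims taken from attestations that were verified in an earlier step."""
    digest: str
    repository: str
    source_sha: str
    builder_workflow: str
    has_sbom: bool
    reviewed_sha: Optional[str]  # commit the protected-branch review approved


@dataclass(frozen=True)
class PromotionPolicy:
    allowed_repositories: frozenset
    allowed_workflows: frozenset


def evaluate(evidence: ArtifactEvidence, policy: PromotionPolicy,
             requested_digest: str) -> List[str]:
    """Return every reason the artifact fails policy; an empty list means pass."""
    failures = []
    if evidence.digest != requested_digest:
        failures.append("digest does not match the artifact being promoted")
    if evidence.repository not in policy.allowed_repositories:
        failures.append("repository is not approved")
    if evidence.builder_workflow not in policy.allowed_workflows:
        failures.append("builder workflow is not allowed")
    if evidence.reviewed_sha != evidence.source_sha:
        failures.append("built commit is not the reviewed commit")
    if not evidence.has_sbom:
        failures.append("SBOM attestation is missing")
    return failures


# Illustrative values only.
policy = PromotionPolicy(
    allowed_repositories=frozenset({"example-org/payments"}),
    allowed_workflows=frozenset({".github/workflows/release.yml"}),
)
good = ArtifactEvidence(
    digest="sha256:" + "a" * 64,
    repository="example-org/payments",
    source_sha="1" * 40,
    builder_workflow=".github/workflows/release.yml",
    has_sbom=True,
    reviewed_sha="1" * 40,
)
bad = replace(good, reviewed_sha=None, has_sbom=False)
```

Returning all failures rather than stopping at the first gives the person handling a rejected promotion the full list of missing evidence in one pass.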
Policy should verify the chain at promotion time
Evidence that nobody verifies is documentation, not a control. Pull-request rules should require source evidence. Trusted builders should emit provenance and SBOM attestations. Registries should preserve immutable digests and signatures. Deployment admission should reject artifacts without the approved chain or with expired exceptions.
Verification also needs failure modes. If provenance storage is unavailable, should production deployment stop? Which emergency path exists? Who can approve it, for how long, and what evidence must be backfilled? The exception path is part of the supply chain.
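One way to treat the exception path as part of the supply chain is to model it as data with an approver, an expiry, and a list of evidence still owed. The sketch below is hypothetical throughout: the `EmergencyException` type, the `admit` function, and the three-state `chain_verified` flag (`True`, `False`, or `None` when provenance storage is unavailable) are assumptions chosen for illustration. It fails closed by default, and it lets an exception cover both unavailable and failed evidence; a stricter policy might allow only the former.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class EmergencyException:
    artifact_digest: str
    approved_by: str
    granted_at: datetime
    max_duration: timedelta
    backfill_required: Tuple[str, ...]  # evidence that must be supplied later

    def is_active(self, now: datetime) -> bool:
        """An exception is valid from grant time until its duration elapses."""
        return self.granted_at <= now < self.granted_at + self.max_duration


def admit(digest: str, chain_verified: Optional[bool],
          exceptions: Iterable[EmergencyException],
          authorized_approvers: frozenset, now: datetime) -> Tuple[bool, str]:
    """Decide admission; reject unless verified or covered by a live exception."""
    if chain_verified is True:
        return True, "chain verified"
    for exc in exceptions:
        if (exc.artifact_digest == digest
                and exc.approved_by in authorized_approvers
                and exc.is_active(now)):
            owed = ", ".join(exc.backfill_required)
            return True, f"emergency exception by {exc.approved_by}; backfill owed: {owed}"
    if chain_verified is None:
        return False, "provenance unavailable and no active exception"
    return False, "chain verification failed and no active exception"


# Illustrative values only.
granted = datetime(2026, 6, 8, 12, 0, tzinfo=timezone.utc)
exc = EmergencyException(
    artifact_digest="sha256:" + "b" * 64,
    approved_by="release-manager",
    granted_at=granted,
    max_duration=timedelta(hours=4),
    backfill_required=("slsa-provenance", "sbom"),
)
approvers = frozenset({"release-manager"})
```

The reason string returned with an admitted exception names the approver and the owed evidence, so the deployment record itself carries the backfill obligation.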
What I would build
I would build a provenance ledger keyed by pull request, commit, and artifact digest. It would join task and agent metadata, the generated-code manifest, reviews, checks, SBOM, SLSA provenance, signatures, registry storage, deployments, and runtime incidents.
The primary view would answer a production question in both directions: starting from a deployed artifact, trace back to the approved task and agent session; starting from an agent or compromised tool version, find every affected pull request, artifact, and deployment.
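The two directions of that query can be sketched with a small in-memory index. This is only an illustration of the lookup shape, not a storage design: `LedgerEntry`, `ProvenanceLedger`, and all field names and sample values are hypothetical, and a real ledger would need persistence, access control, and many-to-many links between pull requests and artifacts.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class LedgerEntry:
    pull_request: int
    commit_sha: str
    artifact_digest: str
    task_id: str
    agent_session: str
    tool_versions: frozenset
    deployments: Tuple[str, ...]


class ProvenanceLedger:
    """Indexes entries by artifact digest and by tool version."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, LedgerEntry] = {}
        self._by_tool: Dict[str, List[LedgerEntry]] = defaultdict(list)

    def record(self, entry: LedgerEntry) -> None:
        self._by_digest[entry.artifact_digest] = entry
        for tool in entry.tool_versions:
            self._by_tool[tool].append(entry)

    def trace_back(self, artifact_digest: str) -> Optional[LedgerEntry]:
        """From a deployed artifact back to its task and agent session."""
        return self._by_digest.get(artifact_digest)

    def affected_by(self, tool_version: str) -> List[LedgerEntry]:
        """From a suspect tool version forward to every recorded entry."""
        return list(self._by_tool.get(tool_version, []))


# Illustrative values only.
ledger = ProvenanceLedger()
ledger.record(LedgerEntry(101, "1" * 40, "sha256:" + "a" * 64, "TASK-7",
                          "session-1", frozenset({"example-agent/1.4"}),
                          ("prod-eu",)))
ledger.record(LedgerEntry(102, "2" * 40, "sha256:" + "b" * 64, "TASK-9",
                          "session-2",
                          frozenset({"example-agent/1.4", "example-linter/0.9"}),
                          ("staging", "prod-us")))
```

With these entries, `ledger.trace_back("sha256:" + "a" * 64)` returns the entry for pull request 101, and `ledger.affected_by("example-agent/1.4")` returns both entries, whose `deployments` fields list the environments to inspect.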
The design principle
AI generation does not weaken the value of software provenance; it extends the chain that must be proven. Preserve intent and source-generation evidence before the commit, then use trusted builds, SBOMs, attestations, signatures, and deployment records to keep that evidence connected to production.