Continuous Agent Attestation

Methodology Overview · v1.0 · April 2026 · Released under CC-BY 4.0

Open methodology · Reference implementation: enforce.html · SDK: pip install neuravant-enforce

The problem

Static AI-safety evaluations — red-team passes, model cards, pre-deployment evals — certify a system at one moment in time. Production agents drift, get re-prompted, get new tools, and hit adversarial inputs the eval set never covered. A "safe" agent on Monday is an incident on Friday — and there is no audit-grade record of which action crossed the line, when, or why.

The insurance industry cannot underwrite what it cannot evidence. Without continuous, signed, tamper-evident proof of every consequential action an agent took and which control reviewed it, autonomous-agent liability is uninsurable at scale.

The proposal

Continuous Agent Attestation (CAA) is a runtime methodology that produces a cryptographically verifiable record of every consequential action an autonomous agent attempts, the policy that evaluated it, and the verdict that was rendered — in <25 ms, fail-closed, with no application-side state.

1. Pre-action interception (not post-hoc logging)

Every tool call passes through an enforcement gate before execution. Logs prove what happened; attestation prevents what shouldn't. The distinction matters at claims time: a logged breach is a paid claim; a blocked one is not.
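As a rough illustration of pre-action interception, the sketch below wraps a tool so that a verdict is obtained before the tool body runs. It is not the neuravant-enforce API: the names gated, ActionBlocked and demo_evaluate, and the toy transfer rule, are hypothetical and exist only for this example.

```python
from typing import Any, Callable, Dict


class ActionBlocked(Exception):
    """Raised when the gate refuses a tool call (hypothetical name)."""


def gated(tool_name: str, evaluate: Callable[[str, Dict[str, Any]], str]):
    """Wrap a tool so the gate's verdict is obtained before the tool runs."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(**kwargs: Any) -> Any:
            verdict = evaluate(tool_name, kwargs)  # decision comes first
            if verdict != "ALLOW":
                # BLOCK and HOLD both stop execution before any side effect.
                raise ActionBlocked(f"{tool_name}: {verdict}")
            return tool(**kwargs)
        return wrapper
    return decorator


executed = []  # records the side effects that actually happened


def demo_evaluate(tool: str, args: Dict[str, Any]) -> str:
    # Toy rule for illustration only: refuse transfers above 1000.
    return "BLOCK" if args.get("amount", 0) > 1000 else "ALLOW"


@gated("payments.transfer", demo_evaluate)
def transfer(amount: int, to: str) -> str:
    executed.append((amount, to))
    return f"sent {amount} to {to}"
```

Calling transfer(amount=5000, to="bob") raises ActionBlocked and leaves executed empty. A post-hoc logger would have recorded the same call only after the side effect had occurred.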

2. Policy-as-code with severity & priority

Each policy is a declarative record (tool_pattern, arg_value_regex, action ∈ {ALLOW, BLOCK, HOLD}, severity, priority). When several policies match the same call, the one with the lowest priority number takes precedence. Policies are versioned, diff-able, and reviewable by non-engineers, including underwriters and regulators.
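A minimal sketch of how such records might be evaluated is shown below. Several details are assumptions rather than part of the methodology: tool_pattern is treated as a glob, arg_value_regex is searched against each argument value, and a call with no matching policy returns None (the text above does not specify a default). The two example policies are invented.

```python
import fnmatch
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Policy:
    name: str
    tool_pattern: str      # glob such as "shell.*" (matching style is an assumption)
    arg_value_regex: str   # searched within each argument value
    action: str            # "ALLOW" | "BLOCK" | "HOLD"
    severity: str
    priority: int          # lower number wins


def evaluate(policies: List[Policy], tool: str, args: dict) -> Optional[Policy]:
    """Return the matching policy with the lowest priority number, or None."""
    matches = [
        p for p in policies
        if fnmatch.fnmatchcase(tool, p.tool_pattern)
        and any(re.search(p.arg_value_regex, str(v)) for v in args.values())
    ]
    return min(matches, key=lambda p: p.priority) if matches else None


# Invented policies for illustration only.
POLICIES = [
    Policy("allow-shell-read", "shell.*", r".*", "ALLOW", "low", 100),
    Policy("block-rm-rf", "shell.*", r"rm\s+-rf", "BLOCK", "critical", 10),
]
```

Here a call to shell.exec with cmd="rm -rf /tmp/x" matches both policies, and block-rm-rf wins because 10 is lower than 100.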

3. Hash-linked, HMAC-signed audit chain

Each decision record contains:

{decision_id, agent_id, tool, args_hash, verdict, policy_name,
 severity, timestamp_utc, prev_hash, signature}

where signature = HMAC-SHA256(per_tenant_key, canonical(record)) and per_tenant_key = HKDF-SHA256(master_key, agent_id). Mutating a record invalidates its signature, and mutating or deleting one also breaks the prev_hash link in the record that follows, so tampering is detectable by any verifier that holds the key. Chain validity is a necessary condition for a paid claim.
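The construction above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the reference implementation: canonical() is taken to be compact JSON with sorted keys, prev_hash is taken to be the SHA-256 of the previous record including its signature, the first record links to an all-zero hash, and agent_id is passed as the HKDF info parameter with an empty salt. HKDF is written out per RFC 5869 because the standard library has no ready-made function for it.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed prev_hash for the first record


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) extract-then-expand with SHA-256."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def canonical(record: dict) -> bytes:
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def append(chain: list, key: bytes, fields: dict) -> dict:
    """Link a new record to the chain tail, sign it, and append it."""
    prev_hash = hashlib.sha256(canonical(chain[-1])).hexdigest() if chain else GENESIS
    record = {**fields, "prev_hash": prev_hash}
    record["signature"] = hmac.new(key, canonical(record), hashlib.sha256).hexdigest()
    chain.append(record)
    return record


def verify(chain: list, key: bytes) -> bool:
    """Check every signature and every prev_hash link, in order."""
    prev_hash = GENESIS
    for record in chain:
        body = {k: v for k, v in record.items() if k != "signature"}
        expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
        if record["prev_hash"] != prev_hash:
            return False
        if not hmac.compare_digest(expected, record.get("signature", "")):
            return False
        prev_hash = hashlib.sha256(canonical(record)).hexdigest()
    return True


# Demo values; the master key and agent id are placeholders.
key = hkdf_sha256(b"demo-master-key", b"agent-7")
chain = []
for i, verdict in enumerate(["ALLOW", "BLOCK", "ALLOW"]):
    append(chain, key, {"decision_id": i, "agent_id": "agent-7", "verdict": verdict})
```

In this sketch, editing the verdict of the middle record or dropping it makes verify return False. Removing records from the tail is not caught by the chain alone; detecting that would need an external anchor such as a periodically published head hash.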

4. Fail-closed by construction

If the gate is unreachable, the default verdict is BLOCK. Fail-open is opt-in, per-client, and itself a recorded policy decision. The economic incentive (a fail-open client forgoes coverage for that action class) drives the right default.
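One way to picture the fail-closed default is a client-side wrapper like the following. The function name decide, the fail_open flag, and the shape of the returned dictionary are assumptions made for this sketch; the real SDK may expose this differently.

```python
from typing import Callable


def decide(call_gate: Callable[[], str], fail_open: bool = False) -> dict:
    """Return the gate's verdict, or a default verdict if the gate cannot be reached."""
    try:
        return {"verdict": call_gate(), "source": "gate"}
    except (ConnectionError, TimeoutError):
        # Gate unreachable: BLOCK unless the client explicitly opted in to fail-open.
        if fail_open:
            return {"verdict": "ALLOW", "source": "fail-open"}
        return {"verdict": "BLOCK", "source": "fail-closed"}


def unreachable_gate() -> str:
    # Stand-in for a network call that cannot reach the gate.
    raise ConnectionError("gate unreachable")
```

The source field is there so that a fail-open ALLOW can itself be written to the audit record, as the text above requires.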

What CAA is not

Not a model evaluation. CAA does not score the LLM. It evaluates what the agent attempted to do, regardless of which model produced the request.
Not observability. Observability tells you what happened. Attestation tells you what was prevented, by which control, with cryptographic proof.
Not policy generation. CAA enforces policies; it does not author them. Neuravant ships a baseline catalogue (47 policies covering OWASP LLM Top 10 and MITRE ATLAS); customers extend with domain policies.

Reference architecture

┌──────────┐   tool call      ┌────────────────┐   verdict   ┌──────────┐
│  Agent   │ ───────────────► │ Enforcement    │ ──────────► │  Tool    │
│ (any LLM)│                  │   Gate         │   (allow)   │  exec    │
└──────────┘                  │                │             └──────────┘
                              │  • match       │
                              │  • sign        │   (block/hold) → exception
                              │  • chain       │
                              └───────┬────────┘
                                      │ append
                                      ▼
                           ┌──────────────────────┐
                           │  Audit Chain         │
                           │  (HMAC-linked)       │  ── verifiable by
                           │                      │     auditor / claims
                           └──────────────────────┘

Measured performance (April 2026, pilot tier): p50 = 2.1 ms, p99 = 3.1 ms at 200 RPS sustained; p99 = 44 ms at 1 000 RPS burst. Zero errors across 11 000 evaluation calls. Throughput: >4 000 decisions/sec/core.

Standards mapping

The shipped policy catalogue v1.0 maps to:

OWASP LLM Top 10 (2025) — full coverage LLM01–LLM10
MITRE ATLAS — selected adversarial-ML techniques (T0010, T0011, T0015, T0019, T0020, T0024, T0044, T0048, T0049, T0051)
NIST AI RMF 1.0 — Manage 4.1 (incident response), Measure 2.7 (security), Govern 1.5 (decisions documented)
UK AI Safety Institute Inspect framework — runtime control evidence layer

Open methodological questions

We are publishing CAA as an open methodology, not a closed product spec, because the following benefit from external scrutiny:

Canonical record format. Is flat JSON canonicalisation sufficient, or is RFC 8785 / a registered CBOR profile preferable for cross-vendor interoperability?
Key custody. HSM-backed signing vs. threshold schemes vs. transparency logs (Sigstore-style). Trade-offs differ by regulatory regime.
Cross-organisation chain attestation. When agents from two organisations interact (agent-to-agent commerce, MCP servers), whose chain is canonical? We propose mutual co-signing; alternatives welcome.
Negative-evidence claims. A clean chain proves no covered breach occurred — a stronger claim than "we didn't notice one." How should regulators treat this evidentiary asymmetry?
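To make the canonical record format question above concrete, the sketch below shows why some canonical form is needed at all, and where a simple sorted-key JSON form stops short. The canonical() helper is an assumption for illustration; it is not RFC 8785 and not necessarily what the reference implementation does.

```python
import hashlib
import json


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical(record: dict) -> str:
    # Assumed "flat JSON" canonical form: sorted keys, compact separators.
    return json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# The same logical record, built with keys in a different order.
a = {"tool": "shell.exec", "verdict": "BLOCK"}
b = {"verdict": "BLOCK", "tool": "shell.exec"}

naive_differs = json.dumps(a) != json.dumps(b)            # key order leaks into the bytes
canonical_agrees = digest(canonical(a)) == digest(canonical(b))

# Numeric edge case that sorted keys alone do not settle.
int_form = canonical({"n": 1})
float_form = canonical({"n": 1.0})
```

Sorted keys are enough for flat records of strings, but int_form and float_form differ here, whereas a scheme with fixed number serialisation rules such as RFC 8785 would emit the same bytes for both. Differences of this kind matter once two vendors' verifiers must agree on a signature.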

Reference implementation

A working reference implementation is available as a Python SDK:

pip install neuravant-enforce

Source, test suite (124+ tests, end-to-end), and audit-chain verifier (nail-verify) are open for inspection. The SDK is the same gate Neuravant uses for its own underwriting evidence; there is no separate "professional version."

Live audit dashboard: /enforce.html

Invitation to collaborate

Neuravant is actively seeking research collaboration on the open questions above with academic groups, government safety institutes, and standards bodies. We will co-author, share datasets (suitably anonymised), and publish negative results.

Contact: research@neuravant.ai · Dillman Hunte, Founder

Cite as:
Hunte, D. (2026). Continuous Agent Attestation: Methodology Overview v1.0. Neuravant AI Limited. Released under CC-BY 4.0. https://neuravant.ai/continuous-attestation.html