Anthropic wrote the spec for agent identity. Most teams can't build it.

Anthropic wrote the spec for agent identity. It reads like a build plan for a system most teams don't have.

Buried on page 9 of Anthropic's new Zero Trust for AI Agents is the most honest sentence I've read about agent security this year:

"Agents often operate with elevated privileges or service accounts, and traditional identity systems designed for human users struggle to accommodate them. This mismatch creates exploitable security gaps."

That's the whole problem in two sentences, and almost nobody is saying it out loud.

For thirty years we built identity systems for humans. A person logs in, gets a role, holds a session, leaves a trail. Every IAM product, every SSO flow, every audit log assumes a human on the other end. Then we handed those same systems to software that runs a thousand actions a minute, interprets ambiguous instructions, calls tools we didn't anticipate, and never sleeps. The fit was never going to be clean.

In practice, "give the agent access" usually means one of three things: a service account, a long-lived API key pasted into a config file, or worse, an OAuth flow where scopes are pre-determined and the agent ends up impersonating a real human identity. All three are identity hand-me-downs. All three work in the demo. And all three are exactly what fails the moment an agent does something it wasn't supposed to.

Why the mismatch is dangerous (not just untidy)

Anthropic's Zero Trust for AI Agents is blunt about the threat. Prompt injection — getting an agent to follow an attacker's instructions instead of yours — reaches 100% success rates in research, with prompts that transfer across model families. And Anthropic states plainly that LLMs cannot reliably distinguish informational context from actionable instructions. So you should assume that, sooner or later, your agent will be talked into doing the wrong thing using the access you legitimately gave it.

Here's the part that should worry a security team most. When that happens, your existing tooling won't catch it. As Anthropic's Zero Trust for AI Agents puts it on tool-chaining attacks: "because every command executes through trusted binaries under valid credentials, host-centric monitoring sees no malware and the misuse goes undetected." No exploit. No payload. No anomaly your EDR recognizes. Just an authorized identity doing authorized things in an unauthorized order.

You can't patch your way out of that. The only lever left is what the agent is able to do in the first place — what OWASP and Anthropic call least agency: restricting not just what an identity can access, but what each tool can do, how often, and where.
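
To make "what each tool can do, how often, and where" concrete, here is a minimal sketch of a least-agency check in Python. It is an illustration under my own assumptions, not anything taken from Anthropic's document: the names ToolPolicy and authorize, the 60-second window, and the example "crm" tool are all hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ToolPolicy:
    """Hypothetical per-tool limits: what it may do, where, and how often."""
    allowed_actions: frozenset
    allowed_targets: frozenset
    max_calls_per_minute: int
    _calls: list = field(default_factory=list)  # timestamps of recent allowed calls

    def permits(self, action, target, now=None):
        now = time.monotonic() if now is None else now
        # Keep only the calls that fall inside the sliding 60-second window.
        self._calls = [t for t in self._calls if now - t < 60]
        if action not in self.allowed_actions:
            return False  # "what": this action was never granted
        if target not in self.allowed_targets:
            return False  # "where": this resource is out of bounds
        if len(self._calls) >= self.max_calls_per_minute:
            return False  # "how often": rate limit reached
        self._calls.append(now)
        return True


def authorize(policies, tool, action, target, now=None):
    """Deny by default: a tool with no policy entry is never callable."""
    policy = policies.get(tool)
    return policy is not None and policy.permits(action, target, now)


# Assumed example policy: the agent may only read CRM accounts, twice a minute.
policies = {
    "crm": ToolPolicy(frozenset({"read"}), frozenset({"accounts"}), max_calls_per_minute=2),
}
```

With this policy, authorize(policies, "crm", "read", "accounts") is allowed, while a delete, a read against another table, or any call to a tool that has no entry is refused. The point of the sketch is that the decision is made per call, outside the agent, rather than once at login.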

What Anthropic actually prescribes

To its credit, the document doesn't stop at naming the gap. Part III is a detailed spec for an identity and access stack built for agents instead of humans. Walk just one of the tier tables — agent identity — and you can feel the weight of it:

Foundation: every agent gets a unique, cryptographically-rooted identifier, not just a label. Identity forgery has to be genuinely hard.
Enterprise: X.509 certificates per agent, with full lifecycle management — issuance, rotation, revocation.
Advanced: hardware-bound credentials with attested issuance, so stolen credential material can't be exported from a compromised host.

Then the credential table piles on: short-lived tokens from an identity provider, expiring in minutes, refreshed automatically, never embedded in code. And a line that quietly invalidates how most agent integrations work today:

"Static API keys and shared service-account passwords are among the first things an attacker with model-assisted code analysis will find; they are no longer a legitimate entry point, not even at Foundation."

If you're running API keys with rotation, Anthropic's Zero Trust for AI Agents tells you to "treat it as a known gap."
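
For contrast with a static key, here is a rough sketch of what "short-lived, expiring in minutes" looks like mechanically. It uses only the Python standard library and a hand-rolled HMAC-signed token, which is my simplification for illustration; a real deployment would get tokens from an identity provider in a standard format. SIGNING_KEY, mint_token, verify_token, and the five-minute default are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: the signing key lives with the issuer, never with the agent.
SIGNING_KEY = b"demo-only-key"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def mint_token(agent_id, scope, ttl_seconds=300, now=None):
    """Issue a signed token that names one agent, one scope, and an expiry."""
    now = time.time() if now is None else now
    claims = {"sub": agent_id, "scope": scope, "exp": now + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(sig)


def verify_token(token, now=None):
    """Return the claims if the signature is valid and the token is unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None  # forged or tampered
    claims = json.loads(payload)
    if now >= claims["exp"]:
        return None  # expired: the agent has to ask for a fresh one
    return claims
```

A token minted with mint_token("agent-7", "crm:read") verifies for five minutes and then stops working on its own. A leaked copy is worth minutes, not months, which is the property a rotated API key never quite gives you.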

That's just two tables. There are more — access control with deny-by-default and ABAC, privilege scoping with JIT/JEA, resource isolation, immutable audit with full request-chain traceability.

The spec is a stack, not a setting

Read those eight pages as an engineer and the real cost comes into focus. None of this is a checkbox. It's an identity provider, a secrets manager, a policy engine, a certificate authority, and an audit pipeline — wired together and applied across every agent you run and every system it can reach. That's a multi-quarter infrastructure project, and it has to be maintained as your agents and integrations multiply.

So most teams won't build it. They'll read the framework, nod, and go paste the service-account key, the very thing the document says is among the first things an attacker will find. The gap between the spec and the day-to-day reality is enormous, and it's widening as agents ship faster than the controls around them.

Where this has to live

Here's where I've landed while building Hodor in this exact space: agents need their own identity layer, and it has to sit at the boundary — between the agent and the systems it calls. That's not an aesthetic preference. It's the only place least agency is actually enforceable, and the only place the audit trail is real. A poisoned agent won't report itself; the system on the other side of the call has to be the thing that says no and writes it down.
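
"Writes it down" deserves a concrete shape too. Below is a small sketch of a hash-chained audit log, one common way to make a record tamper-evident. It is not Hodor's implementation and not something Anthropic's document specifies; the AuditLog class and its field names are hypothetical, and a production system would also need write-once storage or external anchoring to get from tamper-evident to genuinely immutable.

```python
import hashlib
import json


class AuditLog:
    """Append-only log where each entry commits to the hash of the previous one."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(record):
        # Hash every field except the hash itself, in a stable key order.
        body = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

    def append(self, agent_id, tool, action, decision):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        record = {"agent": agent_id, "tool": tool, "action": action,
                  "decision": decision, "prev": prev_hash}
        record["hash"] = self._digest(record)
        self.entries.append(record)
        return record

    def verify(self):
        """Walk the chain; any edited or reordered entry breaks it."""
        prev_hash = self.GENESIS
        for record in self.entries:
            if record["prev"] != prev_hash or record["hash"] != self._digest(record):
                return False
            prev_hash = record["hash"]
        return True
```

The boundary appends one entry per tool call, allowed or denied. If anyone later rewrites a "deny" into an "allow", or the reverse, verify() returns False, because every later entry was computed over the original hash.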

Securing the agent's runtime is necessary — sandbox it, gate its file and network access, isolate its context. But sandboxing the agent is not the same as securing your Salesforce, your Postgres, or your Gmail from the agent. Those are two different halves of the problem, and the second one is the one that maps to every tier table in Part III.

The productized version

That second half is what we're building at Hodor. A distinct identity per agent. Short-lived tokens. Least-agency policy enforced on every tool call. The real credentials never touching the agent. And one immutable audit log across all of it. The tier tables in Anthropic's Zero Trust for AI Agents, turned into a few clicks instead of five systems stitched together by hand.

Anthropic wrote the spec last month. We'd been building it for a year.

Good to know we read the threat model right.