The threat agentic AI introduces to your credentials

AI coding agents are starting to want production access. Your Claude Code session wants to look at the real database to debug a slow query. Cursor wants to deploy. An MCP server in your IDE wants to read your secrets so it can fill in template variables. The work the agent does is real, and the access it needs to do that work is real.

But the default way of handing it that access (paste your credentials into the agent's environment, drop the Stripe key into .env, hand the model session the same long-lived API token your backend uses) has a structural problem. The credential's exposure surface is now everywhere the AI client touches. It sits in the agent's process memory. It rides along in whatever telemetry the AI provider records. It is one screenshot or one error report from showing up somewhere it should not.

The AI agent features most secrets managers ship today do not change this. They give the AI client a scoped API token, and that token lets the AI read plaintext secret values from your vault via an MCP server. The fence is the token's scope. The credentials themselves still cross into the AI session, fetched on demand instead of pasted in upfront.

SikkerKey took the opposite approach. The AI agent is its own first-class identity in your vault, with its own keypair, its own scope set, and a surface area that structurally does not include plaintext credential reads. It can manage your vault (approving machines, configuring rotation, planting canaries, reading audit logs) without ever holding the value of a credential. Of the major purpose-built secrets managers shipping AI agent integrations today, SikkerKey is the only one where that property holds across the full management surface.

This post explains how the model works.

The bootstrap: one command, key never leaves your laptop

When you provision an AI agent in your SikkerKey vault, three things happen:

A row is created in ai_agents, its own table, distinct from machines. The AI agent lives there from the moment it registers; it is not, and never becomes, a machine.
The dashboard issues a 10-minute bootstrap token carrying the scope set and project allowlist you picked at provisioning.
You run sikkerkey-mcp install <token> on the laptop or container where the AI client lives. The install generates an Ed25519 keypair locally, posts only the public key to SikkerKey, and writes the private key to a file readable only by your user account.
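
The "readable only by your user account" part of the install step can be sketched as follows. This is a minimal illustration for POSIX systems, not SikkerKey's actual installer: the file name and the write_private_key helper are hypothetical, and the 32 random bytes only stand in for an Ed25519 private key seed (deriving the public key would need an Ed25519 library, which is left out here).

```python
import os
import secrets
import stat
import tempfile
from pathlib import Path


def write_private_key(path: Path, key_bytes: bytes) -> None:
    """Create the key file with owner-only permissions (0600) from the start."""
    # O_EXCL refuses to overwrite an existing key. The mode is applied at
    # creation time, so the file is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key_bytes)


# Stand-in for an Ed25519 private key seed, which is 32 random bytes.
seed = secrets.token_bytes(32)
key_dir = Path(tempfile.mkdtemp())
key_path = key_dir / "agent_ed25519.key"  # hypothetical file name
write_private_key(key_path, seed)

# Permission bits of the new file (owner read/write only on POSIX).
mode = stat.S_IMODE(key_path.stat().st_mode)
print(oct(mode))
```

Setting the mode in the same call that creates the file, rather than writing first and restricting afterwards, avoids a window in which another local user could read the key.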

From that point on, every tool call from the AI client to SikkerKey is an Ed25519-signed request, signed with the keypair that lives on your machine and nowhere else. There is no shared API token between the AI client and SikkerKey at any point in the lifecycle. Compromising the AI client's environment gives an attacker the same reach the agent had, but not a token that can be carried to a different environment.

The shared-bearer-token pattern most other vendors ship has the opposite property: a leaked MCP config token is a leaked AI identity. SikkerKey's keypair is bound to the machine that generated it.

The plaintext-blind contract

SikkerKey's secret-reading routes look up authenticated clients in machines. AI agents do not exist in machines; they exist in ai_agents. For an agent, that lookup finds no row, so the request never authenticates on the route and no value is returned. The boundary is structural. There is no "if actor type is AI, refuse" line of code anywhere, because there does not need to be. Two separate tables, two separate auth checks, and the routes that handle secret plaintext can only see one of them.
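
A toy model makes the shape of that boundary easier to see. The sketch below assumes two in-memory dictionaries standing in for the machines and ai_agents tables; the IDs, the stored value, and both function names are invented for illustration and say nothing about SikkerKey's real schema or routes.

```python
from typing import Optional

# Two separate identity tables, as described above. All IDs are made up.
machines = {"machine-1": {"name": "api-server"}}
ai_agents = {"agent-1": {"name": "laptop-agent"}}
secrets_store = {"secret-1": "s3cr3t-value"}


def read_secret(client_id: str, secret_id: str) -> Optional[str]:
    """Secret-read route: authenticates against `machines` only."""
    if client_id not in machines:
        # No agent-specific branch here: an agent ID is simply absent
        # from the only table this route consults.
        return None
    return secrets_store.get(secret_id)


def agent_whoami(client_id: str) -> Optional[dict]:
    """Agent-surface route: authenticates against `ai_agents` only."""
    return ai_agents.get(client_id)


print(read_secret("machine-1", "secret-1"))
print(read_secret("agent-1", "secret-1"))
```

Note that read_secret never inspects an actor type. The agent is refused for the same reason an unknown client would be: its ID is not in the table the route looks at.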

When the MCP server starts up and the AI client connects, the first message the model sees is the SikkerKey contract: this identity is distinct from machine identities; no tool returns the plaintext of an existing secret; write actions take plaintext as input, encrypt it server-side, and do not round-trip the value back.

Concretely:

manage_secrets.create accepts a plaintext value, encrypts it under a per-secret data key wrapped by your project's master key, and returns {id, name}. The new value never appears in the response.
manage_secrets.update_value follows the same shape. Plaintext in, {id, version} out.
manage_secrets.rotate accepts no plaintext. The new value is generated server-side from a charset and length you specify, applied, and a {id, version} response returned. The agent triggers the rotation but never sees the result.
manage_secrets.list and manage_secrets.get return metadata only. There is no value field in either response shape; there is no flag, query parameter, or alternate route that adds one.
Canary planting works the same way. The server generates the canary's value (64 random characters), wraps it in envelope encryption, and returns {id, name}. The agent that plants the canary cannot read what it just planted, which is the correct behavior. Canary values should only be discoverable by an attacker scraping your real credentials.

The one explicit exception is manage_temporary_secrets.create: a one-shot self-destructing share-link credential intended for an AI agent to deliver a credential to a human recipient (e.g. "share this onboarding password with the new hire"). Opening the link from anywhere destroys the secret. That is the only path on the AI surface where the agent ever returns a credential, and it returns it to be passed along, not consumed.

22 scopes, locked at provisioning

SikkerKey's AI agent surface exposes roughly 100 actions across 16 tool categories: machines, AI agents, projects, secret metadata, secret writes, per-machine grants, canary secrets, access policies, audit log, alerts, webhooks, IP allowlist, trash, enrollment tokens, support, and self-introspection.

Each action is gated by one or two named scopes (machines.read, projects.secrets.write, audit.read, alerts.write, and so on). You pick those scopes when you provision the agent. The scope set is locked the moment the agent registers. Neither the agent itself nor any peer AI agent can widen it. Widening requires the human-owned dashboard.

This is not a runtime "is this a privilege escalation?" check. The scope-mutation endpoints are simply not on the AI surface. They live in the dashboard, where humans go to make permission decisions. The boundary is structural again. There is no API path for an AI agent to grant itself or a peer more access.
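
As a rough sketch of what "gated by one or two named scopes" and "not on the AI surface" could look like together: the scope names below come from this post, but the action names and the action-to-scope mapping are invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical actions mapped to the scopes they require.
ACTION_SCOPES = {
    "machines.list": frozenset({"machines.read"}),
    "secrets.rotate": frozenset({"projects.secrets.write"}),
    "alerts.create_from_audit": frozenset({"audit.read", "alerts.write"}),
}


@dataclass(frozen=True)
class Agent:
    name: str
    scopes: FrozenSet[str]  # fixed at registration in this sketch


def is_allowed(agent: Agent, action: str) -> bool:
    """Allow only known actions whose required scopes are all granted."""
    required = ACTION_SCOPES.get(action)
    if required is None:
        # Actions missing from the table, such as any scope-mutation
        # action, cannot be routed at all.
        return False
    return required <= agent.scopes


agent = Agent("laptop-agent", frozenset({"machines.read", "audit.read"}))
print(is_allowed(agent, "machines.list"))
print(is_allowed(agent, "agents.update_scopes"))
```

There is no escalation check in is_allowed. A request to change scopes fails for the plain reason that no such action exists in the table the agent surface dispatches from.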

Defenses that fire when the agent misbehaves

The plaintext-blind property closes the highest-risk failure mode. The other defenses cover the rest.

Approval gate. A newly bootstrapped agent lands as pending. It can authenticate (the signature verifies), but every gated route returns 403 until you approve it from the dashboard. There is no path on the AI surface for an agent to approve itself.
IP allowlist. If your vault has IP allowlisting enabled, AI agents are gated by it identically to machines. An agent's signed request from an IP outside the allowlist returns 403 before any scope check.
Canary secrets. You can plant canaries in your projects as deliberate trip-wires. A machine reading one freezes the project before the response is built. AI agents can configure canaries but cannot read them, so any read against a canary is real signal, never an AI-side false positive.
Named audit attribution. Every action the AI agent performs lands in the audit log with the agent's identity recorded by name. The agent's own whoami tool returns the last 20 entries attributed to it, so the model can introspect what it just did. Your dashboard's audit view shows the same trail under the same name.
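
The order in which the first two gates fire can be sketched as a single function. Some of this is assumption: the post says pending agents get 403 and that the IP check runs before any scope check, but the relative order of the approval and IP checks, the reason strings, and the treatment of an empty allowlist as "disabled" are choices made for this illustration.

```python
import ipaddress
from typing import Iterable, Tuple


def gate(status: str, client_ip: str, allowlist: Iterable[str],
         granted: frozenset, required: frozenset) -> Tuple[int, str]:
    """Return (http_status, reason) for a request whose signature verified."""
    if status != "approved":
        return 403, "agent is " + status
    networks = [ipaddress.ip_network(cidr) for cidr in allowlist]
    # An empty allowlist is treated as "allowlisting disabled" in this sketch.
    address = ipaddress.ip_address(client_ip)
    if networks and not any(address in net for net in networks):
        return 403, "ip not allowlisted"
    if not required <= granted:
        return 403, "missing scope"
    return 200, "ok"


needs = frozenset({"audit.read"})
print(gate("pending", "10.0.0.5", ["10.0.0.0/8"], needs, needs))
print(gate("approved", "203.0.113.9", ["10.0.0.0/8"], frozenset(), needs))
print(gate("approved", "10.0.0.5", ["10.0.0.0/8"], needs, needs))
```

In the second call the agent also lacks the required scope, but the reason reported is the IP, because that check runs first.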

Revocation is a click

If you lose trust in an agent for any reason (your laptop was stolen, the AI client misbehaved, the scope you granted turned out to be wider than you needed), revoke from the dashboard. The agent's row is marked revoked; every subsequent signed request returns 403 immediately. The local install on the laptop becomes inert. Other agents on other devices are unaffected.

Compare that with the shared-bearer-token model: revocation often means rotating the token, which means re-pasting the new token into every AI client config that used it. That can be hours of operational work, and until the rotation lands the old token is still valid for any attacker who copied it.

In SikkerKey's model, revocation is one row update, instant, and isolated to the agent you're killing.
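
A minimal sketch of that isolation, assuming one row per agent with a status field (the table layout, agent IDs, and function names are hypothetical):

```python
agents = {
    "agent-laptop": {"status": "approved"},
    "agent-ci": {"status": "approved"},
}


def revoke(agent_id: str) -> None:
    """Revocation as a single row update."""
    agents[agent_id]["status"] = "revoked"


def handle_request(agent_id: str) -> int:
    """Return an HTTP status; assumes the request signature already verified."""
    row = agents.get(agent_id)
    if row is None or row["status"] != "approved":
        return 403
    return 200


before = handle_request("agent-laptop")
revoke("agent-laptop")
after = handle_request("agent-laptop")
other = handle_request("agent-ci")
print(before, after, other)
```

Because each agent has its own keypair and its own row, nothing shared has to be rotated, and no other client config has to change.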

What this gives you

The threat model for AI agents in your stack is not theoretical. The AI provider sees session traffic. The local agent process has whatever permissions the user account that runs it has. Any token the AI holds is a token an attacker who reaches the agent's environment also holds.

SikkerKey is built around the AI agent as a separate identity class with structurally bounded reach. The agent gets enough authority to be useful at managing your vault: approving machines, rotating secrets, planting canaries, reading audit history. It never holds the actual credentials your production systems trust. When you revoke the agent, the worst case is the bounded set of administrative actions it could have performed with the scopes you gave it. The credentials themselves never crossed into the AI session.

The bootstrap is one command. The revocation is one click. Everything in between is signed, scoped, and logged.