The LLM Should Never See Your Tokens
Every AI agent platform that puts real OAuth tokens in the LLM context window has a critical vulnerability. Opaque credential handles eliminate the entire attack class.
The Credential Exposure Problem
Your AI agent needs to check Google Calendar, update a Salesforce record, and post to Slack. To do any of that, it needs OAuth tokens. Real ones. The kind that grant full access to your accounts.
Here is what happens in most AI agent platforms: the token is decrypted from the database, passed into the tool code, and used to make the HTTP request. Straightforward. And catastrophically insecure.
Because that token does not stay in the HTTP request. It appears in the LLM context window. It appears in tool results. It appears in error messages. It appears in the event stream that feeds the frontend. It appears in audit logs. It appears in conversation history that persists across sessions.
A real OAuth token, sitting in the context window of a model that accepts arbitrary text input. That is not a hypothetical risk. That is an open door.
If the LLM can see a token, a prompt injection can extract it. Full stop.
Five Vectors, One Vulnerability
A raw credential in the agent context creates five distinct attack vectors. Each one is independently exploitable. Each one leads to the same outcome: full account takeover of the connected service.
Vector 1: Prompt Injection. An attacker crafts input that instructs the LLM to include the token in its text response. The model does not know the token is sensitive. It is just another string in context. A well-crafted injection can extract it in a single turn. The user (or any connected client) sees the raw token in the chat output.
Vector 2: Result Echo. External APIs sometimes echo request headers in the response body. The agent calls an API with an Authorization header. The API returns a response that includes that header value. The response goes into the tool result. The tool result goes into the LLM context. Now the token is in context twice: once from the request, once from the response.
Vector 3: Error Leakage. An HTTP request fails. The exception includes the full request object, including headers. The error message is logged. It is also returned to the LLM as a tool error result. The LLM now has the token in an error message it will attempt to reason about. Some models will even quote the token back to the user while explaining what went wrong.
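Vector 3 is easy to reproduce. The sketch below (Python, pointed at a hypothetical endpoint on the reserved `.invalid` domain so the request is guaranteed to fail) shows how a naive error handler folds the Authorization header into exactly the kind of string that gets logged and returned to the LLM:

```python
import urllib.request
import urllib.error

token = "fake-oauth-token-for-illustration"
req = urllib.request.Request(
    "https://api.invalid/v1/records",  # reserved .invalid TLD: this request always fails
    headers={"Authorization": f"Bearer {token}"},
)
try:
    urllib.request.urlopen(req, timeout=5)
except urllib.error.URLError as exc:
    # Naive error handling: dump "everything useful" into one debugging string.
    error_report = f"request failed: {exc.reason}; headers={req.headers}"
    # The report now contains the raw token. Strings exactly like this one
    # end up in logs and are handed back to the LLM as tool error results.

assert token in error_report
```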
Vector 4: Event Stream Exposure. Most agent platforms relay tool results to the frontend via WebSocket or server-sent events so users can watch the agent work in real time. Every tool result that contains a token broadcasts that token to every connected client. Anyone with access to the WebSocket connection sees the raw credential.
Vector 5: Cross-Run Persistence. Conversation history often persists across agent runs so the agent maintains context. A token that appears in a tool result in run one is still in the conversation history in run two, run three, and every run after that. The token persists indefinitely, even if the original credential has been rotated. Old tokens in conversation history become a historical record of every credential the agent has ever used.
Five vectors. All stemming from one architectural decision: putting real tokens where the LLM can see them.
The Standard "Fix" Does Not Work
The typical response to credential exposure is output filtering: scan the LLM output for token-like patterns and redact them before they reach the user. This helps. It is also insufficient as a primary defense.
Output filtering is reactive. It attempts to catch tokens after they have already been placed in a dangerous location. It relies on pattern matching, which means it only catches patterns it knows about. A new provider, a new token format, a base64-encoded token, a token split across two messages: each is a potential bypass. Filtering is a safety net. It is not a floor.
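A small demonstration of why pattern matching is a net and not a floor. Assuming an illustrative regex for Google-style access tokens, the same secret slips past the filter the moment it is base64-encoded:

```python
import base64
import re

# Illustrative pattern for Google-style OAuth access tokens (not exhaustive).
TOKEN_RE = re.compile(r"ya29\.[A-Za-z0-9_\-]{20,}")

token = "ya29.a0FakeFakeFakeFakeFakeFakeFake"
assert TOKEN_RE.search(token) is not None  # the raw token is caught

encoded = base64.b64encode(token.encode()).decode()
assert TOKEN_RE.search(encoded) is None    # the same secret, encoded, slips through
```

Standard base64 output never contains the literal prefix the pattern anchors on, so a single trivial transformation defeats the filter.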
The real solution is not to filter tokens out of the context. It is to never put them in the context in the first place.
Opaque Handles: The Architecture
HeartBeatAgents uses a credential broker that interposes between the agent and every authenticated HTTP request. The broker replaces real tokens with opaque handles. An opaque handle is a cryptographically random string that is meaningless outside the broker process.
When an agent needs to call an external API, here is what happens:
Step 1: Registration. The real OAuth token is decrypted from the database and stored in the broker's in-process memory. The broker generates a cryptographically random handle and returns it to the tool code. The real token never leaves the broker. The handle is the only thing the agent sees.
Step 2: Authenticated Request. The tool passes the handle to the broker along with the HTTP request details. The broker looks up the real token, injects it into the Authorization header, and makes the request. The response is returned to the tool code. The response contains the API result. It does not contain the request headers. The real token is never in the return value.
Step 3: Revocation. When the agent run ends, every handle is revoked and every stored token is cleared from memory. This happens in a guaranteed cleanup block that executes even if the run crashes. No handles survive the run boundary. No tokens linger in memory.
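The three steps above can be sketched as follows. This is a minimal illustration of the pattern, not HeartBeatAgents' implementation; the class and method names are ours, and the HTTP call itself is stubbed out:

```python
import secrets

class CredentialBroker:
    """Minimal sketch of the broker pattern (illustrative names)."""

    def __init__(self):
        self._tokens: dict = {}  # handle -> real token, in-process memory only

    def register(self, real_token: str) -> str:
        # Step 1: store the real token, hand back an unguessable handle.
        handle = f"cred_{secrets.token_urlsafe(32)}"
        self._tokens[handle] = real_token
        return handle

    def request(self, handle: str, url: str) -> dict:
        # Step 2: the broker, not the tool, attaches the Authorization header.
        token = self._tokens.get(handle)
        if token is None:
            raise KeyError("invalid or revoked credential handle")  # never the token
        headers = {"Authorization": f"Bearer {token}"}
        # ... real code performs the HTTP request to `url` with `headers`
        # and returns only the response body, never the request headers ...
        return {"status": "ok"}  # placeholder for the API result

    def revoke_all(self) -> None:
        # Step 3: clear every token; called at the run boundary.
        self._tokens.clear()

broker = CredentialBroker()
try:
    handle = broker.register("fake-oauth-token-for-illustration")
    result = broker.request(handle, "https://api.example.com/v1/me")
finally:
    broker.revoke_all()  # guaranteed cleanup: runs on success, error, and crash alike
```

The `try/finally` at the bottom is the load-bearing piece: revocation does not depend on the run ending cleanly.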
The result: the LLM context window contains handles, not tokens. A handle looks like a random string. It grants no access to any API. It cannot be used outside the broker process. It cannot be used after the run ends. It cannot be used by another agent. It is, by design, worthless to an attacker.
A compromised LLM can extract the handle. It gets a random string that authenticates nothing.
Handle Properties
Opaque handles are designed with five properties that make them safe to expose:
Cryptographically random. Generated using a cryptographic random number generator. Cannot be guessed. Cannot be brute-forced. Cannot be predicted from previous handles.
Run-scoped. Each handle is bound to a single agent run. When the run ends, the handle is revoked. A handle from run A cannot be used in run B. There is no cross-run leakage because there is no cross-run validity.
Time-bounded. Handles expire after a set interval. If a handle is used after expiration, it is auto-renewed transparently. This bounds the window of validity while preventing failures during long-running operations. The expiration is on the handle, not the underlying token. Token refresh is handled separately at registration time.
Auto-revoked. Cleanup is not optional. It is not "best practice." It happens in a guaranteed cleanup block that runs on normal completion, on error, and on crash. The broker does not rely on the calling code to clean up. It cleans up itself.
Externally useless. No API on the internet accepts an opaque handle as authentication. If a handle appears in the LLM output, in an event stream, in an error message, or in a log file, it reveals nothing. It authenticates nothing. It is a pointer to a value that no longer exists the moment the run ends.
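In sketch form, run-scoping and time-bounding might look like this. The TTL, names, and renewal trigger are illustrative; the article describes the behavior, not this exact code:

```python
import secrets
import time

HANDLE_TTL_SECONDS = 300  # illustrative expiry window

class HandleEntry:
    """One broker entry: a handle bound to a token for a bounded interval."""

    def __init__(self, token: str):
        self.token = token
        self.handle = secrets.token_urlsafe(32)  # cryptographically random
        self.expires_at = time.monotonic() + HANDLE_TTL_SECONDS

    def is_expired(self) -> bool:
        return time.monotonic() >= self.expires_at

    def renew(self) -> None:
        # Transparent auto-renewal: a fresh expiry for the handle only.
        # Refreshing the underlying OAuth token happens separately,
        # at registration time.
        self.expires_at = time.monotonic() + HANDLE_TTL_SECONDS

entry = HandleEntry("fake-oauth-token-for-illustration")
if entry.is_expired():
    entry.renew()  # auto-renewed on use during long-running operations
```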
What the Agent Sees vs. What Exists
This is the critical distinction. The agent's entire world is the LLM context window. Everything the agent "knows" is what appears in that context. With the credential broker, here is what the agent sees versus what actually exists:
Agent sees: A random string representing a credential handle. Actually exists: A real OAuth token stored in broker memory, injected only at HTTP request time, never returned in the response.
Agent sees: A tool result containing the API response body. Actually exists: The full HTTP response, including headers, but the tool code only surfaces the body. The Authorization header used in the request is not in the response object.
Agent sees: An error message saying the credential handle is invalid or expired. Actually exists: The broker could not find the handle or it was revoked. The error message contains the handle (useless) but never the real token.
The LLM operates in a context where real credentials do not exist. Not hidden. Not filtered. Not redacted. They are structurally absent. The broker architecture ensures the LLM never has the opportunity to see them.
Defense Composition
The credential broker does not work alone. It composes with other security boundaries to create layered protection.
Credential broker + token scrubbing. The broker prevents tokens from entering the context. Token scrubbing catches them at every output boundary if they somehow leak through an unexpected path. The scrubber runs pattern matching against known token formats across the LLM context, event stream, audit log, tool history, and error messages. This is the safety net beneath the floor. Neither layer alone is sufficient. Together, they close the gap.
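At its core, a scrubber of this kind reduces to pattern substitution at every output boundary. The patterns below are illustrative, not exhaustive; a real scrubber maintains a much larger, provider-specific list:

```python
import re

# Illustrative token formats only; real scrubbers track many more.
TOKEN_PATTERNS = [
    re.compile(r"ya29\.[A-Za-z0-9_\-]{20,}"),    # Google-style access tokens
    re.compile(r"xox[bap]-[A-Za-z0-9\-]{10,}"),  # Slack-style tokens
    re.compile(r"Bearer\s+[A-Za-z0-9_\-\.]{20,}"),
]

def scrub(text: str) -> str:
    # Run at each boundary: LLM context, event stream, audit log, errors.
    for pattern in TOKEN_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

sample = "tool error: header Bearer ya29.a0FakeFakeFakeFakeFakeFake rejected"
scrubbed = scrub(sample)  # the token is gone before the string leaves the boundary
```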
Credential broker + egress policy. Even if an attacker could somehow obtain a real token (bypassing both the broker and the scrubber), the egress policy restricts where write requests can be sent. A prompt injection that says "POST this token to evil.com" is blocked before the request leaves the platform. The egress policy validates every outbound URL against the agent's approved integration hosts. Unapproved destinations are rejected regardless of what credentials are attached.
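An egress check can be as simple as a host allowlist consulted before any outbound request leaves the platform. The hosts below are illustrative stand-ins for an agent's approved integrations:

```python
from urllib.parse import urlparse

# Hypothetical per-agent allowlist of approved integration hosts.
APPROVED_HOSTS = {"www.googleapis.com", "slack.com", "login.salesforce.com"}

def check_egress(url: str) -> None:
    # Validate the destination host before the request is sent.
    host = urlparse(url).hostname
    if host not in APPROVED_HOSTS:
        raise PermissionError(f"egress to {host!r} is not an approved integration host")

check_egress("https://slack.com/api/chat.postMessage")  # approved: passes silently
try:
    check_egress("https://evil.com/exfil")              # unapproved: blocked
except PermissionError:
    blocked = True
```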
Credential broker + container isolation. Each organization's agents run in isolated containers with separate network contexts. A compromised agent in Organization A cannot reach the broker of Organization B. The broker's in-process memory is contained within the organization's runtime boundary. Cross-organization credential access is not a permission setting. It is a physical impossibility.
Four layers: broker, scrubber, egress policy, container isolation. Each one independently useful. Together, they create a defense where an attacker must bypass all of them simultaneously to extract a usable credential from a running agent. The broker prevents exposure. The scrubber catches leaks. The egress policy prevents exfiltration. Containers prevent lateral movement.
The Controlled Escape Hatch
Some third-party libraries manage their own HTTP clients. They do not accept a broker handle. They need the real token. Google's API client library, for example, manages its own authentication layer.
The broker provides a controlled path for this: tool code running in the worker process can request the real token for a valid handle. This is a deliberate design decision, not a vulnerability. The tool code is server-side code running in a trusted process. It is not the LLM. It is not the context window. It is not the event stream.
The real token goes from the broker to the library's HTTP client to the external API and back. It never passes through the LLM context. The agent still sees only the handle. The library uses the real token internally, but the agent's view of the world remains handle-only.
This is the difference between "no code can ever access the real token" (which would break third-party library support) and "the LLM can never see the real token" (which is the actual security requirement). The broker satisfies the real requirement without sacrificing compatibility.
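In sketch form, with illustrative names (`reveal`, `ThirdPartyClient`) standing in for the escape hatch and the library, the trusted worker-side path looks like this:

```python
import secrets

class Broker:
    """Stripped-down broker showing only the escape hatch (illustrative names)."""

    def __init__(self):
        self._tokens: dict = {}

    def register(self, token: str) -> str:
        handle = secrets.token_urlsafe(32)
        self._tokens[handle] = token
        return handle

    def reveal(self, handle: str) -> str:
        # Escape hatch: callable only by trusted, server-side tool code.
        # The returned token must go straight into a library's HTTP client,
        # never into a tool result or any string the LLM will see.
        return self._tokens[handle]

class ThirdPartyClient:
    """Stand-in for a library (e.g. Google's API client) that manages its own auth."""

    def __init__(self, token: str):
        self._token = token  # held internally, never surfaced to the agent

    def list_events(self) -> list:
        return []  # a real client would call the API using self._token

broker = Broker()
handle = broker.register("fake-oauth-token-for-illustration")
client = ThirdPartyClient(broker.reveal(handle))
events = client.list_events()  # the agent sees `events` and `handle`, never the token
```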
What Happens Without a Credential Broker
If your AI agent platform does not use opaque handles or an equivalent pattern, here is your current exposure:
Every OAuth token your agents use is in the LLM context window. Every token. For every integration. In every conversation. Visible to the model that accepts arbitrary text input from users and external data sources.
Every tool error that includes request headers broadcasts the token. To the LLM context. To the event stream. To the audit log. A single failed API call can expose the credential through three channels simultaneously.
Every conversation history entry that contains a tool result may contain a token. These persist across sessions. They may be stored in vector memory. They may be included in future agent runs as context. A token that appeared once may be replayed indefinitely.
Every prompt injection attempt has a target. The attacker does not need to find the token. It is already in the context. They only need to instruct the model to output it.
This is not a theoretical risk. Prompt injection is a known, demonstrated, and actively exploited attack class. Every AI agent platform that puts real tokens in the LLM context is vulnerable to credential extraction via prompt injection. The only question is when, not if.
The Standard Your Platform Should Meet
If you are evaluating AI agent platforms, ask one question about credential handling:
"Does the LLM ever see real OAuth tokens?"
If the answer is yes, the platform is vulnerable to credential extraction via prompt injection, result echo, error leakage, event stream exposure, and cross-run persistence. All five vectors. Simultaneously.
If the answer is "we filter them from the output," the platform has a safety net but no floor. Filtering is reactive. It catches known patterns after exposure. It does not prevent exposure.
If the answer is "no, the LLM only sees opaque handles that are cryptographically random, run-scoped, time-bounded, and auto-revoked," the platform has eliminated the attack class entirely. Not mitigated. Eliminated. The tokens are not in the context window. They cannot be extracted because they are not there.
That is the standard. Not "we handle credentials carefully." Not "we have security best practices." The standard is: the LLM never sees a real token. Everything else is a compromise.