You're building a feature. A user calls your API, your API authenticates them, and then your service needs to call three downstream services to assemble the response. One of those services keeps an audit log — it records which user triggered each operation. So your team wires it up: when your API calls downstream, it forwards the user's token. Simple enough. Except the downstream service starts rejecting requests. The token your API holds was issued for your service, not for the downstream one. It has the wrong audience. It carries scopes the downstream service doesn't understand. And it was never meant to leave your hands.
This is the moment most engineers realize that moving identity through a chain of services is not the same as moving a token. OAuth 2.0 Token Exchange, defined in RFC 8693, is the standard that solves exactly this problem. It gives a service a formal, auditable way to trade one token for another: one scoped to the right audience, carrying the right claims, and preserving the right identity context.
The problem with forwarding tokens
OAuth 2.0 has always been good at one thing: a user grants a client permission to act on their behalf. The user authenticates, the authorization server issues a token, and the client presents that token to a resource server. A single hop.
Modern systems are rarely a single hop. An API gateway sits in front of a business service, which calls an inventory service, which calls a pricing service, which calls a fulfillment engine. Each hop is a service-to-service call. Each downstream service may need to know who the original user was. And each one has different trust requirements, different scopes, and potentially a different authorization server.
Three approaches get used in practice, and all three have problems. Forwarding the user's token directly exposes it to every downstream service, broadens the blast radius if any one of them is compromised, and often fails because the token's aud (audience) claim names only the first service. Minting a service token using client credentials loses the user's identity entirely: the audit log sees a service account, not a person. Issuing the user's token with every possible scope and audience up front violates least privilege at the source.
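The audience failure in the first approach is easy to see in a small sketch. It assumes the downstream service has already verified the token's signature and decoded its claims into a dict; the function name and the example URLs are illustrative and do not come from any particular library.

```python
def audience_matches(claims: dict, expected_audience: str) -> bool:
    """Return True if the token's aud claim names this service.

    Per RFC 7519, aud may be a single string or a list of strings.
    """
    aud = claims.get("aud")
    if aud is None:
        return False
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return expected_audience in audiences


# Hypothetical claims from a token issued for the front-door API only.
forwarded = {"sub": "user-7829", "aud": "https://api.example.com"}

# The inventory service looks for its own identifier and does not find it.
print(audience_matches(forwarded, "https://inventory.example.com"))
# The service the token was issued for accepts it.
print(audience_matches(forwarded, "https://api.example.com"))
```

A forwarded token fails this check at every service except the one it was issued for, which is the rejection described in the opening scenario.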
What you need is a way to say: "I hold a valid token proving this user is authenticated. Please issue me a new token, scoped for this specific downstream service, that still carries the user's identity." That is precisely what Token Exchange provides.
How Token Exchange works
Token Exchange introduces a new OAuth grant type: urn:ietf:params:oauth:grant-type:token-exchange. A service sends a standard POST to the token endpoint, but instead of an authorization code or client credentials, it presents an existing token and asks for a new one.
The request includes a subject token (subject_token): the token representing the user or identity on whose behalf the exchange is happening. It also carries a subject token type (subject_token_type) that tells the authorization server what kind of token it is receiving: an access token, an ID token, a SAML assertion, and so on. Optionally, the request specifies the desired audience (audience, the downstream service the new token will be presented to), the desired scope (scope), and the type of token to be returned (requested_token_type). For delegation, it can also carry an actor_token and actor_token_type identifying the party that is acting.
The authorization server validates the subject token, checks that the requesting client is permitted to perform an exchange, applies its own policy about what the new token should contain, and issues a fresh token. The response looks like a standard OAuth token response: an access_token, a token_type, and an issued_token_type that tells the caller what kind of token it got back.
A minimal exchange request looks like this:
POST /token HTTP/1.1
Host: auth.example.com
Content-Type: application/x-www-form-urlencoded

grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Atoken-exchange
&subject_token=eyJhbGciOi...
&subject_token_type=urn%3Aietf%3Aparams%3Aoauth%3Atoken-type%3Aaccess_token
&audience=https%3A%2F%2Finventory.example.com
&scope=inventory.read
The authorization server replies with a new access token whose aud is https://inventory.example.com and whose scope is restricted to inventory.read. The client presents this token to the inventory service. The inventory service sees a properly scoped token for its own audience, validates it normally, and does not need to know that a token exchange happened.
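The request and response described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library: the helper names are hypothetical, the subject token is the same placeholder used in the request above, and sending the body over HTTPS with client authentication is left out.

```python
from urllib.parse import urlencode

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_body(subject_token: str, audience: str, scope: str) -> str:
    """Form-encode the parameters of a token exchange request."""
    params = {
        "grant_type": GRANT_TYPE,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,
        "scope": scope,
    }
    return urlencode(params)


def extract_issued_token(response: dict) -> str:
    """Return the new token from a decoded JSON token response.

    RFC 8693 requires access_token, issued_token_type and token_type
    in a successful response, so a missing field is treated as an error.
    """
    for field in ("access_token", "issued_token_type", "token_type"):
        if field not in response:
            raise ValueError(f"token response is missing {field!r}")
    return response["access_token"]


body = build_exchange_body(
    subject_token="eyJhbGciOi...",  # placeholder, not a real token
    audience="https://inventory.example.com",
    scope="inventory.read",
)
print(body)
```

The printed body is the same set of parameters as the request above, joined on a single line.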
Impersonation vs. delegation
RFC 8693 draws a sharp distinction between two semantically different operations. Understanding the difference matters a lot when you are designing audit trails and authorization policies.
Impersonation means the newly issued token looks exactly as if it had been issued to the subject directly. The acting service disappears from the token entirely. The downstream service sees only the user. This is appropriate when the user is the relevant identity for every operation in the chain and the intermediate service should be invisible, as in a simple proxy pattern.
Delegation means the new token carries both the user's identity and a record of the acting service. RFC 8693 defines a JWT claim called act (actor) for exactly this purpose. The act claim is a JSON object containing the identity of the party that is currently acting, while the sub (subject) claim continues to identify the original user.
{
  "iss": "https://auth.example.com",
  "sub": "user-7829",
  "aud": "https://inventory.example.com",
  "exp": 1716300000,
  "act": {
    "sub": "https://api-gateway.example.com"
  }
}
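This token tells the inventory service two things: the original user is user-7829, and it is the API gateway that is currently acting on their behalf. The authorization policy at the inventory service can check both. The audit log records both. If the chain goes deeper (the inventory service calls a fulfillment service), the act claim can be nested, recording the full delegation chain. RFC 8693 treats those nested, prior actors as informational only: access control decisions should consider just the token's top-level claims and the current actor named in the outermost act claim.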
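A resource server that wants to log the delegation chain can walk the nested act claims. The sketch below assumes the token has already been validated and decoded into a dict. The function name is hypothetical, and the two-level example token is made up to show what a deeper chain might look like.

```python
def actor_chain(claims: dict) -> list:
    """Return actor identifiers, current actor first, earliest actor last."""
    chain = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act.get("sub", "<unknown>"))
        act = act.get("act")  # a nested act names the previous actor
    return chain


# Hypothetical token presented to a fulfillment service: the inventory
# service is acting now, and the API gateway acted before it.
claims = {
    "sub": "user-7829",
    "aud": "https://fulfillment.example.com",
    "act": {
        "sub": "https://inventory.example.com",
        "act": {"sub": "https://api-gateway.example.com"},
    },
}

chain = actor_chain(claims)
print("user:", claims["sub"])
print("current actor:", chain[0])
print("full chain:", chain)
```

A token with no act claim yields an empty list, which is how an impersonation-style token appears to this function.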
Most security-conscious architectures prefer delegation. If a token with broad impersonation capability is stolen from a service, there is no record of which service was acting. With delegation, each layer is named. The decision between the two sits with the authorization server's policy, and that policy should be deliberate — not a default.
The may_act claim and pre-authorized delegation
One detail in RFC 8693 that often goes unnoticed is the may_act claim. This is a claim placed inside the subject token by the issuer, not the exchanger. It specifies which party — or parties — are permitted to act on behalf of the subject. It is a pre-authorization signal.
When a service presents a subject token that contains a may_act claim, the authorization server can verify that the requesting client is one of the permitted actors before issuing the exchanged token. This prevents arbitrary services from exchanging any token they can get hold of.
{
  "sub": "user-7829",
  "may_act": {
    "sub": "https://api-gateway.example.com"
  }
}
This says: only api-gateway.example.com is authorized to act on behalf of this user. Any other service attempting to exchange this token should be refused. Authorization server implementations are not required to enforce may_act — the spec leaves that as a deployment decision — but in any serious production system it should be treated as a hard constraint, not advisory.
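On the authorization server side, enforcing may_act can be as small as the check below. This is a sketch under stated assumptions rather than any product's implementation: it assumes the client's identifier is directly comparable with may_act.sub, it handles a single permitted actor, and the allow_missing flag is a hypothetical policy switch for subject tokens that carry no may_act claim at all.

```python
def client_may_act(subject_claims: dict, client_id: str,
                   allow_missing: bool = False) -> bool:
    """Decide whether client_id may exchange this subject token.

    Assumption: a missing may_act claim is refused unless the deployment
    explicitly opts in with allow_missing=True.
    """
    may_act = subject_claims.get("may_act")
    if may_act is None:
        return allow_missing
    if not isinstance(may_act, dict):
        return False  # malformed claim: refuse rather than guess
    return may_act.get("sub") == client_id


subject_claims = {
    "sub": "user-7829",
    "may_act": {"sub": "https://api-gateway.example.com"},
}

print(client_may_act(subject_claims, "https://api-gateway.example.com"))
print(client_may_act(subject_claims, "https://reporting.example.com"))
```

Refusing by default when the claim is absent is the stricter reading. A deployment that only wants to validate may_act when it is present would pass allow_missing=True.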
Token types and cross-system interoperability
RFC 8693 is not limited to exchanging OAuth access tokens for OAuth access tokens. It defines a set of standard token type URIs that cover a broader range of formats.
The defined types include urn:ietf:params:oauth:token-type:access_token, urn:ietf:params:oauth:token-type:refresh_token, urn:ietf:params:oauth:token-type:id_token, urn:ietf:params:oauth:token-type:saml1, and urn:ietf:params:oauth:token-type:saml2. RFC 8693 also defines urn:ietf:params:oauth:token-type:jwt for tokens that are JWTs in the sense of RFC 7519.
This means a system can exchange a SAML assertion — common in enterprise identity providers — for a JWT access token. Or an OIDC ID token for a scoped access token. This cross-format exchange is what makes RFC 8693 useful at federation boundaries: where one domain uses SAML and another uses OAuth, Token Exchange is the bridge.
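A cross-format exchange uses the same request shape, with different type identifiers. The sketch below builds the form body for trading a SAML 2.0 assertion for a JWT. The helper names, the audience URL, and the assertion value are placeholders. Whether a given authorization server accepts this particular combination is a matter of its own policy.

```python
from urllib.parse import urlencode

TOKEN_TYPE_PREFIX = "urn:ietf:params:oauth:token-type:"
KNOWN_TYPES = {"access_token", "refresh_token", "id_token", "saml1", "saml2", "jwt"}


def token_type_uri(short_name: str) -> str:
    """Map a short name such as 'saml2' to its RFC 8693 type URI."""
    if short_name not in KNOWN_TYPES:
        raise ValueError(f"unknown token type: {short_name!r}")
    return TOKEN_TYPE_PREFIX + short_name


def build_cross_format_body(subject_token: str, subject_kind: str,
                            requested_kind: str, audience: str) -> str:
    """Form-encode an exchange that changes the token format."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": token_type_uri(subject_kind),
        "requested_token_type": token_type_uri(requested_kind),
        "audience": audience,
    })


body = build_cross_format_body(
    subject_token="PHNhbWxwOl...",  # placeholder for a base64url-encoded assertion
    subject_kind="saml2",
    requested_kind="jwt",
    audience="https://api.example.com",  # hypothetical target API
)
print(body)
```

Rejecting unknown short names early catches the most common mistake with these identifiers, which is a misspelled type URI.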
Where you will encounter Token Exchange in practice
The most common real-world pattern is the on-behalf-of flow in microservices. Service A receives a user request with a token. Service A needs to call Service B, and Service B enforces audience restrictions. Service A exchanges the user's token for a new token scoped for Service B, calls Service B with that token, and Service B validates it against its own audience. The user's identity propagates through the chain. Every hop can be audited. No service accumulates broader permissions than it needs for that single operation.
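The shape of that flow fits in one function. In the sketch below the network calls are passed in as plain callables so the example stays self-contained. Every name here is hypothetical: fake_exchange stands in for the POST to the token endpoint and fake_send for the HTTP call to Service B.

```python
from typing import Callable


def call_on_behalf_of(
    inbound_token: str,
    audience: str,
    scope: str,
    exchange: Callable[[str, str, str], str],
    send: Callable[[str, dict], dict],
) -> dict:
    """Exchange the inbound token, then call downstream with the new one."""
    downstream_token = exchange(inbound_token, audience, scope)
    headers = {"Authorization": f"Bearer {downstream_token}"}
    # Only the exchanged token leaves this service; the inbound one stays here.
    return send(audience, headers)


sent_headers = []


def fake_exchange(subject_token: str, audience: str, scope: str) -> str:
    # Stand-in for the token endpoint: returns a recognizable dummy value.
    return f"exchanged-for-{audience}"


def fake_send(url: str, headers: dict) -> dict:
    # Stand-in for the downstream HTTP call: records what was sent.
    sent_headers.append(headers)
    return {"status": 200}


result = call_on_behalf_of(
    "user-token-for-service-a",
    "https://service-b.example.com",
    "orders.read",
    fake_exchange,
    fake_send,
)
print(result, sent_headers[0])
```

Keeping the exchange and the downstream call in one place makes it harder to forward the inbound token by accident.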
Cloud CI/CD pipelines use a closely related pattern. A pipeline job receives a short-lived OIDC token from the CI platform (GitHub Actions, GitLab, and others all issue these). The job exchanges that token for a cloud provider credential, such as an AWS role assumption or a GCP service account token, scoped only to what that specific job needs. The credential is never stored; it exists only for the duration of the job. This is what is usually called workload identity federation. Google Cloud's Security Token Service implements the exchange as an RFC 8693 request; AWS exposes the same pattern through its own AssumeRoleWithWebIdentity API rather than the RFC 8693 wire format.
API gateways are another natural fit. The gateway is the single entry point for external traffic. It authenticates the inbound token, performs an exchange to obtain a downstream-scoped token, and forwards the new token to the internal service. Internal services never see external tokens, and external tokens never need to carry every possible audience and scope.
Enterprise federation scenarios — where a user authenticates against a corporate SAML identity provider and needs to access a cloud-native OAuth API — are exactly the cross-format interoperability case described in the previous section. Token Exchange is the bridge that avoids forcing the enterprise to replace its identity infrastructure.
Most recently, MCP (Model Context Protocol) server architectures are starting to adopt RFC 8693 for user delegation. An AI agent authenticates as a user against a gateway, and when the gateway fans out to upstream tool servers, it needs to exchange the user's inbound token for a token scoped to each tool server — preserving the user's identity through the call chain rather than substituting a shared service credential.
What the spec leaves open
RFC 8693 is deliberately flexible, and that flexibility is a double-edged quality. The spec defines the wire format and the semantics. It does not define which clients are allowed to request exchanges, which token combinations are valid, how deeply nested delegation chains should be trusted, or how the authorization server should evaluate exchange requests against its policy.
Every production deployment must answer these questions explicitly. An authorization server that accepts any valid token from any authenticated client for exchange is not secure — it widens the attack surface considerably. At minimum, production deployments should require client authentication on exchange requests, validate the may_act claim when present, restrict audience values to a known set, and apply scope reduction (never allow an exchange to widen scope beyond what the subject token already grants).
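Those minimum checks can be written down as a single policy function. The sketch below is illustrative only: the function name and parameters are assumptions, scopes are treated as a space-delimited string in the subject token's scope claim, and scope reduction is simplified to a subset test, whereas a real authorization server may map scopes between audiences.

```python
def evaluate_exchange(
    client_id: str,
    client_authenticated: bool,
    subject_claims: dict,
    audience: str,
    requested_scope: str,
    allowed_audiences: set,
) -> list:
    """Return the reasons to refuse an exchange; an empty list means allow."""
    problems = []
    if not client_authenticated:
        problems.append("client authentication required")

    # Validate may_act when present, as the text above recommends.
    may_act = subject_claims.get("may_act")
    if may_act is not None:
        if not isinstance(may_act, dict) or may_act.get("sub") != client_id:
            problems.append("client not named in may_act")

    if audience not in allowed_audiences:
        problems.append("audience not in allowed set")

    # Scope reduction: the request may narrow scope but never widen it.
    granted = set(subject_claims.get("scope", "").split())
    requested = set(requested_scope.split())
    if not requested <= granted:
        problems.append("requested scope exceeds subject token scope")
    return problems


subject_claims = {
    "sub": "user-7829",
    "scope": "inventory.read orders.read",
    "may_act": {"sub": "https://api-gateway.example.com"},
}
allowed = {"https://inventory.example.com"}

print(evaluate_exchange("https://api-gateway.example.com", True,
                        subject_claims, "https://inventory.example.com",
                        "inventory.read", allowed))
print(evaluate_exchange("https://reporting.example.com", True,
                        subject_claims, "https://billing.example.com",
                        "inventory.write", allowed))
```

Returning every failed check, rather than stopping at the first, gives the audit log a complete picture of why a request was refused.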
Authorization servers that support RFC 8693 — Keycloak, Authlete, PingAM, ZITADEL, and others — each make different default choices about these policies. Reading the authorization server's own documentation on token exchange is not optional; the RFC gives you the grammar, not the safety rules.
Summary
OAuth 2.0 was designed for a single hop: a user, a client, and a resource server. Real systems have many hops. Token Exchange is the mechanism that makes multi-hop identity propagation safe and auditable, without forcing every downstream service to trust every upstream token.
The core insight is that a token is a scoped, audience-bound credential. When identity needs to move to a different scope or a different audience, the right answer is to issue a new credential — not to forward the old one or drop identity altogether. RFC 8693 formalizes that operation, adds the delegation semantics to preserve the full call chain, and supports cross-format exchange so it works at federation boundaries too.
Impersonation makes the acting service invisible. Delegation records it in the act claim. In any system where auditability matters — which is most systems that handle user data — delegation is the right default. The may_act claim lets the original issuer constrain who can act on behalf of a subject, closing the largest hole the spec's flexibility opens.
Token Exchange is not a niche corner of OAuth. It underpins workload identity federation, API gateway fanout, and the bridge between enterprise SAML and cloud-native OAuth. If you are building or operating distributed systems today, you are probably already relying on it, or on something shaped very much like it. Understanding it explicitly makes you a better architect of the systems around it.
Part of the Explained series — concepts in tech, clearly.