What it does

The gateway sits between every AI client in your stack and the upstream model provider (OpenAI, Anthropic, Bedrock, vLLM, your private endpoint). It is wire-compatible with each provider's API; you change the base_url on your existing client and ship.

For each request, the gateway:

Resolves identity from the API key, JWT, or upstream SSO session
Evaluates policy at every phase (ingress, identity, sovereignty, payload-scan, dispatch, response, evidence-emit)
Optionally invokes RelayGate ContextWorkers to rewrite the payload (PII redaction, credential injection, header shaping)
Dispatches to the resolved upstream provider, which may be a public model, an internal R1 agent, or a sovereign endpoint
Signs a flow receipt and emits to the evidence chain

Why drop-in

Every other governance product asks you to install an SDK or write a forwarding wrapper. We explicitly engineered the gateway to require neither. The integration cost for a typical organization is one configuration push; rollback is one configuration push.

Latency budget

Median policy evaluation is under 5 ms. End-to-end gateway hop adds ~12 ms at p50, ~35 ms at p99 versus direct provider call from the same region. The hot path is engineered with no synchronous external calls outside the policy and identity caches.

Provider compatibility

OpenAI — chat, completions, embeddings, assistants v1
Anthropic — messages, tool use, prompt caching
AWS Bedrock — converse, invoke, all hosted model families
Google Vertex AI — generateContent, streamGenerateContent
vLLM and other OpenAI-compatible self-hosted endpoints
Custom upstreams via the gateway plugin SDK

Deployment

SaaS — managed gateway at gateway.relayone.ai; instant; included in Free / Pro
BYOC — deploy in your AWS / GCP / Azure account; we manage; you control the data plane
On-prem — single binary plus PostgreSQL; air-gapped supported

Adjacent reading

Quickstart — five-minute setup
Concepts — how the gateway fits with policy, evidence, and fleet
API reference — provider-compatible endpoints
On-prem posture — the no-cloud integrity story