What it does

The gateway sits between every AI client in your stack and the upstream model provider (OpenAI, Anthropic, Bedrock, vLLM, your private endpoint). It is wire-compatible with each provider's API; you change the base_url on your existing client and ship.

For each request, the gateway:

  • Resolves identity from the API key, JWT, or upstream SSO session
  • Evaluates policy at every phase (ingress, identity, sovereignty, payload-scan, dispatch, response, evidence-emit)
  • Optionally invokes RelayGate ContextWorkers to rewrite the payload (PII redaction, credential injection, header shaping)
  • Dispatches to the resolved upstream provider, which may be a public model, an internal R1 agent, or a sovereign endpoint
  • Signs a flow receipt and emits to the evidence chain

Why drop-in

Every other governance product asks you to install an SDK or write a forwarding wrapper. We explicitly engineered the gateway to require neither. The integration cost for a typical organization is one configuration push; rollback is one configuration push.

Latency budget

Median policy evaluation is under 5 ms. End-to-end gateway hop adds ~12 ms at p50, ~35 ms at p99 versus direct provider call from the same region. The hot path is engineered with no synchronous external calls outside the policy and identity caches.

Provider compatibility

  • OpenAI — chat, completions, embeddings, assistants v1
  • Anthropic — messages, tool use, prompt caching
  • AWS Bedrock — converse, invoke, all hosted model families
  • Google Vertex AI — generateContent, streamGenerateContent
  • vLLM and other OpenAI-compatible self-hosted endpoints
  • Custom upstreams via the gateway plugin SDK

Deployment

  • SaaS — managed gateway at gateway.relayone.ai; instant; included in Free / Pro
  • BYOC — deploy in your AWS / GCP / Azure account; we manage; you control the data plane
  • On-prem — single binary plus PostgreSQL; air-gapped supported

Adjacent reading