by Theo Park · 13 min read · Migration, OpenRouter, OminiGate, Rollout, LLM Gateway

Migrating from OpenRouter to OminiGate: a practical step-by-step

Switching providers without breaking production. Endpoints, model slug mapping, headers, streaming, and rollout strategy when moving traffic from OpenRouter — every fact verified against current docs.

OpenRouter and OminiGate both expose an OpenAI-compatible surface, so moving traffic between them is mostly mechanical — change the base URL, swap the key, adjust a few headers, and verify model slugs. The mistakes happen in the details: a slug that looks plausible but was never listed, an attribution header you forgot to remove, a retired Anthropic model your stack still references. This guide walks through each step with current, source-cited facts as of 2026-04-28.

Before you start, two assumptions: your application talks to OpenRouter through the OpenAI-compatible /v1/chat/completions endpoint (the most common case), and you want a low-risk cutover that you can roll back at the load balancer. If you use the Anthropic-compatible /v1/messages endpoint, OminiGate exposes the same shape at https://api.ominigate.ai; everything else in this guide still applies.

Why teams consider migrating

OpenRouter is a competent gateway and a fine default for many teams. The reasons engineers tell us they evaluate alternatives are pragmatic, not religious:

  • Fee structure. OpenRouter charges 5.5% ($0.80 minimum) on credit purchases for non-crypto payments. Token usage itself is pass-through with no per-call markup (source). For BYOK requests beyond the first 1M/month they currently apply a 5% usage fee that is scheduled to be replaced by a fixed monthly subscription whose price has not yet been announced (source). Teams modeling 12-month spend dislike that uncertainty.
  • Attribution surface. OpenRouter encourages you to send HTTP-Referer and X-OpenRouter-Title so your app appears on its public rankings (docs). Some teams want their internal app names off a public leaderboard.
  • Routing surprises. When a model is not directly listed, OpenRouter may route to a fallback or return a “not available” error. As of writing, google/gemini-3-pro and google/gemini-3.1-pro both return “not available” and direct users to request the model in Discord. The actually-listed slug is google/gemini-3.1-pro-preview.
  • Single-vendor consolidation. Some teams already centralize on OminiGate for image and video models (81 image, 80 video as of 2026-04-28) and want chat, image, and video billing on one ledger.

None of these are a knock on OpenRouter. They are the situations where a one-day migration is worth the engineer-time. If those don’t apply to you, stay put.

Step 1: Switch the base URL

OpenRouter’s OpenAI-compatible base URL is https://openrouter.ai/api/v1 (quickstart docs). OminiGate’s OpenAI-compatible base URL is https://api.ominigate.ai/v1. The Anthropic-compatible surface lives at https://api.ominigate.ai.

If you use the official OpenAI SDK, the change is one line:

client.py

```python
# Before: OpenRouter
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=OPENROUTER_API_KEY,
)

# After: OminiGate
client = OpenAI(
    base_url="https://api.ominigate.ai/v1",
    api_key=OMINIGATE_API_KEY,
)
```

Same for Node:

client.ts

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ominigate.ai/v1",
  apiKey: process.env.OMINIGATE_API_KEY,
});
```

Resist the urge to overload one client object with both providers through a feature flag at the SDK level. Keep two clients during rollout and route at the call site — it makes the rollback line a single config change instead of an SDK reconfiguration.
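If you want the connection details out of the call sites entirely, the two-client pattern can be driven from a small config table. A minimal sketch; the names `PROVIDERS` and `provider_config` are illustrative, not part of either SDK:

```python
# Per-provider connection config; routing happens at the call site,
# keyed off a single llm_provider flag.
PROVIDERS = {
    "openrouter": {
        "base_url": "https://openrouter.ai/api/v1",
        "api_key_env": "OPENROUTER_API_KEY",
    },
    "ominigate": {
        "base_url": "https://api.ominigate.ai/v1",
        "api_key_env": "OMINIGATE_API_KEY",
    },
}

def provider_config(flag: str) -> dict:
    """Return the connection config for the active provider flag.

    Unknown flags fall back to OpenRouter, so a bad config value
    degrades to the pre-migration path instead of erroring.
    """
    return PROVIDERS.get(flag, PROVIDERS["openrouter"])
```

The fallback choice is deliberate: during rollout, a typo in the flag should land traffic on the incumbent provider, not raise.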

Step 2: Replace the API key

OminiGate keys are prefixed sk-omg- (visible in OminiGate’s docs and console). Generate a new key from the dashboard rather than reusing your OpenRouter key — the formats are not interchangeable, and you want a clean cost line on day one.

Two operational notes:

  • Add the key as a new environment variable (e.g. OMINIGATE_API_KEY) instead of overwriting OPENROUTER_API_KEY. During rollout you will want both.
  • If you operate per-tenant keys, mint OminiGate keys 1:1 against the existing OpenRouter keys so audit logs line up.

Step 3: Trim the headers

OpenRouter’s app-attribution headers are HTTP-Referer (required for app attribution; without it “no app page will be created and your usage will not appear in rankings”) and X-OpenRouter-Title (optional; X-Title is still accepted for backwards compatibility) (App Attribution docs). There is also X-OpenRouter-Categories for marketplace category tagging.

OminiGate does not have leaderboards, so none of these headers do anything on the OminiGate side. Strip them when calling OminiGate — sending unrecognized headers is harmless, but leaving them in dual-routed code makes diffs noisier than they need to be.

request-headers.diff

```diff
  headers: {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
-   "HTTP-Referer": "https://your-app.example.com",
-   "X-OpenRouter-Title": "Your App",
  }
```

If your dual-routing layer hits both providers behind a feature flag, keep the headers behind a per-provider config block instead of stripping them globally — OpenRouter still wants them for as long as you send a single token to that path.
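One way to keep that per-provider scoping honest is to build headers from the provider flag instead of hardcoding them at each call site. A hypothetical helper — `request_headers` and the example app values are illustrative:

```python
def request_headers(provider: str, api_key: str) -> dict:
    """Build request headers for the given provider.

    The OpenRouter attribution headers are added only on the
    OpenRouter path; OminiGate requests get the bare minimum.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if provider == "openrouter":
        # Attribution headers (hypothetical app values).
        headers["HTTP-Referer"] = "https://your-app.example.com"
        headers["X-OpenRouter-Title"] = "Your App"
    return headers
```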

Step 4: Map model slugs

This is where most migrations break, so this section is exhaustive. Every row below was verified against either the upstream provider’s official deprecation page or the live model page on each gateway as of 2026-04-28.

Anthropic

Two important facts: (1) Anthropic deprecated claude-sonnet-4-20250514 and claude-opus-4-20250514 on 2026-04-14 with retirement scheduled for 2026-06-15; (2) Anthropic’s recommended replacements are claude-sonnet-4-6 and claude-opus-4-7 respectively (Anthropic deprecation table). If your code still pins Claude Sonnet 4 or Opus 4 base versions, map to the 4.6 / 4.7 line during this migration; you would have had to do it within 60 days anyway.

anthropic-slug-map

```text
OpenRouter slug                          OminiGate slug
──────────────────────────────────────   ─────────────────────────────────
anthropic/claude-opus-4.7             →  anthropic/claude-opus-4.7
anthropic/claude-opus-4.6             →  anthropic/claude-opus-4.6
anthropic/claude-opus-4.5             →  anthropic/claude-opus-4.5
anthropic/claude-sonnet-4.6           →  anthropic/claude-sonnet-4.6
anthropic/claude-sonnet-4.5           →  anthropic/claude-sonnet-4.5
anthropic/claude-haiku-4.5            →  anthropic/claude-haiku-4.5
anthropic/claude-3.7-sonnet (RETIRED) →  anthropic/claude-sonnet-4.6
anthropic/claude-3.5-sonnet (legacy)  →  anthropic/claude-sonnet-4.6
anthropic/claude-3.5-haiku (RETIRED)  →  anthropic/claude-haiku-4.5
```

Sources: OminiGate model list (verified 2026-04-28); OpenRouter product pages for claude-opus-4.7 and claude-sonnet-4.6; Anthropic’s deprecation table cited above for retirement status.

One subtlety: anthropic/claude-3.5-sonnet still appears on OpenRouter, but the underlying weights claude-3-5-sonnet-20241022 were retired by Anthropic on 2025-10-28. OpenRouter’s router papers over this, but new code should target claude-sonnet-4.6.

OpenAI

OpenAI shipped GPT-5.5 on 2026-04-23, with API availability the next day (launch post). It is the current frontier model in the GPT-5 family. GPT-5.4 is still active and frequently used as the latency/cost workhorse.

openai-slug-map

```text
OpenRouter slug                  OminiGate slug
──────────────────────────────   ──────────────────────────────
openai/gpt-5.5                →  openai/gpt-5.5
openai/gpt-5.5-pro            →  openai/gpt-5.5-pro
openai/gpt-5.4                →  openai/gpt-5.4
openai/gpt-5.4-mini           →  openai/gpt-5.4-mini
openai/gpt-5.4-nano           →  openai/gpt-5.4-nano
openai/gpt-5.4-pro            →  openai/gpt-5.4-pro
openai/gpt-5                  →  openai/gpt-5
openai/gpt-5-pro              →  openai/gpt-5-pro
openai/gpt-5-mini             →  openai/gpt-5-mini
openai/o3                     →  openai/o3
openai/o4-mini                →  openai/o4-mini
```

Sources: OpenRouter pages for openai/gpt-5.5 and openai/gpt-5; OminiGate model list (verified 2026-04-28).

Google

This is the most error-prone family. Gemini 3.1 Pro shipped on 2026-02-19 (model card), but on the gateways it is exposed under a -preview suffix. The non-preview slugs do not exist: google/gemini-3.1-pro on OpenRouter returns “not available”.

google-slug-map

```text
OpenRouter slug                            OminiGate slug
────────────────────────────────────────   ────────────────────────────────────────
google/gemini-3.1-pro-preview             →  google/gemini-3.1-pro-preview
google/gemini-3.1-pro-preview-customtools →  google/gemini-3.1-pro-preview-customtools
google/gemini-3.1-flash-lite-preview      →  google/gemini-3.1-flash-lite-preview
google/gemini-3.1-flash-image-preview     →  google/gemini-3.1-flash-image-preview
```

Sources: OpenRouter listings under google/gemini-3.1-pro-preview; OminiGate model list (verified 2026-04-28).

If your code currently uses google/gemini-2.5-pro, update both the OpenRouter side and the OminiGate side to the 3.1 preview before cutover — do not introduce a version bump and a provider change in the same deploy.

DeepSeek

DeepSeek released DeepSeek-V4 in two preview variants on 2026-04-24 (DeepSeek changelog). The legacy deepseek-chat and deepseek-reasoner aliases will be retired on 2026-07-24, 15:59 UTC, so this mapping is time-sensitive.

deepseek-slug-map

```text
OpenRouter slug                  OminiGate slug
──────────────────────────────   ──────────────────────────────
deepseek/deepseek-v4-pro      →  deepseek/deepseek-v4-pro
deepseek/deepseek-v4-flash    →  deepseek/deepseek-v4-flash
deepseek/deepseek-v3.2        →  deepseek/deepseek-v3.2
deepseek/deepseek-v3.2-exp    →  deepseek/deepseek-v3.2-exp
deepseek/deepseek-chat        →  deepseek/deepseek-v4-flash  (chat retiring 2026-07-24)
```

Sources: OpenRouter page deepseek/deepseek-v4-pro; OminiGate model list (verified 2026-04-28).

Meta Llama

meta-slug-map

```text
OpenRouter slug                                OminiGate slug
────────────────────────────────────────────   ────────────────────────────────────────────
meta-llama/llama-4-maverick                →   meta-llama/llama-4-maverick
meta-llama/llama-4-scout                   →   meta-llama/llama-4-scout
meta-llama/llama-3.3-70b-instruct          →   meta-llama/llama-3.3-70b-instruct
meta-llama/llama-3.1-70b-instruct          →   meta-llama/llama-3.1-70b-instruct
```

Source: OminiGate model list (verified 2026-04-28); Llama 4 release background from Meta’s Llama 4 announcement.

Reality check: do not let an LLM “helpfully” rewrite your slug list. Ship the literal strings above, then run a smoke test that asserts the model name returned in each response equals the slug you sent.
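That smoke test can be a few lines. A sketch, assuming your response objects expose the standard OpenAI `model` field; `SLUG_MAP` below is a subset of the remapped rows from the tables above, and slugs that carry over unchanged fall through as identity:

```python
# Explicit remaps only; everything else maps to itself.
SLUG_MAP = {
    "anthropic/claude-3.7-sonnet": "anthropic/claude-sonnet-4.6",
    "anthropic/claude-3.5-sonnet": "anthropic/claude-sonnet-4.6",
    "anthropic/claude-3.5-haiku": "anthropic/claude-haiku-4.5",
    "deepseek/deepseek-chat": "deepseek/deepseek-v4-flash",
}

def target_slug(openrouter_slug: str) -> str:
    """OminiGate slug for a given OpenRouter slug (identity by default)."""
    return SLUG_MAP.get(openrouter_slug, openrouter_slug)

def check_response_model(sent_slug: str, response_model: str) -> None:
    """Smoke-test assertion: the gateway echoed back the slug we sent."""
    if response_model != sent_slug:
        raise AssertionError(f"sent {sent_slug!r}, got {response_model!r}")
```

Run `check_response_model` over one live response per model family before any traffic shifts.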

Step 5: Verify streaming behavior

Both gateways implement the OpenAI Server-Sent Events protocol: chunks are data: {...} lines terminated by data: [DONE]. The OpenAI SDK’s built-in stream iterator handles both endpoints without modification. Two practical gotchas:

  • Final usage block. If your billing or analytics code relies on the usage object that comes in the final chunk, send stream_options: { include_usage: true } on both providers (this is OpenAI standard). Without it, the final chunk has no token counts.
  • Provider passthrough fields. OpenRouter sometimes surfaces upstream-specific fields (e.g. provider, cache_read_input_tokens for Anthropic). OminiGate surfaces the standard OpenAI fields plus Anthropic cache_creation_input_tokens / cache_read_input_tokens when calling Anthropic-family models. If you persist response payloads, account for slight schema drift in non-billing fields.

Run the same prompt at temperature 0 against both providers, diff the streamed text and the token counts, and gate the cutover on byte-equal output for a small canonical suite.
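For reference, the wire format both gateways speak can be sketched as a minimal SSE chunk parser. This is illustrative only — the OpenAI SDK already does this for you — but it makes the `data:` / `[DONE]` framing concrete:

```python
import json

def parse_sse_stream(lines):
    """Yield decoded JSON chunks from OpenAI-style SSE lines.

    Each event is a `data: {...}` line; the stream ends at the
    `data: [DONE]` sentinel. Non-data lines (comments, keep-alives)
    are ignored.
    """
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)
```

With `stream_options: { include_usage: true }` set, the last chunk yielded before the sentinel is the one carrying the `usage` object.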

Step 6: Reconcile spend and observability

The number that matters is per-1k-token cost, but the line item where you see it differs. OpenRouter shows credit deductions in the dashboard, with token usage pass-through and a 5.5% (non-crypto) surcharge applied at credit purchase. OminiGate is pay-per-token with no per-call markup and no top-up surcharge (OminiGate pricing).

Concrete example: Claude Sonnet 4.6 on OpenRouter is $3/M input and $15/M output (source). On OminiGate the per-token rates are the same passthrough rates; the spend difference shows up at the top-up tier. If you load $1,000 of credits onto OpenRouter via Stripe, you actually pay $1,055 (5.5%); OminiGate bills your credit card directly for what you used.
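The top-up arithmetic is worth encoding once so finance and engineering agree on it. A small sketch of the fee model as described above (5.5% of the purchase, with a $0.80 floor):

```python
def openrouter_topup_cost(credits: float) -> float:
    """Total card charge for an OpenRouter non-crypto credit purchase:
    the credits themselves plus max(5.5% of credits, $0.80)."""
    return credits + max(credits * 0.055, 0.80)
```

So a $1,000 top-up charges $1,055, while a $10 top-up charges $10.80 because the minimum fee kicks in below roughly $14.55 of credits.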

For observability, both providers return the standard OpenAI fields: usage.prompt_tokens, usage.completion_tokens, usage.total_tokens. If you log response.id for traceability, note that the prefix is provider-specific — do not assume a fixed format.

Rollout strategy

A safe cutover for a non-trivial production system looks like this:

  1. Day 0 — Dual-write, single-read. Add a feature flag llm_provider with values openrouter | ominigate. Default to openrouter. Add structured logs with provider, model slug, latency, token counts.
  2. Day 1 — Shadow traffic. For 1-5% of requests, fire the same prompt to both providers in parallel, return the OpenRouter response, log the OminiGate response and a diff. Run for 24-72 hours. Investigate any case where token counts differ by more than a small floor (a few tokens of tokenizer drift is normal, double counts are not).
  3. Day 3 — 10% canary. Flip 10% of users to OminiGate. Watch error rate, p95 latency, and cost per call.
  4. Day 5-7 — Ramp 25 / 50 / 100%. Each step waits for two business-hour windows of clean metrics.
  5. Day 14 — Decommission. Once 100% has been stable for a week, remove the OpenRouter code path and rotate the OpenRouter key.
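The shadow-traffic comparison in step 2 can be a pure function over the two logged responses. A sketch, assuming each response has been reduced to its text and total token count (the field names and the drift floor are illustrative choices, not a spec):

```python
TOKEN_DRIFT_FLOOR = 5  # a few tokens of tokenizer drift is normal

def shadow_diff(primary: dict, shadow: dict) -> list:
    """Compare the served (OpenRouter) response against the shadowed
    (OminiGate) one; return a list of findings worth investigating."""
    findings = []
    if primary["text"] != shadow["text"]:
        findings.append("text mismatch")
    drift = abs(primary["total_tokens"] - shadow["total_tokens"])
    if drift > TOKEN_DRIFT_FLOOR:
        findings.append(f"token count drift of {drift}")
    return findings
```

An empty findings list per request for 24-72 hours is the exit criterion for the shadow phase.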

Have a one-button rollback: flip llm_provider back to openrouter at the config layer, no redeploy. If you can’t roll back in under 60 seconds, you haven’t finished Step 1.

Pre-cutover checklist

  • Slug audit. Grep your repo for every model string that hits the gateway. Check each against the mappings in Step 4. Pay special attention to anything containing claude-sonnet-4-2025, claude-opus-4-2025, claude-3-5-sonnet, or gemini-3.1-pro without -preview.
  • Header audit. Confirm HTTP-Referer, X-OpenRouter-Title, and X-Title are scoped to the OpenRouter call site only.
  • Streaming smoke test. Run the same temperature-0 prompt at both providers, byte-diff the response.
  • Tool / function calling. If you use OpenAI tool calling, run one tool-calling fixture per model family.
  • Anthropic cache headers. If you depend on prompt caching, verify cache_creation_input_tokens and cache_read_input_tokens appear in usage.
  • Rate limit headers. Both providers return rate limit information. Diff your assumed schema against actual response headers.
  • Billing reconciliation script. A 5-line script that sums logged total_tokens per day and matches it against provider invoices, both during and after the migration window.
  • Rollback flag tested. Toggle llm_provider back and forth in staging. Confirm provider-specific code paths handle re-route mid-session.
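The reconciliation script really can be tiny. A sketch, assuming structured logs where each record carries a day stamp, a provider name, and the total_tokens from the response (the record shape is illustrative):

```python
from collections import defaultdict

def tokens_per_day(log_records):
    """Sum logged total_tokens per (day, provider) pair for matching
    against provider invoices.

    Each record is assumed to look like:
    {"day": "2026-04-28", "provider": "ominigate", "total_tokens": 1234}
    """
    totals = defaultdict(int)
    for rec in log_records:
        totals[(rec["day"], rec["provider"])] += rec["total_tokens"]
    return dict(totals)
```

Diff the output against each provider's dashboard totals daily during the migration window, then weekly after.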

Sources

All facts in this guide are traceable to the URLs below, all verified 2026-04-28.

Frequently asked questions

Can I keep both OpenRouter and OminiGate live in the same app during rollout?

Yes, that’s the recommended path. Use a feature flag at the call site so you can flip individual percentages of traffic between providers without redeploying. Both expose OpenAI-compatible endpoints, so the only per-provider divergence in the request layer is the base URL, the API key, and the OpenRouter attribution headers (HTTP-Referer / X-OpenRouter-Title) which you should scope to the OpenRouter call path only.

What happens if my code still references claude-sonnet-4 or claude-opus-4?

Both base versions were officially deprecated by Anthropic on 2026-04-14 with a retirement date of 2026-06-15 (source: platform.claude.com/docs/en/docs/about-claude/model-deprecations). The recommended replacements are claude-sonnet-4-6 and claude-opus-4-7 respectively. On both OpenRouter and OminiGate the slugs are anthropic/claude-sonnet-4.6 and anthropic/claude-opus-4.7. Migrate to those before mid-June regardless of which gateway you’re on.

Why does google/gemini-3.1-pro fail on OpenRouter?

OpenRouter only lists the preview slug as of 2026-04-28 — google/gemini-3.1-pro-preview. Hitting google/gemini-3.1-pro returns ‘not available’ and a Discord link to request the model. OminiGate uses the same -preview suffix, so the slugs map 1:1 once you switch to the preview form.

Does OminiGate have a per-token markup like the OpenRouter top-up fee?

No. OminiGate is pay-per-token with no subscription, no minimum, and no top-up surcharge (source: ominigate.ai/en/pricing). OpenRouter’s 5.5% non-crypto fee is a payment-processing platform fee on credit purchases, not a per-token markup; OpenRouter passes through provider token pricing without markup. The per-1k-token unit price is the same across both gateways for any given upstream model.

Will my streaming code need changes?

Almost never. Both gateways implement the OpenAI SSE protocol with data: chunks ending in data: [DONE]. The OpenAI SDK’s built-in stream iterator works against both. If you need the final usage object, send stream_options: { include_usage: true } — that’s OpenAI standard and both providers honor it. Watch out for non-billing fields (provider, cache_*) where schemas differ slightly.

Try every model behind one API key

Sign up in seconds, top up once, and call 400+ text, image, and video models with the OpenAI and Anthropic SDKs you already use.