The hard part of multi-model infrastructure is everything between the request and the response. This is what WayJet engineers on every call — so your code stays one clean integration while the layer earns its keep.
On every call
The work between request and response
Scoring routes (Route A, Route B, Route C): health, latency and price are scored per call, and the best route wins.
Scored routing, not a fixed route
Each call is scored on live provider health, latency and price, then sent down the best route — load-balanced, latency-aware, cost-aware or rule-based. Pin a provider, prefer the cheapest member, or route by header; the policy is config, not a redeploy.
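To make the idea of scoring concrete, here is a minimal sketch of per-call route selection. It is not WayJet's actual scoring function: the `Route` fields, the weights, and the linear formula are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    healthy: bool
    latency_ms: float    # recent observed latency (hypothetical metric)
    price_per_1k: float  # hypothetical unit: USD per 1,000 tokens


def score(route: Route, latency_weight: float = 1.0, price_weight: float = 1.0) -> float:
    """Lower is better; an unhealthy route can never win."""
    if not route.healthy:
        return float("inf")
    # Assumed formula: weighted sum of latency (in seconds) and price.
    return latency_weight * route.latency_ms / 1000 + price_weight * route.price_per_1k


def pick_route(routes, **weights) -> Route:
    """Return the lowest-scoring healthy route, or raise if none is healthy."""
    best = min(routes, key=lambda r: score(r, **weights))
    if score(best, **weights) == float("inf"):
        raise RuntimeError("no healthy route available")
    return best


routes = [
    Route("route-a", healthy=True, latency_ms=420.0, price_per_1k=0.90),
    Route("route-b", healthy=True, latency_ms=250.0, price_per_1k=1.10),
    Route("route-c", healthy=False, latency_ms=120.0, price_per_1k=0.50),
]

balanced = pick_route(routes)                         # latency and price both count
latency_aware = pick_route(routes, price_weight=0.0)  # ignore price entirely
```

Changing the weights changes the winner without changing the calling code, which is the sense in which a routing policy can live in config.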
Provider down? Traffic reroutes automatically.
Stays up when a provider does not
Unhealthy upstreams are detected and circuit-broken; calls retry with backoff and fail over to a healthy provider for the same model. A BYOK leg can fall back to the pool. Your app keeps responding through an outage instead of inheriting it.
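The failover behaviour described above can be sketched as a circuit breaker per provider plus retries with exponential backoff. This is an illustrative sketch under assumed defaults; `CircuitBreaker`, `call_with_failover`, the threshold and the delays are hypothetical names and values, not WayJet's implementation.

```python
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows a probe after `cooldown_s`."""

    def __init__(self, threshold: int = 3, cooldown_s: float = 30.0):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allows(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        return now - self.opened_at >= self.cooldown_s  # half-open probe

    def record(self, ok: bool, now: float) -> None:
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = now


def call_with_failover(providers, breakers, request, retries=2,
                       base_delay_s=0.1, sleep=time.sleep, clock=time.monotonic):
    """Try providers in order. `providers` maps a name to a callable(request)."""
    last_error = None
    for name, send in providers.items():
        breaker = breakers[name]
        for attempt in range(retries + 1):
            if not breaker.allows(clock()):
                break  # circuit is open: skip to the next provider
            try:
                response = send(request)
            except Exception as exc:  # a real client would catch narrower errors
                breaker.record(False, clock())
                last_error = exc
                if attempt < retries:
                    sleep(base_delay_s * 2 ** attempt)  # 0.1s, 0.2s, ...
            else:
                breaker.record(True, clock())
                return name, response
    raise RuntimeError("all providers failed") from last_error
```

Passing `sleep` and `clock` as parameters keeps the sketch testable without real waiting. In a BYOK setup, the fallback to the pool could be modelled as simply another entry in `providers`.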
claude-opus-4.7: ok
gpt-5.1: ok
gemini-3-pro: ok
One dashboard for every model, every call.
Every call, fully observable
Latency, status and spend for every model in one place — broken down per request into routing, upstream and cache segments, so nothing about a call is a black box you have to guess at.
Control, without the maintenance
The rest of the layer
Model groups
Define a virtual model that resolves to the best member by cost, priority or weight. Swap the selection policy without touching a line of your code.
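As a rough illustration of how a virtual model might resolve to a member, the sketch below implements the three selection policies named above. The `Member` fields, the policy strings, and the example costs and weights are assumptions for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    model: str
    cost: float     # hypothetical relative cost
    priority: int   # lower number means preferred
    weight: float   # share of traffic under the "weight" policy


def resolve(members, policy: str, rng=random) -> Member:
    """Pick one member of a model group according to the selection policy."""
    if policy == "cost":
        return min(members, key=lambda m: m.cost)
    if policy == "priority":
        return min(members, key=lambda m: m.priority)
    if policy == "weight":
        # Weighted random choice: heavier members receive more traffic.
        return rng.choices(members, weights=[m.weight for m in members], k=1)[0]
    raise ValueError(f"unknown policy: {policy!r}")


# A hypothetical group; all numbers are made up.
group = [
    Member("claude-opus-4.7", cost=3.0, priority=1, weight=0.2),
    Member("gpt-5.1", cost=2.0, priority=2, weight=0.3),
    Member("gemini-3-pro", cost=1.0, priority=3, weight=0.5),
]

cheapest = resolve(group, "cost")
preferred = resolve(group, "priority")
```

The caller always asks for the group; only the `policy` value changes, so swapping policies needs no change to application code.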
Bring your own keys
Route through your own provider accounts when you want to: keep your committed-spend discounts, let WayJet do the orchestration, and pay only its service fee.
Spend & rate controls
Per-key RPM, TPM, concurrency and daily-spend ceilings, plus organisation budgets — governance built into the layer, enforced before the upstream call, not bolted on after.
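Enforcing limits before the upstream call amounts to an admission check per key. The sketch below covers only an RPM sliding window and a daily-spend ceiling; TPM, concurrency, organisation budgets and the daily reset are omitted, and `KeyGuard` is a hypothetical name.

```python
from collections import deque
from decimal import Decimal


class KeyGuard:
    """Admission check for one API key: requests per minute and daily spend."""

    def __init__(self, rpm: int, daily_ceiling: Decimal):
        self.rpm = rpm
        self.daily_ceiling = daily_ceiling
        self.spent_today = Decimal("0")
        self._stamps = deque()  # timestamps of admitted requests, oldest first

    def admit(self, now: float) -> bool:
        """Return True if a request may go upstream at time `now` (seconds)."""
        # Drop timestamps that have left the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) >= self.rpm:
            return False  # rate ceiling reached
        if self.spent_today >= self.daily_ceiling:
            return False  # spend ceiling reached
        self._stamps.append(now)
        return True

    def record_spend(self, amount: Decimal) -> None:
        """Add the metered cost of a completed call to today's total."""
        self.spent_today += amount


guard = KeyGuard(rpm=2, daily_ceiling=Decimal("1.00"))
```

Because `admit` runs before any provider is contacted, a rejected request costs nothing upstream.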
Response caching
Exact and semantic caching with per-key switches and hit observability — repeat work is served from cache, so you pay upstream cost once, not every time.
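Exact caching can be sketched as a lookup keyed on a hash of the canonicalised request. This example covers only the exact-match case (semantic caching would need an embedding comparison, which is not shown), and `ExactCache` and its counters are hypothetical names.

```python
import hashlib
import json


class ExactCache:
    """Serve byte-identical requests from memory and count hits and misses."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(request: dict) -> str:
        # Sorting keys makes the hash independent of dict insertion order.
        canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_call(self, request: dict, upstream):
        k = self.key(request)
        if k in self._store:
            self.hits += 1
            return self._store[k]
        self.misses += 1
        response = upstream(request)  # the upstream cost is paid only here
        self._store[k] = response
        return response
```

A production cache would also need expiry and size limits; they are left out to keep the sketch short.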
Precise metering
Token cost is computed from the upstream’s own usage figures at catalog prices, using decimal arithmetic, and metering fails closed when a call can’t be priced exactly. OpenRouter-grade accuracy, by design.
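A minimal sketch of decimal, fail-closed pricing follows. The `CATALOG` prices, the usage field names and the `UnpriceableCall` exception are assumptions for illustration, not WayJet's catalog or API.

```python
from decimal import Decimal

# Hypothetical catalog: USD per one million tokens.
CATALOG = {
    "gpt-5.1": {"input": Decimal("1.25"), "output": Decimal("10.00")},
}

TOKENS_PER_UNIT = Decimal(1_000_000)


class UnpriceableCall(Exception):
    """Raised instead of guessing when a call cannot be priced exactly."""


def price_call(model: str, usage: dict) -> Decimal:
    """Price one call from the upstream-reported token counts."""
    prices = CATALOG.get(model)
    if prices is None:
        raise UnpriceableCall(f"no catalog price for {model!r}")
    try:
        counts = (usage["input_tokens"], usage["output_tokens"])
    except KeyError as exc:
        raise UnpriceableCall(f"usage is missing {exc}") from exc
    # Reject floats, booleans, None and negatives rather than coercing them.
    if any(type(n) is not int or n < 0 for n in counts):
        raise UnpriceableCall("token counts must be non-negative integers")
    input_tokens, output_tokens = counts
    return (input_tokens * prices["input"]
            + output_tokens * prices["output"]) / TOKENS_PER_UNIT


cost = price_call("gpt-5.1", {"input_tokens": 1200, "output_tokens": 300})
```

Using `Decimal` avoids the rounding drift that binary floats accumulate over many small charges, and raising on any gap means an unpriced call is never silently billed at a guessed amount.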
Unified usage
One source of truth for spend and volume across every model and key. Query by period, model or key, alongside a prepaid balance that never expires.
Build on the layer, not on one vendor
One API key into every model — routing, failover, observability and controls included.