// one OpenAI-compatible endpoint · many providers · automatic failover

One API for every model.
It routes itself.

Point your OpenAI SDK at one endpoint. CodeBurst sends each request to the best available model across NVIDIA, Mistral, Google, Groq, Cohere and more — and fails over the instant a provider degrades. Your app never sees the outage.

base_url = "https://codeburst.ai/api/v1"
10+ providers · 20+ models & aliases · 1 endpoint · auto failover
Why CodeBurst

A router that heals itself.

Most gateways just proxy your request to one provider. When that provider rate-limits or goes down, you get the error. CodeBurst is built around the assumption that providers fail — and routes around it for you.

🔁

Automatic failover

Every model is a fallback chain, not a single endpoint. The instant a provider rate-limits or degrades, CodeBurst reroutes mid-request to the next healthy lane. Nightly health probes keep the chains fresh, so a dead provider is dropped before your request ever reaches it.
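To make the fallback-chain idea concrete, here is a minimal sketch of the pattern in plain Python. It is an illustration only, not CodeBurst's actual routing code: `ProviderError`, `call_with_failover`, and the two stub lanes are hypothetical names invented for this example.

```python
from typing import Callable, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error raised by a lane that is rate-limited or unhealthy."""


def call_with_failover(
    chain: Sequence[Tuple[str, Callable[[str], str]]], prompt: str
) -> Tuple[str, str]:
    """Try each lane in order and return (lane_name, reply) from the first success."""
    errors = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this lane failed, then fall through to the next one.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all lanes failed: " + "; ".join(errors))


# Stub lanes standing in for real providers (assumptions for the demo).
def rate_limited(prompt: str) -> str:
    raise ProviderError("429 rate limited")


def healthy(prompt: str) -> str:
    return f"echo: {prompt}"


chain = [("lane-a", rate_limited), ("lane-b", healthy)]
print(call_with_failover(chain, "Hello"))  # ('lane-b', 'echo: Hello')
```

The caller only ever sees the reply from the first healthy lane; an error surfaces only if every lane in the chain fails.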

🎯

Task-aware routing

Don't pick a model — pick a job. codeburst-agent for tool-calling, codeburst-vision for images, codeburst-compress for context, codeburst-swarm for hard reasoning. Each alias routes to the model that's measurably best at that task.
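As a rough sketch of what "pick a job" can look like in practice, the snippet below builds a tool-calling request for the codeburst-agent alias using the standard OpenAI `tools` format. The `get_weather` tool and the `build_agent_request` helper are hypothetical, made up for this example.

```python
import json


def build_agent_request(user_message: str) -> dict:
    """Build chat-completions kwargs that target the tool-calling alias."""
    get_weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "codeburst-agent",  # the job, not a specific model
        "messages": [{"role": "user", "content": user_message}],
        "tools": [get_weather_tool],
    }


payload = build_agent_request("What's the weather in Oslo?")
print(json.dumps(payload, indent=2))
```

With an OpenAI SDK client pointed at the CodeBurst endpoint, this would be sent as `client.chat.completions.create(**payload)`.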

🔌

Drop-in OpenAI API

Standard /v1/chat/completions. Change your base_url and key — keep your existing OpenAI SDK, your code, your tooling. Swap models by changing one string, never your integration.
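For anyone not using an SDK, the same call can be sketched as a raw HTTP request with only the standard library. This assumes the endpoint accepts the usual OpenAI-style `Authorization: Bearer` header and JSON body, which is what the OpenAI SDK sends; `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://codeburst.ai/api/v1"


def build_chat_request(api_key: str, model: str, content: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions route."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": content}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed OpenAI-style auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("YOUR_CODEBURST_KEY", "codeburst-best", "Hello")
print(req.get_method(), req.full_url)
# POST https://codeburst.ai/api/v1/chat/completions
# To actually send it: urllib.request.urlopen(req)
```

Swapping models means changing only the `model` string; the URL, headers, and body shape stay the same.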

💸

No markup

Bring your own provider keys and pay the provider directly — CodeBurst adds zero per-token markup. Or hand us the keys and let managed routing pick the best model per request. Your bill, your control.

🧩

Provider breadth

NVIDIA NIM, Mistral, Google Gemini, Groq, Cohere, SambaNova, Cerebras, Cloudflare and more — behind one endpoint. New providers slot into the routing layer without a single change to your app.

🛡️

Reliability patterns built in

Beyond failover: multi-model voting, debate, and size-aware context routing ship as first-class aliases. Reach for them with a model name — no orchestration code on your side.
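To show the kind of orchestration code the voting pattern replaces, here is a minimal client-side sketch of majority voting. It is an assumption about the general technique, not a description of how CodeBurst implements it; `majority_vote` and the stub models are hypothetical.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def majority_vote(models: Sequence[Callable[[str], str]], prompt: str) -> Tuple[str, int]:
    """Ask every model, normalize the answers, and return (winner, vote_count)."""
    answers = [ask(prompt).strip().lower() for ask in models]
    # most_common(1) returns the answer with the most votes; ties go to the
    # answer that appeared first.
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes


# Stub "models" standing in for real provider calls (assumptions for the demo).
models = [lambda p: "42", lambda p: " 42 ", lambda p: "41"]
print(majority_vote(models, "What is 6 * 7?"))  # ('42', 2)
```

With a voting alias, all of this collapses into a single request whose `model` is the alias name.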

Drop-in

Two lines to switch. Zero changes to your code.

If your app already speaks the OpenAI API, you're done in under a minute.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://codeburst.ai/api/v1",   # ← change 1: point the SDK at CodeBurst
    api_key="YOUR_CODEBURST_KEY",             # ← change 2: use your CodeBurst key
)

resp = client.chat.completions.create(
    model="codeburst-best",      # or codeburst-agent / codeburst-vision / codeburst-swarm
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```
Coming from OpenRouter?

Same pattern. More resilience.

CodeBurst speaks the exact same OpenAI-compatible interface — so migrating is a base_url change. What you gain is routing that actively works around provider failures instead of passing them through.

CodeBurst

  • Every model is a multi-provider fallback chain
  • Mid-request reroute when a provider degrades
  • Nightly health probes prune dead lanes automatically
  • Task-aware aliases (agent · vision · compress · swarm)
  • Built-in voting / debate / size-aware context routing
  • BYOK with no markup, or managed routing

Typical router

  • Routes to one provider per request
  • Provider error is returned to your app
  • Manual model selection per call
  • You build orchestration / retries yourself
  • Markup on managed credits
  • Switching models is on you
Models

Popular models, one endpoint.

Call any model by name, or use a smart alias and let CodeBurst route to the best one for the job — with failover already wired in.

Nemotron Super 120B · Qwen3.5 397B · DeepSeek V4 · GPT-OSS 120B · Mistral Large · Mistral Small · Llama 3.3 70B · Llama 4 Scout · Gemini 2.5 Flash · Gemini 2.5 Pro · Command R · Kimi K2 · GLM 5.1
Smart aliases — route by job, not by model
Pick the task; CodeBurst picks (and fails over between) the models that are measurably best at it.

  • codeburst-best: Top-quality reasoning, the strongest model currently up
  • codeburst-agent: Tool-calling & multi-step agents, with tool-format repair built in
  • codeburst-vision: Images such as receipts, screenshots, photos (gpt-4o → gemini → scout)
  • codeburst-swarm: Hard problems, where multiple models vote for a more reliable answer
  • codeburst-compress: Size-aware context compaction that never overflows a model's window
  • codeburst-fast: Low-latency lane for quick, cheap turns
  • codeburst-1m / 2m: Million-token+ context for long documents
  • codeburst-128k / 256k: Pick a context tier explicitly

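If you do want to pick a context tier explicitly, a small helper along these lines could do it. Treat it as a sketch under stated assumptions: the full alias names `codeburst-256k` and `codeburst-2m` are inferred from the shorthand above, the token limits are guesses at what each tier holds, and the four-characters-per-token estimate is a rough heuristic, not a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token)."""
    return max(1, len(text) // 4)


# (assumed token limit, alias). Names and limits are assumptions, see above.
TIERS = [
    (128_000, "codeburst-128k"),
    (256_000, "codeburst-256k"),
    (1_000_000, "codeburst-1m"),
    (2_000_000, "codeburst-2m"),
]


def pick_context_alias(text: str, reserve_for_reply: int = 4_000) -> str:
    """Return the smallest tier that fits the prompt plus room for the reply."""
    needed = estimate_tokens(text) + reserve_for_reply
    for limit, alias in TIERS:
        if needed <= limit:
            return alias
    raise ValueError(f"prompt needs ~{needed} tokens, larger than any tier")


print(pick_context_alias("x" * 40_000))     # codeburst-128k
print(pick_context_alias("x" * 600_000))    # codeburst-256k
print(pick_context_alias("x" * 2_000_000))  # codeburst-1m
```

Choosing the smallest tier that fits keeps short prompts off the long-context lanes; the returned string goes straight into the `model` field.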
One endpoint, many backends

Routing across the providers that matter.

Each request lands on whichever of these is healthiest and best-fit for the job.

NVIDIA NIM · Mistral · Google Gemini · Groq · Cohere · SambaNova · Cerebras · Cloudflare · + more

Stop building retry logic. Start shipping.

One OpenAI-compatible endpoint that routes to the best model and survives provider outages on its own.