How do I use a custom OpenAI-compatible model in LlamaIndex?

Use the OpenAILike LLM with model set to codeburst-agent, api_base set to https://codeburst.ai/api/v1, your CodeBurst key, and is_chat_model and is_function_calling_model set to True. Then set it as Settings.llm or pass it to your query engine or agent.

Why does a LlamaIndex query need failover?

A single question can trigger retrieval, sub-question routing, synthesis and tool calls, which adds up to many model calls. Any one of them can hit a provider rate limit or return an empty tool turn. CodeBurst fails over across providers and repairs empty tool turns so the answer completes.

Do I need to set is_chat_model and is_function_calling_model?

Yes. Set both to True so OpenAILike uses the chat endpoint and enables tool calling against codeburst-agent.

Guide · Updated June 2026

The best LLM for LlamaIndex

In LlamaIndex, a single user question rarely means a single model call. A query engine retrieves nodes, may route sub-questions, and synthesizes across nodes; an agent also calls tools along the way. That adds up to many calls per answer, and if any one of them hits a provider rate limit or returns an empty tool turn, the user is left with half an answer. CodeBurst gives LlamaIndex a model that fails over across providers and repairs empty tool turns, so the query completes.

RAG and data agents are call-heavy

Retrieval-augmented and agentic pipelines multiply the call count: retrieve, re-rank, synthesize, and, for agents, plan and call tools over your data. The richer the pipeline, the more exposed each answer is to a single provider's rate limit or to an empty tool turn that quietly stalls synthesis.
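
A rough way to see the amplification is to compute the chance that at least one call in a pipeline fails. The sketch below assumes every call fails independently with the same probability, which is a simplification (rate limits tend to arrive in bursts), and the 1% figure is purely illustrative, not a measured rate.

```python
def answer_failure_probability(per_call_failure: float, calls: int) -> float:
    """Chance that at least one of `calls` independent model calls fails."""
    if not 0.0 <= per_call_failure <= 1.0:
        raise ValueError("per_call_failure must be between 0 and 1")
    if calls < 0:
        raise ValueError("calls must be non-negative")
    # P(at least one failure) = 1 - P(every call succeeds)
    return 1.0 - (1.0 - per_call_failure) ** calls


if __name__ == "__main__":
    # Illustrative only: a hypothetical 1% per-call failure rate.
    for calls in (1, 5, 20):
        p = answer_failure_probability(0.01, calls)
        print(f"{calls:>2} calls: {p:.1%} chance of at least one failure")
```

Under these assumptions a 1% per-call failure rate becomes about 5% for a five-call answer and about 18% for a twenty-call answer, which is why longer pipelines benefit most from failover.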

What CodeBurst adds

Situation	What CodeBurst does
A synthesis/agent call rate-limits	Reroutes to a healthy provider in the same request; the answer continues.
Empty tool-synthesis turn in a data agent	`codeburst-agent` retries with a corrected format.
A complex question needs more rigor	Use `codeburst-swarm` for a multi-model vote on the synthesis.
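
To make the first two rows concrete, here is a toy sketch of the general failover pattern: try providers in order, move on when one is rate-limited, and treat an empty reply as a failure rather than an answer. It illustrates the idea only and is not CodeBurst's implementation, which the table describes as happening inside the same request; `RateLimited`, `call_with_failover`, and the stub providers are hypothetical names. For simplicity the sketch moves to the next provider on an empty reply instead of retrying with a corrected format.

```python
from typing import Callable, Sequence


class RateLimited(Exception):
    """Hypothetical error a provider stub raises when it is rate-limited."""


Provider = Callable[[str], str]


def call_with_failover(prompt: str, providers: Sequence[Provider]) -> str:
    """Return the first non-empty reply, trying providers in order."""
    errors = []
    for provider in providers:
        name = getattr(provider, "__name__", repr(provider))
        try:
            reply = provider(prompt)
        except RateLimited as exc:
            errors.append(f"{name}: rate-limited ({exc})")
            continue  # fail over to the next provider
        if reply.strip():
            return reply
        errors.append(f"{name}: empty reply")  # an empty turn counts as a failure
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real backends (hypothetical).
def busy_provider(prompt: str) -> str:
    raise RateLimited("429")


def silent_provider(prompt: str) -> str:
    return ""


def healthy_provider(prompt: str) -> str:
    return f"answer to: {prompt}"


if __name__ == "__main__":
    chain = [busy_provider, silent_provider, healthy_provider]
    print(call_with_failover("What changed in Q3?", chain))
```

With a gateway that does this for you, none of this logic needs to live in the LlamaIndex application itself.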

Configure the model

LlamaIndex calls any OpenAI-compatible endpoint via OpenAILike:

```python
from llama_index.llms.openai_like import OpenAILike
from llama_index.core import Settings

llm = OpenAILike(
    model="codeburst-agent",
    api_base="https://codeburst.ai/api/v1",
    api_key="YOUR_CODEBURST_KEY",
    is_chat_model=True,
    is_function_calling_model=True,
)

Settings.llm = llm   # or pass `llm=` to a query engine / agent
```

Your indexes, retrievers and query engines are unchanged; the pipeline just stops depending on a single provider being up for every call.
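
In practice you may prefer not to hard-code the key. The following is a minimal sketch of the same configuration read from the environment; the variable name `CODEBURST_API_KEY` and the helpers `codeburst_llm_kwargs` and `build_llm` are assumptions made for illustration, not names defined by CodeBurst or LlamaIndex. It also assumes the `llama-index-llms-openai-like` package is installed when `build_llm` is called.

```python
import os
from typing import Mapping


def codeburst_llm_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build the OpenAILike keyword arguments used in this guide."""
    # CODEBURST_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = env.get("CODEBURST_API_KEY", "")
    if not api_key:
        raise RuntimeError("Set CODEBURST_API_KEY before building the LLM")
    return {
        "model": "codeburst-agent",
        "api_base": "https://codeburst.ai/api/v1",
        "api_key": api_key,
        "is_chat_model": True,
        "is_function_calling_model": True,
    }


def build_llm(env: Mapping[str, str] = os.environ):
    """Create the OpenAILike client from environment settings."""
    # Imported lazily so the helper above works without LlamaIndex installed.
    from llama_index.llms.openai_like import OpenAILike

    return OpenAILike(**codeburst_llm_kwargs(env))
```

You would then write `Settings.llm = build_llm()`, or pass the result as `llm=` when creating a query engine or agent.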

FAQ

How do I use a custom model in LlamaIndex?
Use OpenAILike(model="codeburst-agent", api_base="https://codeburst.ai/api/v1", api_key=..., is_chat_model=True, is_function_calling_model=True), then set it as Settings.llm or pass it to your query engine or agent.

Why does a query need failover?
One question triggers many model calls, and any of them can fail. CodeBurst fails over across providers and repairs empty tool turns so the answer completes.

Do I need to set the two flags?
Yes. Set is_chat_model and is_function_calling_model both to True.