Guide · Updated June 2026

The best LLM for LlamaIndex

In LlamaIndex, a single user question rarely means a single model call. A query engine retrieves, may route sub-questions, synthesizes across nodes, and, if it's an agent, calls tools along the way. That's many calls per answer, and any one of them hitting a provider rate limit or returning an empty tool turn can leave the user with half an answer. CodeBurst gives LlamaIndex a model that fails over across providers and repairs empty tool turns, so the query completes.

RAG and data agents are call-heavy

Retrieval-augmented and agentic pipelines multiply the call count: retrieve, re-rank, synthesize, and for agents, plan and call tools over your data. The richer the pipeline, the more exposed each answer is to a single provider's rate limit or to an empty tool-synthesis turn that quietly stalls the answer.
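To put a rough number on that exposure, here is a minimal sketch. It assumes every call fails independently with the same probability, which is a simplification, and the 1% rate used in the loop is illustrative rather than measured. The function name is hypothetical.

```python
def answer_failure_probability(per_call_failure: float, calls: int) -> float:
    """Probability that at least one of `calls` independent calls fails.

    Assumes each call fails independently with probability `per_call_failure`.
    """
    if not 0.0 <= per_call_failure <= 1.0:
        raise ValueError("per_call_failure must be between 0 and 1")
    if calls < 0:
        raise ValueError("calls must be non-negative")
    # The answer survives only if every call succeeds: (1 - p) ** n.
    return 1.0 - (1.0 - per_call_failure) ** calls


if __name__ == "__main__":
    # Illustrative 1% per-call failure rate across pipelines of growing size.
    for calls in (1, 5, 20):
        print(calls, round(answer_failure_probability(0.01, calls), 4))
```

Under these assumptions, a 20-call pipeline has roughly an 18% chance of hitting at least one failure, against 1% for a single call.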

What CodeBurst adds

Failure | CodeBurst
A synthesis or agent call is rate-limited | Reroutes to a healthy provider in the same request; the answer continues.
Empty tool-synthesis turn in a data agent | codeburst-agent retries with a corrected format.
A complex question needs more rigor | Use codeburst-swarm for a multi-model vote on the synthesis.
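The guide does not define an "empty tool turn" precisely. As an illustration only, the sketch below assumes it means an assistant message with neither text content nor tool calls, using the OpenAI-style chat message shape. `is_empty_turn` is a hypothetical helper, not part of CodeBurst or LlamaIndex, and it does not describe how CodeBurst detects or repairs such turns.

```python
from typing import Any, Mapping


def is_empty_turn(message: Mapping[str, Any]) -> bool:
    """Return True when an assistant message has no text and no tool calls.

    Assumption: `message` follows the OpenAI-style chat format, with an
    optional string `content` and an optional list `tool_calls`.
    """
    content = message.get("content")
    has_text = isinstance(content, str) and content.strip() != ""
    has_tool_calls = bool(message.get("tool_calls"))
    return not has_text and not has_tool_calls
```

A check like this is mainly useful for logging how often a pipeline would have stalled on a turn like that.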

Configure the model

LlamaIndex calls any OpenAI-compatible endpoint via OpenAILike:

from llama_index.llms.openai_like import OpenAILike
from llama_index.core import Settings

llm = OpenAILike(
    model="codeburst-agent",
    api_base="https://codeburst.ai/api/v1",
    api_key="YOUR_CODEBURST_KEY",
    is_chat_model=True,
    is_function_calling_model=True,
)

Settings.llm = llm   # or pass `llm=` to a query engine / agent

Your indexes, retrievers and query engines are unchanged — the model just stops depending on one provider being up for the whole pipeline.
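A minimal sketch of the same configuration with the key read from the environment instead of hard-coded. The environment variable name `CODEBURST_API_KEY` and both helper functions are assumptions made for illustration; the endpoint, model names and flags are the ones given in this guide.

```python
import os


def codeburst_llm_kwargs(model: str = "codeburst-agent") -> dict:
    """Build the keyword arguments for OpenAILike.

    Assumption: the key is stored in a CODEBURST_API_KEY environment
    variable (a name chosen for this example).
    """
    api_key = os.environ.get("CODEBURST_API_KEY")
    if not api_key:
        raise RuntimeError("Set CODEBURST_API_KEY before building the LLM")
    return {
        "model": model,
        "api_base": "https://codeburst.ai/api/v1",
        "api_key": api_key,
        "is_chat_model": True,
        "is_function_calling_model": True,
    }


def build_llm(model: str = "codeburst-agent"):
    """Construct the LLM; requires the llama-index-llms-openai-like package."""
    # Imported lazily so the kwargs helper works without LlamaIndex installed.
    from llama_index.llms.openai_like import OpenAILike

    return OpenAILike(**codeburst_llm_kwargs(model))
```

Calling `build_llm("codeburst-swarm")` would select the multi-model vote listed in the table above, and the result can be assigned to `Settings.llm` or passed as `llm=` in the same way.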

FAQ

How do I use a custom model in LlamaIndex?
Point OpenAILike at the CodeBurst endpoint: OpenAILike(model="codeburst-agent", api_base="https://codeburst.ai/api/v1", api_key=..., is_chat_model=True, is_function_calling_model=True).

Why does a query need failover?
A single question triggers many model calls. CodeBurst fails over across providers and repairs empty tool turns so the answer completes.

Do I need to set the two flags?
Yes. Set both is_chat_model and is_function_calling_model to True.