Skip to content

docs: BYO OpenAI-compatible gateway (Bifrost/LiteLLM/vLLM) + required vLLM tool-calling flags#2003

Open
amit-chaubey wants to merge 2 commits into
kagent-dev:mainfrom
amit-chaubey:docs/byo-openai-gateway-vllm
Open

docs: BYO OpenAI-compatible gateway (Bifrost/LiteLLM/vLLM) + required vLLM tool-calling flags#2003
amit-chaubey wants to merge 2 commits into
kagent-dev:mainfrom
amit-chaubey:docs/byo-openai-gateway-vllm

Conversation

@amit-chaubey

Copy link
Copy Markdown

Summary

The BYO OpenAI-compatible guide covers hosted providers (Cohere) but not the common
self-hosted pattern: kagent → OpenAI-compatible gateway (Bifrost, LiteLLM) → vLLM.
Two things bite users that aren't documented anywhere:

  1. vLLM must be launched with --enable-auto-tool-choice --tool-call-parser <parser>.
    kagent's declarative runtime unconditionally registers the built-in ask_user tool
    (python/packages/kagent-adk/src/kagent/adk/types.py), so every chat completion
    request carries a non-empty tools array with tool_choice: "auto" — even for agents
    with zero user-configured tools. Without the flags, every agent turn fails with a
    generic provider API error (status 400) that gives no hint of the cause.

  2. ModelConfig.spec.model must be the gateway's routing identifier (often
    provider-prefixed, e.g. vllm/Qwen/...), which differs from the bare HF name
    vLLM serves under.

This PR adds:

  • docs/user-guide/openai-compatible-gateway-vllm.md — end-to-end guide with a
    troubleshooting section mapping the 400 to the missing parser flags
  • docs/user-guide/README.md — user-guide index
  • examples/modelconfig-openai-gateway-vllm.yaml — copy-paste Secret + ModelConfig +
    Agent manifest (validated against the v1alpha2 types on main)

Verified against a live setup: Bifrost gateway fronting vLLM serving
Qwen/Qwen2.5-7B-Instruct with --tool-call-parser hermes.

Test plan

  • Doc steps reproduce against a live gateway + vLLM setup
  • Example YAML applies cleanly against current v1alpha2 CRDs
  • Links resolve (kagent.dev BYO page, vLLM tool-calling docs, issue [FEATURE] vllm modelconfig #959)

Signed-off-by: AmitChaubey <amit.katyayana@gmail.com>
Copilot AI review requested due to automatic review settings June 12, 2026 15:53
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Jun 12, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new user-guide flow for running kagent through an OpenAI-compatible gateway (e.g., Bifrost/LiteLLM) to a self-hosted vLLM backend, including copy/paste manifests.

Changes:

  • Added an end-to-end guide documenting gateway ↔ vLLM configuration and troubleshooting.
  • Added an example Kubernetes manifest for a gateway-backed ModelConfig and an Agent.
  • Added a user-guide index README linking to the new guide.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
examples/modelconfig-openai-gateway-vllm.yaml Provides a ready-to-apply Secret + ModelConfig + Agent example for gateway→vLLM.
docs/user-guide/openai-compatible-gateway-vllm.md Documents setup steps, required vLLM flags for tool calling, and troubleshooting.
docs/user-guide/README.md Adds a user-guide landing page linking to the new gateway→vLLM guide.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docs/user-guide/README.md
Comment on lines +5 to +7
| Guide | Description |
| --- | --- |
| [OpenAI-compatible gateway with vLLM](openai-compatible-gateway-vllm.md) | Wire kagent to Bifrost, LiteLLM, or another OpenAI-compatible gateway fronting self-hosted vLLM |
Comment on lines +79 to +84
| Field | What to set |
| --- | --- |
| `provider` | Always `OpenAI` for OpenAI-compatible gateways |
| `model` | Gateway routing identifier (for example `vllm/Qwen/...`) |
| `openAI.baseUrl` | Gateway OpenAI API base URL |
| `apiKeySecret` / `apiKeySecretKey` | Secret holding the gateway API key |

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked the raw source — both tables use standard single-pipe syntax (| Guide | Description |); there's no || in the files. This looks like a diff-rendering artifact, and the tables render correctly on GitHub. No change needed.

Comment on lines +20 to +21
stringData:
key: sk-gateway-example # replace with your gateway API key
Comment on lines +60 to +61
stringData:
key: sk-gateway-example # replace with your gateway key, or any placeholder if auth is disabled

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — replaced sk-gateway-example with REPLACE_WITH_GATEWAY_API_KEY in both the example manifest and the guide to avoid secret-scanner false positives. Fixed in the latest commit.

Signed-off-by: AmitChaubey <amit.katyayana@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants