LLM Providers
KafClaw supports multiple LLM backends through a unified provider layer. Each provider is identified by a canonical ID and accessed via a model string in the format <provider-id>/<model-name>.
Provider Matrix
| Provider ID | Auth Method | Default API Base | Example Model String |
|---|---|---|---|
claude | API key | https://api.anthropic.com/v1 | claude/claude-sonnet-4-5 |
openai | API key | (must be set) | openai/gpt-4o |
gemini | API key | Google AI Studio | gemini/gemini-2.5-pro |
gemini-cli | OAuth (CLI) | (via Gemini CLI) | gemini-cli/gemini-2.5-pro |
openai-codex | OAuth (CLI) | (via Codex CLI) | openai-codex/gpt-5.3-codex |
xai | API key | https://api.x.ai/v1 | xai/grok-3 |
scalytics-copilot | API key + base URL | (must be set) | scalytics-copilot/default |
openrouter | API key | https://openrouter.ai/api/v1 | openrouter/anthropic/claude-sonnet-4-5 |
deepseek | API key | https://api.deepseek.com/v1 | deepseek/deepseek-chat |
groq | API key | https://api.groq.com/openai/v1 | groq/llama-3.3-70b |
vllm | API key (optional) + base URL | (must be set) | vllm/my-model |
Provider Aliases
These shorthand names resolve automatically:
| Alias | Resolves To |
|---|---|
anthropic | claude |
google | gemini-cli (or gemini if an API key is configured) |
codex | openai-codex |
copilot | scalytics-copilot |
grok | xai |
Model String Format
All model references use the format provider-id/model-name:
claude/claude-opus-4-6
openai/gpt-4o
groq/llama-3.3-70b
openrouter/anthropic/claude-sonnet-4-5 # three segments for OpenRouter
A bare model name without a provider prefix (e.g. gpt-4o) falls back to the legacy OpenAI provider path.
Authentication
API Key Providers
Store an API key:
kafclaw models auth set-key --provider claude --key sk-ant-...
kafclaw models auth set-key --provider openai --key sk-...
kafclaw models auth set-key --provider gemini --key AIza...
kafclaw models auth set-key --provider xai --key xai-...
kafclaw models auth set-key --provider openrouter --key sk-or-...
kafclaw models auth set-key --provider deepseek --key sk-...
kafclaw models auth set-key --provider groq --key gsk_...
For providers that need a custom base URL:
kafclaw models auth set-key --provider scalytics-copilot --key <token> --base https://copilot.scalytics.io/v1
kafclaw models auth set-key --provider vllm --key <optional> --base http://localhost:8000/v1
API keys can also be set via config:
kafclaw config set providers.anthropic.apiKey sk-ant-...
kafclaw config set providers.openai.apiKey sk-...
OAuth Providers
Gemini and Codex use CLI-based OAuth:
kafclaw models auth login --provider gemini
kafclaw models auth login --provider openai-codex
This installs the provider CLI if absent, then delegates to its auth flow. Credentials are cached by the respective CLI and read at runtime.
Provider Resolution
When KafClaw needs an LLM provider, it resolves in this order:
- Per-agent model (
agents.list[].model.primary) - highest priority - Task-type routing (
model.taskRouting[category]) - if no per-agent model is set - Global model (
model.name) - default fallback - Legacy OpenAI (
providers.openai) - backward compatibility
Per-Agent Configuration
{
"agents": {
"list": [
{
"id": "main",
"model": {
"primary": "claude/claude-opus-4-6",
"fallbacks": ["openai/gpt-4o", "groq/llama-3.3-70b"]
},
"subagents": {
"model": "groq/llama-3.3-70b"
}
}
]
}
}
Task-Type Routing
Route messages to different models based on content classification:
{
"model": {
"name": "claude/claude-sonnet-4-5",
"taskRouting": {
"security": "claude/claude-opus-4-6",
"coding": "openai-codex/gpt-5.3-codex",
"creative": "openai/gpt-4o",
"tool-heavy": "openai-codex/gpt-5.3-codex"
}
}
}
Task categories are detected automatically from message content: security, coding, tool-heavy, creative.
Fallback Chains
When the primary provider returns a transient error, fallbacks are tried in order:
{
"model": {
"primary": "claude/claude-opus-4-6",
"fallbacks": [
"openai/gpt-4o",
"deepseek/deepseek-chat"
]
}
}
Subagent Model Inheritance
Subagents resolve their model in this order:
agents.list[parentID].subagents.modeltools.subagents.model(global subagent default)- Inherit parent agent’s resolved model
Rate Limits
Rate limit data is extracted from provider response headers (both OpenAI-style x-ratelimit-* and Anthropic-style anthropic-ratelimit-* headers) and cached in memory per provider.
View current rate limit snapshots:
kafclaw models stats
kafclaw status
kafclaw doctor warns when any provider’s remaining tokens drop below 10% of its limit.
Verifying Setup
# List all configured providers
kafclaw models list
# Check provider health
kafclaw doctor
# View today's usage per provider
kafclaw models stats
# Multi-day trend
kafclaw models stats --days 7