Model Routing
What model routing does
Every task that runs through AI Partner is classified into a task type, and the Model Router sends it to the most appropriate LLM for that type. Expensive frontier models handle complex reasoning; fast, cheap models handle classification; vision-capable models handle screenshots.
This happens automatically — you don't need to specify a model for each request.
Task types and default routing
| Task type | Default model tier | Example tasks |
|---|---|---|
| reasoning | Strong / frontier | Complex multi-step goals, analysis, strategy |
| coding | Code-specialized | Python scripts, Node.js, debugging |
| classification | Fast / cheap | Intent detection, routing, short decisions |
| computer_use | Vision-capable | Browser screenshots, UI navigation |
| search_grounded | Perplexity (if configured) | Web-grounded research with citations |
| summarization | Mid-tier | Condensing long documents |
| embedding | Embedding model | Vector search, RAG |
The Model Routing Dashboard
Go to sidebar → Model Routing to see and edit the routing configuration:
┌─────────────────────────────────────────────────────────────────┐
│ Model Routing Dashboard                                         │
├────────────────┬─────────────────────────┬──────────────────────┤
│ Task type      │ Current model           │ Fallback             │
├────────────────┼─────────────────────────┼──────────────────────┤
│ reasoning      │ claude-sonnet-4-5       │ gpt-4o               │
│ coding         │ deepseek-coder-v2       │ claude-3-5-haiku     │
│ classification │ llama-3.3-70b (Groq)    │ claude-3-5-haiku     │
│ computer_use   │ gpt-4o                  │ gemini-2.0-flash     │
│ search_grounded│ sonar-pro (Perplexity)  │ claude-sonnet-4-5    │
│ summarization  │ claude-3-5-haiku        │ llama-3.3-70b        │
│ embedding      │ text-embedding-3-small  │ TF-IDF               │
└────────────────┴─────────────────────────┴──────────────────────┘
Click any row to change the model assigned to that task type.
How task classification works
Before routing, the agent classifies the incoming task. For a goal like:
"Write a Python script to fetch NIFTY 50 data and plot it as a chart"
The classifier sees keywords: "Python", "script", "fetch", "plot" → classifies as coding → routes to the coding model.
For:
"Should I raise our Series A at $40M or $50M pre-money given current market conditions?"
Cues: "Should I", "market conditions", and the strategic framing of the question → classifies as reasoning → routes to the strongest model.
Classification itself runs on the classification model (fast and cheap), so the added overhead is typically under 100 ms.
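The keyword examples above can be approximated with a small scoring function. This is only an illustrative sketch: the real classifier is an LLM call, and the cue lists, the `classify_task` name, and the default task type are all assumptions made for this example.

```python
# Hypothetical keyword cues per task type. The real classifier is an LLM,
# so this table only illustrates the idea of mapping cues to a task type.
KEYWORD_CUES = {
    "coding": ["python", "script", "debug", "node.js", "plot"],
    "computer_use": ["screenshot", "click", "browser"],
    "summarization": ["summarize", "condense"],
    "reasoning": ["should i", "strategy", "market conditions", "analysis"],
}

# Assumption: goals with no matching cue go to the strongest tier.
DEFAULT_TASK_TYPE = "reasoning"


def classify_task(goal: str) -> str:
    """Return the task type whose cues appear most often in the goal."""
    text = goal.lower()
    scores = {
        task_type: sum(1 for cue in cues if cue in text)
        for task_type, cues in KEYWORD_CUES.items()
    }
    best_type, best_score = max(scores.items(), key=lambda item: item[1])
    return best_type if best_score > 0 else DEFAULT_TASK_TYPE


if __name__ == "__main__":
    print(classify_task("Write a Python script to fetch NIFTY 50 data and plot it as a chart"))
```

Substring matching like this is brittle, which is one reason to hand the decision to a fast model rather than a fixed keyword list.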
Overriding the router
For a single message, specify the model directly:
Using claude-opus-4-7: write me a comprehensive analysis of...
Or use the @model syntax (if configured):
@gpt-4o What do you think about...
Globally, change the model for a task type in the Model Routing Dashboard. The new routing takes effect immediately.
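As a rough sketch of how the two per-message override forms could be recognized, the function below strips a leading `Using <model>:` or `@<model>` prefix. The exact grammar AI Partner accepts is not documented here, so the regular expressions and the `extract_override` name are assumptions.

```python
import re
from typing import Optional, Tuple

# Assumed grammar for the two override forms. A real implementation would
# also check the captured name against the list of configured models.
USING_PREFIX = re.compile(r"^Using\s+(?P<model>[\w.\-]+):\s*(?P<body>.*)$", re.DOTALL)
AT_PREFIX = re.compile(r"^@(?P<model>[\w.\-]+)\s+(?P<body>.*)$", re.DOTALL)


def extract_override(message: str) -> Tuple[Optional[str], str]:
    """Return (model, remaining_message); model is None when no override is present."""
    stripped = message.strip()
    for pattern in (USING_PREFIX, AT_PREFIX):
        match = pattern.match(stripped)
        if match:
            return match.group("model"), match.group("body")
    return None, message


if __name__ == "__main__":
    print(extract_override("Using claude-opus-4-7: write me a comprehensive analysis of..."))
    print(extract_override("@gpt-4o What do you think about..."))
```

When no override is found, the message would go through normal classification and routing.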
Available providers by tier
Strong / reasoning tier
- claude-sonnet-4-5 (Anthropic)
- claude-opus-4-7 (Anthropic, most capable)
- gpt-4o (OpenAI)
- gemini-2.0-flash (Google)
- minimax-m2.7 (200k context)
Coding tier
- deepseek-coder-v2 (DeepSeek)
- codestral (Mistral)
- claude-3-5-haiku-20241022 (Anthropic, fast + capable)
Fast / cheap tier
- llama-3.3-70b-versatile (Groq, ~100k t/s)
- llama-3.1-8b-instant (Groq, fastest)
- llama-3.3-70b (Cerebras, ~100k t/s)
- claude-3-5-haiku-20241022 (Anthropic)
Vision / computer use tier
- gpt-4o (OpenAI)
- gemini-2.0-flash (Google)
- claude-opus-4-7 (Anthropic)
Search-grounded tier
- sonar-pro (Perplexity, citations built-in)
- sonar (Perplexity, faster/cheaper)
Fallbacks
Each task type has a fallback model that is used when the primary is unavailable (API error, rate limit, or missing key). If more than one fallback is configured, they are tried in order until one responds.
If all models for a task type fail, the goal escalates to you with an error notification.
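The fallback behavior can be sketched as a loop over an ordered list of models. This is a minimal illustration, not AI Partner's implementation: `ROUTES`, `run_with_fallback`, `AllModelsFailed`, and the `call_model` callback are hypothetical names, and the two routes shown are copied from the dashboard example.

```python
from typing import Callable, Dict, List

# Primary model first, then its fallback, mirroring two dashboard rows.
ROUTES: Dict[str, List[str]] = {
    "reasoning": ["claude-sonnet-4-5", "gpt-4o"],
    "classification": ["llama-3.3-70b", "claude-3-5-haiku"],
}


class AllModelsFailed(Exception):
    """Every model for the task type failed; the goal would escalate to the user."""


def run_with_fallback(
    task_type: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> str:
    """Try each configured model in order and return the first successful reply."""
    errors = []
    for model in ROUTES[task_type]:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # e.g. API error, rate limit, missing key
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed(f"{task_type} failed on all models: " + "; ".join(errors))
```

Here `call_model(model, prompt)` stands in for whatever client actually talks to the provider; catching `AllModelsFailed` is the point at which an error notification would be raised.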
Cost impact of routing
Routing to cheaper models for classification and summarization saves significant cost. Example:
| Without routing | With routing |
|---|---|
| All 142 calls → claude-sonnet-4-5 | 28 complex calls → claude-sonnet-4-5 |
| | 89 classification calls → Groq llama (~free) |
| | 25 summary calls → claude-3-5-haiku |
| Cost: ~$2.40 | Cost: ~$0.52 |
~78% cost reduction by using the right model for each task type.
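The arithmetic behind that figure can be checked directly from the example's numbers (call counts and the two approximate totals). The variable names are just for illustration.

```python
# Call counts and approximate costs taken from the example above.
calls = {"complex": 28, "classification": 89, "summary": 25}
total_calls = sum(calls.values())

cost_without_routing = 2.40  # all calls on claude-sonnet-4-5
cost_with_routing = 0.52     # calls split across three models

reduction = (cost_without_routing - cost_with_routing) / cost_without_routing
print(f"{total_calls} calls, {reduction:.0%} cheaper")  # 142 calls, 78% cheaper
```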