Model Routing

What model routing does

Every task that runs through AI Partner is classified into a task type, and the Model Router sends it to the most appropriate LLM for that type. Expensive frontier models handle complex reasoning; fast, cheap models handle classification; vision-capable models handle screenshots.

This happens automatically — you don't need to specify a model for each request.

Task types and default routing

Task type	Default model tier	Example tasks
`reasoning`	Strong / frontier	Complex multi-step goals, analysis, strategy
`coding`	Code-specialized	Python scripts, Node.js, debugging
`classification`	Fast / cheap	Intent detection, routing, short decisions
`computer_use`	Vision-capable	Browser screenshots, UI navigation
`search_grounded`	Perplexity (if configured)	Web-grounded research with citations
`summarization`	Mid-tier	Condensing long documents
`embedding`	Embedding model	Vector search, RAG

The Model Routing Dashboard

Go to sidebar → Model Routing to see and edit the routing configuration:

┌─────────────────────────────────────────────────────────────────┐
│ Model Routing Dashboard                                          │
├────────────────┬─────────────────────────┬──────────────────────┤
│ Task type      │ Current model           │ Fallback             │
├────────────────┼─────────────────────────┼──────────────────────┤
│ reasoning      │ claude-sonnet-4-5       │ gpt-4o               │
│ coding         │ deepseek-coder-v2       │ claude-3-5-haiku     │
│ classification │ llama-3.3-70b (Groq)    │ claude-3-5-haiku     │
│ computer_use   │ gpt-4o                  │ gemini-2.0-flash     │
│ search_grounded│ sonar-pro (Perplexity)  │ claude-sonnet-4-5    │
│ summarization  │ claude-3-5-haiku        │ llama-3.3-70b        │
│ embedding      │ text-embedding-3-small  │ TF-IDF               │
└────────────────┴─────────────────────────┴──────────────────────┘

Click any row to change the model assigned to that task type.
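As a rough mental model, the dashboard can be thought of as a small table keyed by task type. The sketch below is illustrative only: the `Route` dataclass, the `ROUTES` dictionary, and `set_model` are hypothetical names, not AI Partner's actual storage format or API. The model names are copied from the example dashboard above.

```python
from dataclasses import dataclass


@dataclass
class Route:
    model: str     # "Current model" column
    fallback: str  # "Fallback" column


# Values mirror the example dashboard; the structure itself is an assumption.
ROUTES: dict[str, Route] = {
    "reasoning": Route("claude-sonnet-4-5", "gpt-4o"),
    "coding": Route("deepseek-coder-v2", "claude-3-5-haiku"),
    "classification": Route("llama-3.3-70b", "claude-3-5-haiku"),
    "computer_use": Route("gpt-4o", "gemini-2.0-flash"),
    "search_grounded": Route("sonar-pro", "claude-sonnet-4-5"),
    "summarization": Route("claude-3-5-haiku", "llama-3.3-70b"),
    "embedding": Route("text-embedding-3-small", "TF-IDF"),
}


def set_model(task_type: str, model: str) -> None:
    """Mimic clicking a row and assigning a different model to that task type."""
    if task_type not in ROUTES:
        raise KeyError(f"unknown task type: {task_type}")
    ROUTES[task_type].model = model
```

For example, `set_model("reasoning", "claude-opus-4-7")` would change only the reasoning row and leave its fallback untouched.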

How task classification works

Before routing, the agent classifies the incoming task. For a goal like:

"Write a Python script to fetch NIFTY 50 data and plot it as a chart"

The classifier sees keywords: "Python", "script", "fetch", "plot" → classifies as coding → routes to the coding model.

For:

"Should I raise our Series A at $40M or $50M pre-money given current market conditions?"

Keywords: "Should I", "market conditions" (a strategy question) → classifies as reasoning → routes to the strongest model.

Classification itself runs on the classification model (fast and cheap), so it typically adds less than 100 ms of overhead.
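To make the keyword intuition concrete, here is a minimal sketch of a keyword-scoring classifier. It is a stand-in, not the real thing: per the text above, AI Partner classifies with an LLM, whereas this toy version only counts keyword hits. The `KEYWORDS` lists and the `classify` function are hypothetical and were chosen to reproduce the two examples.

```python
import re

# Hypothetical keyword lists; a real classifier would be an LLM call.
KEYWORDS = {
    "coding": ["python", "script", "fetch", "plot", "debug"],
    "reasoning": ["should i", "strategy", "market conditions", "analysis"],
    "summarization": ["summarize", "condense", "tl;dr"],
}


def classify(goal: str, default: str = "reasoning") -> str:
    """Return the task type whose keywords match the goal most often."""
    text = goal.lower()
    scores = {
        task: sum(
            1 for kw in kws
            if re.search(r"\b" + re.escape(kw) + r"\b", text)
        )
        for task, kws in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to the default type when nothing matches at all.
    return best if scores[best] > 0 else default


print(classify("Write a Python script to fetch NIFTY 50 data and plot it as a chart"))
print(classify("Should I raise our Series A at $40M or $50M pre-money given current market conditions?"))
```

The first call prints `coding` and the second prints `reasoning`. Defaulting to `reasoning` when nothing matches is a design choice of this sketch: it sends ambiguous goals to a stronger model rather than a cheaper one.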

Overriding the router

For a single message, specify the model directly:

Using claude-opus-4-7: write me a comprehensive analysis of...

Or use the @model syntax (if configured):

@gpt-4o What do you think about...

Globally, change the model for a task type in the Model Routing Dashboard. The new routing takes effect immediately.
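The two per-message override forms could be detected with a small prefix parser along these lines. This is a sketch under assumptions: the exact grammar AI Partner accepts is not documented here, and `parse_override`, the regular expressions, and the `KNOWN_MODELS` set are hypothetical. Checking against a list of known models keeps ordinary sentences such as "Using Python: ..." from being mistaken for an override.

```python
import re

# Assumed patterns for "Using <model>: ..." and "@<model> ...".
USING_RE = re.compile(
    r"^\s*Using\s+(?P<model>[\w.\-]+)\s*:\s*(?P<rest>.*)$",
    re.IGNORECASE | re.DOTALL,
)
AT_RE = re.compile(r"^\s*@(?P<model>[\w.\-]+)\s+(?P<rest>.*)$", re.DOTALL)

# Illustrative subset of the model names listed in this document.
KNOWN_MODELS = {"claude-opus-4-7", "claude-sonnet-4-5", "gpt-4o", "gemini-2.0-flash"}


def parse_override(message: str, known_models=KNOWN_MODELS):
    """Return (model, remaining_text); model is None when no override is present."""
    for pattern in (USING_RE, AT_RE):
        match = pattern.match(message)
        if match and match.group("model") in known_models:
            return match.group("model"), match.group("rest")
    return None, message


print(parse_override("@gpt-4o What do you think about..."))
```

When `parse_override` returns `None` as the model, the message would go through normal classification and routing.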

Available providers by tier

Strong / reasoning tier

claude-sonnet-4-5 (Anthropic)
claude-opus-4-7 (Anthropic, most capable)
gpt-4o (OpenAI)
gemini-2.0-flash (Google)
minimax-m2.7 (200k context)

Coding tier

deepseek-coder-v2 (DeepSeek)
codestral (Mistral)
claude-3-5-haiku-20241022 (Anthropic, fast+capable)

Fast / cheap tier

llama-3.3-70b-versatile (Groq, ~100k t/s)
llama-3.1-8b-instant (Groq, fastest)
llama-3.3-70b (Cerebras, ~100k t/s)
claude-3-5-haiku-20241022 (Anthropic)

Vision / computer use tier

gpt-4o (OpenAI)
gemini-2.0-flash (Google)
claude-opus-4-7 (Anthropic)

Search-grounded tier

sonar-pro (Perplexity, citations built-in)
sonar (Perplexity, faster/cheaper)

Fallbacks

Each task type has a fallback that is used when the primary model is unavailable (API error, rate limit, or missing API key). When several fallbacks are configured, they are tried in order until one responds.

If all models for a task type fail, the goal escalates to you with an error notification.
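The fallback behaviour described above amounts to a try-in-order loop. The following is a minimal sketch, not AI Partner's implementation: `call_with_fallbacks`, `AllModelsFailed`, and the `call_model` callback are hypothetical names, and the caller is assumed to supply the function that actually talks to a provider.

```python
from typing import Callable, Sequence


class AllModelsFailed(Exception):
    """Raised when the primary and every fallback fail; the caller would notify the user."""


def call_with_fallbacks(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order and return (model_used, response) from the first success."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # API error, rate limit, missing key, ...
            errors.append(f"{model}: {exc}")
    # Nothing answered: surface every failure so the user sees what went wrong.
    raise AllModelsFailed("; ".join(errors))
```

Here `models` would be the primary followed by its fallbacks, for example `["claude-sonnet-4-5", "gpt-4o"]` for the reasoning row. Collecting each error message before raising makes the final notification more useful than a bare "all models failed".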

Cost impact of routing

Routing to cheaper models for classification and summarization saves significant cost. Example:

	Without routing	With routing
Calls	All 142 calls → claude-sonnet-4-5	28 complex calls → claude-sonnet-4-5; 89 classification calls → Groq llama (~free); 25 summary calls → claude-3-5-haiku
Cost	~$2.40	~$0.52

~78% cost reduction by using the right model for each task type.
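The percentage can be checked directly from the figures in the table above. This short calculation uses only those totals; it assumes no per-token prices, and the variable names are illustrative.

```python
# Call counts and approximate costs taken from the example above.
calls = {"complex": 28, "classification": 89, "summary": 25}
total_calls = sum(calls.values())

cost_without = 2.40  # all calls on claude-sonnet-4-5
cost_with = 0.52     # calls split across three models

reduction = (cost_without - cost_with) / cost_without
print(f"{total_calls} calls, {reduction:.0%} cheaper")  # 142 calls, 78% cheaper
```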
