Model Routing

What model routing does

Every task that runs through AI Partner is classified into a task type, and the Model Router sends it to the most appropriate LLM for that type. Expensive frontier models handle complex reasoning; fast, cheap models handle classification; vision-capable models handle screenshots.

This happens automatically — you don't need to specify a model for each request.


Task types and default routing

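┌─────────────────┬────────────────────────────┬──────────────────────────────────────────────┐
│ Task type       │ Default model tier         │ Example tasks                                │
├─────────────────┼────────────────────────────┼──────────────────────────────────────────────┤
│ reasoning       │ Strong / frontier          │ Complex multi-step goals, analysis, strategy │
│ coding          │ Code-specialized           │ Python scripts, Node.js, debugging           │
│ classification  │ Fast / cheap               │ Intent detection, routing, short decisions   │
│ computer_use    │ Vision-capable             │ Browser screenshots, UI navigation           │
│ search_grounded │ Perplexity (if configured) │ Web-grounded research with citations         │
│ summarization   │ Mid-tier                   │ Condensing long documents                    │
│ embedding       │ Embedding model            │ Vector search, RAG                           │
└─────────────────┴────────────────────────────┴──────────────────────────────────────────────┘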

The Model Routing Dashboard

Go to sidebar → Model Routing to see and edit the routing configuration:

┌────────────────────────────────────────────────────────────────┐
│ Model Routing Dashboard                                        │
├─────────────────┬─────────────────────────┬────────────────────┤
│ Task type       │ Current model           │ Fallback           │
├─────────────────┼─────────────────────────┼────────────────────┤
│ reasoning       │ claude-sonnet-4-5       │ gpt-4o             │
│ coding          │ deepseek-coder-v2       │ claude-3-5-haiku   │
│ classification  │ llama-3.3-70b (Groq)    │ claude-3-5-haiku   │
│ computer_use    │ gpt-4o                  │ gemini-2.0-flash   │
│ search_grounded │ sonar-pro (Perplexity)  │ claude-sonnet-4-5  │
│ summarization   │ claude-3-5-haiku        │ llama-3.3-70b      │
│ embedding       │ text-embedding-3-small  │ TF-IDF             │
└─────────────────┴─────────────────────────┴────────────────────┘

Click any row to change the model assigned to that task type.
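
As a rough mental model, the dashboard can be thought of as a lookup table from task type to a primary and a fallback model. The sketch below is illustrative only: the `Route` class, the `ROUTES` table, and the `set_model` helper are hypothetical names, not AI Partner's actual configuration format. The model names are copied from the dashboard above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    primary: str
    fallback: str


# Values copied from the dashboard above; the structure itself is an assumption.
ROUTES = {
    "reasoning": Route("claude-sonnet-4-5", "gpt-4o"),
    "coding": Route("deepseek-coder-v2", "claude-3-5-haiku"),
    "classification": Route("llama-3.3-70b", "claude-3-5-haiku"),
    "computer_use": Route("gpt-4o", "gemini-2.0-flash"),
    "search_grounded": Route("sonar-pro", "claude-sonnet-4-5"),
    "summarization": Route("claude-3-5-haiku", "llama-3.3-70b"),
    "embedding": Route("text-embedding-3-small", "TF-IDF"),
}


def set_model(routes, task_type, model):
    """Return a copy of the table with the primary model for task_type replaced."""
    if task_type not in routes:
        raise KeyError(f"unknown task type: {task_type}")
    updated = dict(routes)
    updated[task_type] = Route(model, routes[task_type].fallback)
    return updated
```

Clicking a row and choosing a new model would then correspond to something like `set_model(ROUTES, "reasoning", "claude-opus-4-7")`.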


How task classification works

Before routing, the agent classifies the incoming task. For a goal like:

"Write a Python script to fetch NIFTY 50 data and plot it as a chart"

The classifier sees keywords: "Python", "script", "fetch", "plot" → classifies as coding → routes to the coding model.

For:

"Should I raise our Series A at $40M or $50M pre-money given current market conditions?"

Keywords: "Should I", "market conditions", plus the strategic framing of the question → classifies as reasoning → routes to the strongest model.

Classification itself runs on the classification model (fast and cheap), so the added overhead is typically under 100 ms.
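
The keyword intuition from the two examples above can be sketched as a simple cue counter. This is a deliberately simplified stand-in: the real classifier is an LLM call, and the `CUES` lists, the `classify` function, and the choice of `reasoning` as the default are all assumptions made for illustration.

```python
# Hypothetical cue lists, loosely based on the two example goals above.
CUES = {
    "coding": ("python", "script", "fetch", "plot"),
    "reasoning": ("should i", "market conditions"),
}


def classify(goal, default="reasoning"):
    """Pick the task type whose cues appear most often in the goal text."""
    text = goal.lower()
    scores = {task: sum(cue in text for cue in cues) for task, cues in CUES.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default when no cue matched at all.
    return best if scores[best] > 0 else default
```

With these cue lists, the NIFTY 50 goal scores four coding cues and no reasoning cues, while the Series A question matches only the reasoning cues.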


Overriding the router

For a single message, specify the model directly:

Using claude-opus-4-7: write me a comprehensive analysis of...

Or use the @model syntax (if configured):

@gpt-4o What do you think about...

To change routing globally, pick a different model for the task type in the Model Routing Dashboard. The new routing takes effect immediately.
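
For readers curious how the per-message override forms might be recognised, here is a minimal sketch. The regular expressions, the `split_override` function, and the `known_models` argument are hypothetical; the actual parsing rules in AI Partner are not documented here.

```python
import re

# Hypothetical patterns for the two override forms shown above:
#   "Using <model>: <message>"  and  "@<model> <message>"
USING_RE = re.compile(r"^using\s+([\w.\-]+):\s*(.*)$", re.IGNORECASE | re.DOTALL)
AT_RE = re.compile(r"^@([\w.\-]+)\s+(.*)$", re.DOTALL)


def split_override(message, known_models):
    """Return (model, text); model is None when no valid override is present."""
    stripped = message.strip()
    for pattern in (USING_RE, AT_RE):
        match = pattern.match(stripped)
        # Only accept recognised names, so "Using Python: ..." is not
        # mistaken for a model override.
        if match and match.group(1) in known_models:
            return match.group(1), match.group(2)
    return None, stripped
```

When no override is found, the message would go through the normal classification and routing path.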


Available providers by tier

Strong / reasoning tier

  • claude-sonnet-4-5 (Anthropic)
  • claude-opus-4-7 (Anthropic, most capable)
  • gpt-4o (OpenAI)
  • gemini-2.0-flash (Google)
  • minimax-m2.7 (200k context)

Coding tier

  • deepseek-coder-v2 (DeepSeek)
  • codestral (Mistral)
  • claude-3-5-haiku-20241022 (Anthropic, fast+capable)

Fast / cheap tier

  • llama-3.3-70b-versatile (Groq, ~100k t/s)
  • llama-3.1-8b-instant (Groq, fastest)
  • llama-3.3-70b (Cerebras, ~100k t/s)
  • claude-3-5-haiku-20241022 (Anthropic)

Vision / computer use tier

  • gpt-4o (OpenAI)
  • gemini-2.0-flash (Google)
  • claude-opus-4-7 (Anthropic)

Search-grounded tier

  • sonar-pro (Perplexity, citations built-in)
  • sonar (Perplexity, faster/cheaper)

Fallbacks

Each task type has a fallback model that is used when the primary is unavailable (API error, rate limit, or missing API key). Models are tried in order, primary first, until one responds.

If all models for a task type fail, the goal escalates to you with an error notification.
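
The try-in-order behaviour described above can be sketched as a short loop. This is a sketch under stated assumptions: `call_with_fallback`, `AllModelsFailed`, and the `call_model` callback are hypothetical names standing in for whatever client code actually talks to each provider.

```python
class AllModelsFailed(Exception):
    """Raised when the primary and every fallback fail."""


def call_with_fallback(models, prompt, call_model):
    """Try each model in order and return (model, response) from the first success.

    models: model names, primary first, then fallbacks.
    call_model: hypothetical callable (model, prompt) -> response that raises
    on an API error, rate limit, or missing key.
    """
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # any failure moves on to the next model
            errors.append((model, exc))
    # Nothing responded: the caller can surface this as an error notification.
    raise AllModelsFailed(errors)
```

Catching `AllModelsFailed` at the call site is where the escalation to the user would happen.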


Cost impact of routing

Sending classification and summarization calls to cheaper models cuts cost substantially. Example:

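┌───────────────────────────────────┬──────────────────────────────────────────────┐
│ Without routing                   │ With routing                                 │
├───────────────────────────────────┼──────────────────────────────────────────────┤
│ All 142 calls → claude-sonnet-4-5 │ 28 complex calls → claude-sonnet-4-5         │
│                                   │ 89 classification calls → Groq llama (~free) │
│                                   │ 25 summary calls → claude-3-5-haiku          │
│ Cost: ~$2.40                      │ Cost: ~$0.52                                 │
└───────────────────────────────────┴──────────────────────────────────────────────┘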

~78% cost reduction by using the right model for each task type.
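
The arithmetic behind that figure can be checked in a few lines. The call counts and the two cost totals come from the example above; per-call prices are not given in this document, so the sketch works only with the totals.

```python
# Call counts and total costs taken from the example above.
calls_by_type = {"complex": 28, "classification": 89, "summary": 25}
total_calls = sum(calls_by_type.values())  # matches the 142 calls in the example

cost_without_routing = 2.40  # USD, approximate
cost_with_routing = 0.52     # USD, approximate

reduction = (cost_without_routing - cost_with_routing) / cost_without_routing
print(f"{total_calls} calls, {reduction:.0%} cheaper")  # prints "142 calls, 78% cheaper"
```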