How It Works
From goal to result
When you type a goal, AI Partner doesn't generate a single response and stop. It runs a loop:
You type a goal
│
▼
Decompose
(break into sub-tasks with dependencies)
│
▼
┌─────────────────────────────────────────┐
│ ReAct Loop │
│ │
│ 1. Reason — "What should I do next?" │
│ 2. Act — call a tool │
│ 3. Assess — did it work? │
│ 4. Replan — update the plan │
│ │
│ (repeats up to 40 times per goal) │
└─────────────────────────────────────────┘
│
▼
Validate
(check success criteria — file exists? message sent? data correct?)
│
▼
Deliver
(file in Files panel, message sent, summary in chat)
This loop (Reason, Act, Assess, Replan) is called a ReAct loop, and it is what separates AI Partner from a chatbot. A chatbot generates one response and stops. AI Partner keeps looping until the goal's success criteria are met or the 40-iteration limit is reached.
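The four steps can be sketched in a few lines of Python. This is a minimal illustration, not AI Partner's actual executor: `LoopState`, `run_react_loop`, the `act` and `is_done` callbacks, and the "fix:" repair step are all hypothetical names, and "Reason" is reduced to taking the next planned step where the real agent would consult an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_ITERATIONS = 40  # the per-goal cap stated in the diagram above


@dataclass
class LoopState:
    plan: list[str]  # remaining steps, in order
    history: list[tuple[str, bool]] = field(default_factory=list)


def run_react_loop(
    state: LoopState,
    act: Callable[[str], bool],            # hypothetical: runs one step, returns success
    is_done: Callable[[LoopState], bool],  # hypothetical: checks the success criteria
    max_iterations: int = MAX_ITERATIONS,
) -> bool:
    """Return True if the goal completed within the iteration budget."""
    for _ in range(max_iterations):
        if is_done(state):
            return True
        if not state.plan:
            return False                   # nothing left to try
        step = state.plan[0]               # Reason: choose the next step
        ok = act(step)                     # Act: call a tool
        state.history.append((step, ok))   # Assess: record whether it worked
        if ok:
            state.plan.pop(0)              # Replan: move past the finished step
        else:
            state.plan.insert(0, f"fix: {step}")  # Replan: try a repair step first
    return is_done(state)
```

The iteration cap is what keeps a failing goal from looping forever: the function returns `False` once the budget is spent, which is the point where a real executor would escalate.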
Three execution modes
1. Chat mode
For quick questions and short tasks. The agent reasons and responds in a single turn, streaming text back to you like a standard LLM. Limited to read-only tools (no file writes or external actions).
You: What's the current price of RELIANCE.NS?
Agent: ₹2,847.50 as of 11:32 AM IST (source: NSE)
2. Goal mode
For multi-step tasks. Activates the full ReAct loop with task decomposition, self-correction, parallel sub-agents, and success validation. Use the rocket icon or prefix your message with a goal-like phrase.
You: Research the top 5 cloud providers by market share, write a comparison table,
and generate a PowerPoint deck I can use in Monday's board meeting.
The agent runs for 30–120 seconds (depending on complexity), then delivers the .pptx file.
3. Meeting proxy mode
Triggered by a meeting URL. Bypasses the goal executor entirely and starts the container-based meeting attendance pipeline directly.
You: Join https://teams.live.com/meet/... as "Alex", focus on our Q2 roadmap discussion.
The agent boots a container with a virtual desktop, navigates to the meeting, joins, listens, and responds.
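As a rough illustration of how a message might be dispatched to one of the three modes, the sketch below checks for a meeting URL first and otherwise falls back to goal or chat mode. Everything here is an assumption made for the example: the `pick_mode` function, the `rocket_icon` flag, and the host list (which contains only the host from the example above). Detection of "goal-like phrases" is left out because the documentation does not describe how it works.

```python
import re
from urllib.parse import urlparse

# Hypothetical host list: only the host from the example above is included.
MEETING_HOSTS = {"teams.live.com"}
URL_PATTERN = re.compile(r"https?://\S+")


def pick_mode(message: str, rocket_icon: bool = False) -> str:
    """Return 'meeting_proxy', 'goal', or 'chat' for an incoming message."""
    for url in URL_PATTERN.findall(message):
        if urlparse(url).hostname in MEETING_HOSTS:
            return "meeting_proxy"  # a meeting URL bypasses the goal executor
    return "goal" if rocket_icon else "chat"
```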
How goals are decomposed
When a goal arrives, the engine breaks it into typed sub-tasks first:
| Sub-task type | Example |
|---|---|
| research | "Find the top 5 cloud providers by 2025 market share" |
| compute | "Calculate year-over-year growth from these numbers" |
| generate_file | "Create a PowerPoint with the comparison table" |
| send_message | "Send the file to my Telegram" |
| validate | "Confirm the file was created and has at least 5 slides" |
Sub-tasks have dependencies — generate_file won't start until research is done. Independent sub-tasks can run in parallel (up to 5 concurrent agents).
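One way to picture dependency-aware scheduling is to group sub-tasks into batches in which every task's prerequisites are already finished. The sketch below uses Python's standard `graphlib.TopologicalSorter` for this. The function name `schedule_waves` and the task names are hypothetical, and splitting ready tasks into fixed batches of five is a simplification: a real scheduler would more likely start a new task as soon as a slot frees up.

```python
from graphlib import TopologicalSorter

MAX_PARALLEL = 5  # concurrency limit stated above


def schedule_waves(
    deps: dict[str, set[str]], max_parallel: int = MAX_PARALLEL
) -> list[list[str]]:
    """Group tasks into batches whose dependencies are already satisfied.

    `deps` maps each task name to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tasks that can start now
        for i in range(0, len(ready), max_parallel):
            waves.append(ready[i : i + max_parallel])
        sorter.done(*ready)  # mark the batch finished, unlocking dependents
    return waves
```

For the cloud-provider example, two independent research tasks would land in the first batch, followed by `compute`, `generate_file`, and `validate` one after another.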
The tool system (MCP)
AI Partner uses the Model Context Protocol (MCP) — an open standard for connecting LLMs to tools. Every capability is a tool registered on an MCP server.
Tools are grouped into 41 MCP servers. The table below shows a representative selection:
| Category | Servers | Example tools |
|---|---|---|
| Web | WebSearch, Browser | web_search, browser_navigate, browser_extract |
| Execution | Shell, Python Sandbox, Node.js Sandbox | execute_command, execute_python, execute_nodejs |
| Files | Shell | read_file, write_file, list_directory |
| Documents | DocumentGen | generate_powerpoint, generate_excel, generate_word, generate_pdf |
| Communication | Gmail, Slack, Messaging, Phone | send_email, slack_send, telegram_send, place_call |
| DevOps | GitHub, Jira, Confluence, Sentry | create_issue, search_repos, list_issues |
| CRM / Finance | HubSpot, Stripe | hubspot_search_contacts, stripe_list_customers |
| Cloud | AWS S3, Outlook | s3_upload, outlook_send_email |
| Knowledge | KnowledgeBase, arXiv, RSS | knowledge_search, search_papers, fetch_feed |
| Media | ImageGen, VideoGen, Transcribe | image_generate, video_generate, transcribe_audio |
| AI utility | Memory, AgentDelegation | memory_search, delegate_task, delegate_parallel |
The agent sees all available tools at the start of each reasoning step and picks the right one. You don't need to tell it which tool to use.
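Under the hood, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`, carrying the tool name and its arguments. The sketch below builds such a request for illustration. The helper `build_tool_call` is hypothetical, and the `query` argument for `web_search` is an assumption, since the tool schemas are not documented here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Assumed argument name; the real web_search schema may differ.
message = build_tool_call("web_search", {"query": "NVIDIA Q1 2026 earnings"})
```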
How the agent picks the right LLM
AI Partner supports 18+ LLM providers and uses a model router to assign each task to the best-suited model automatically:
| Task type | Model tier | Example models and providers |
|---|---|---|
| Complex reasoning | Strong / frontier | Claude 4, GPT-4o, Gemini 2.0 |
| Code generation | Coding-specialized | DeepSeek Coder, Claude 3.5 |
| Classification | Fast / cheap | Groq Llama-3.3-70b, Cerebras |
| Vision / computer use | Multimodal | GPT-4o, Gemini 2.0 Flash |
| Search-grounded | Perplexity | sonar-pro (citations built-in) |
You can override routing manually in Settings → Model Routing or by prefixing your message with @model-name.
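The routing rule, including the `@model-name` override, could look roughly like this. It is a sketch under assumptions: the task-type keys, tier labels, the fallback to the frontier tier, and the `route` function are invented for the example and do not reflect the router's real configuration.

```python
import re

# Hypothetical task-type keys and tier labels, loosely mirroring the table above.
TIER_BY_TASK = {
    "complex_reasoning": "frontier",
    "code_generation": "coding",
    "classification": "fast",
    "vision": "multimodal",
    "search_grounded": "search",
}

# Matches a leading "@model-name " prefix and captures the rest of the message.
OVERRIDE = re.compile(r"^@([\w.\-]+)\s+(.*)$", re.DOTALL)


def route(message: str, task_type: str) -> tuple[str, str]:
    """Return (model or tier, message without the override prefix)."""
    match = OVERRIDE.match(message)
    if match:
        return match.group(1), match.group(2)  # a manual override wins
    # Assumed fallback: unknown task types go to the frontier tier.
    return TIER_BY_TASK.get(task_type, "frontier"), message
```

Checking the override before the lookup means a user's explicit choice always takes precedence over automatic routing.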
Self-correction and stuck detection
The executor monitors its own progress:
- Self-correction: if a generated script fails, the agent diagnoses the error semantically and rewrites the script — up to 3 retries before escalating to you.
- Stuck detection: if the agent calls the same tool with the same arguments 3 times in a row without progress, it force-breaks the loop and replans.
- Timeout: goals have a wall-clock timeout (configurable; default 10 minutes). If the goal hasn't completed by then, the agent delivers whatever it has and escalates.
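Stuck detection can be illustrated with a small sliding window over recent tool calls. This sketch only checks for identical consecutive calls and does not model the "without progress" condition, which would need a definition of progress that the documentation does not give. The class name `StuckDetector` and its interface are hypothetical.

```python
import json
from collections import deque

STUCK_THRESHOLD = 3  # identical calls in a row, as described above


class StuckDetector:
    """Flags when the same tool is called with the same arguments repeatedly."""

    def __init__(self, threshold: int = STUCK_THRESHOLD) -> None:
        self._threshold = threshold
        self._recent: deque = deque(maxlen=threshold)  # keeps only the last N calls

    def record(self, tool: str, arguments: dict) -> bool:
        """Record a call; return True if the last `threshold` calls are identical."""
        # sort_keys makes the comparison independent of argument order
        key = (tool, json.dumps(arguments, sort_keys=True))
        self._recent.append(key)
        window_full = len(self._recent) == self._threshold
        return window_full and len(set(self._recent)) == 1
```

When `record` returns `True`, the executor would break out of the loop and replan instead of issuing the same call a fourth time.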
When the agent asks for your help (HITL)
The agent can't handle some situations alone. In these seven cases it pauses and notifies you:
| Situation | Example |
|---|---|
| Permission needed | "Should I delete these 50 files?" |
| Credential required | OAuth flow or 2FA code |
| Goal is ambiguous | The goal has two plausible interpretations |
| Budget threshold | Estimated API cost exceeds your limit |
| Destructive action | Writing to a production database |
| CAPTCHA / auth wall | Browser is blocked; agent pauses and shows you the screen |
| Quality gate | Output doesn't meet the confidence threshold |
HITL notifications arrive via:
- The web UI — an approval card appears in the active chat
- Telegram — if configured, a message with approve/reject buttons
- Browser CAPTCHA — a live screenshot streamed to the UI with a "Take Control" button
After you respond, the agent resumes from exactly where it paused.
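Pausing and resuming "from exactly where it paused" maps naturally onto Python generators, which suspend at a `yield` and continue from the same point when a value is sent back in. The sketch below uses that mechanism purely as an illustration. The `Escalation` enum values, `delete_files`, and `run_with_approval` are hypothetical names, and the real system presumably persists paused state rather than holding it in memory.

```python
from enum import Enum
from typing import Generator


class Escalation(Enum):
    # One member per row of the escalation table above (values are illustrative).
    PERMISSION = "permission_needed"
    CREDENTIAL = "credential_required"
    AMBIGUOUS_GOAL = "goal_is_ambiguous"
    BUDGET = "budget_threshold"
    DESTRUCTIVE = "destructive_action"
    CAPTCHA = "captcha_or_auth_wall"
    QUALITY_GATE = "quality_gate"


def delete_files(paths: list[str]) -> Generator[tuple[Escalation, str], bool, list[str]]:
    """Hypothetical task that pauses for approval before a destructive step."""
    approved = yield (Escalation.PERMISSION, f"Should I delete these {len(paths)} files?")
    if not approved:
        return []
    return list(paths)  # stand-in for the real deletion


def run_with_approval(task, answer: bool):
    """Run a task to its pause point, then resume it with the user's answer."""
    request = next(task)           # runs until the task pauses
    try:
        task.send(answer)          # resumes exactly at the yield
    except StopIteration as finished:
        return request, finished.value
    raise RuntimeError("task paused a second time")
```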
The full pipeline end-to-end
Here's what actually happens when you type "Research NVIDIA's Q1 2026 earnings and make me a slide deck":
1. The router receives the message
2. It's classified as a "goal" (not simple chat)
3. The goal engine starts
4. The goal is broken into sub-tasks:
- [research] Find NVIDIA Q1 2026 earnings data
- [research] Find analyst commentary and reactions
- [compute] Summarize key metrics
- [generate_file] Create PowerPoint with charts
- [validate] Confirm file exists with correct slide count
5. SkillLearner checks: any matching skill in the library? → no match
6. ReAct loop begins:
Iteration 1: reason → web_search("NVIDIA Q1 2026 earnings") → got results
Iteration 2: reason → web_fetch(earnings page URL) → extracted numbers
Iteration 3: reason → web_search("NVIDIA Q1 2026 analyst reactions") → got results
Iteration 4: reason → execute_python(summarize script) → got structured data
Iteration 5: reason → generate_powerpoint(data) → created file
Iteration 6: reason → validate_file(path) → file exists, 8 slides → done
7. GoalValidator confirms: success criteria met
8. SkillLearner saves the script as a reusable skill
9. File appears in Files panel; chat shows download link
10. If Telegram configured: file sent to your Telegram
Total time: approximately 45–90 seconds.
Next time you ask for an earnings slide deck, step 5 finds the saved skill and the loop runs the stored script directly — skipping generation entirely, completing in under 10 seconds.
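The skill lookup in step 5 can be pictured as a small library keyed by what each saved script is good for. The documentation does not say how SkillLearner matches goals to skills, so the keyword-subset matching below is only an assumption chosen to keep the example short. The `SkillLibrary` class and the script name are hypothetical.

```python
from typing import Optional


class SkillLibrary:
    """Toy skill store: a skill matches when all its keywords appear in the goal."""

    def __init__(self) -> None:
        self._skills: dict[frozenset, str] = {}

    def save(self, keywords: list[str], script: str) -> None:
        """Store a script under a set of lowercase keywords."""
        self._skills[frozenset(k.lower() for k in keywords)] = script

    def find(self, goal: str) -> Optional[str]:
        """Return the first saved script whose keywords all occur in the goal."""
        words = set(goal.lower().split())
        for keywords, script in self._skills.items():
            if keywords <= words:  # subset test: every keyword is present
                return script
        return None


library = SkillLibrary()
library.save(["earnings", "deck"], "make_earnings_deck.py")
```

A hit lets the executor run the stored script directly, while a miss (`None`) sends the goal through the full ReAct loop.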