
How It Works

From goal to result

When you type a goal, AI Partner doesn't generate a single response and stop. It runs a loop:

You type a goal
        ↓
Decompose
(break into sub-tasks with dependencies)
        ↓
┌─────────────────────────────────────────┐
│  ReAct Loop                             │
│                                         │
│  1. Reason: "What should I do next?"    │
│  2. Act: call a tool                    │
│  3. Assess: did it work?                │
│  4. Replan: update the plan             │
│                                         │
│  (repeats up to 40 times per goal)      │
└─────────────────────────────────────────┘
        ↓
Validate
(check success criteria: file exists? message sent? data correct?)
        ↓
Deliver
(file in Files panel, message sent, summary in chat)

This loop (Reason, Act, Assess, Replan) is called a ReAct loop. It's what separates AI Partner from a chatbot: a chatbot generates one response, while AI Partner keeps looping until the success criteria are met or a limit such as the 40-iteration cap is reached.
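The loop can be sketched in a few lines of Python. This is a minimal illustration, not AI Partner's actual executor: `LoopState`, `run_loop`, `act`, and `is_done` are hypothetical names, and replanning is reduced to "advance on success, retry the same step on failure". Only the 40-iteration cap comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_ITERATIONS = 40  # per-goal cap described above


@dataclass
class LoopState:
    plan: list[str]  # remaining steps, in order (hypothetical structure)
    history: list[tuple[str, bool]] = field(default_factory=list)


def run_loop(
    state: LoopState,
    act: Callable[[str], bool],
    is_done: Callable[[LoopState], bool],
    max_iterations: int = MAX_ITERATIONS,
) -> bool:
    """Return True if the goal validated, False if the loop gave up."""
    for _ in range(max_iterations):
        if is_done(state):
            return True
        if not state.plan:
            return False                    # nothing left to try
        step = state.plan[0]                # 1. Reason: choose the next step
        ok = act(step)                      # 2. Act: call a tool
        state.history.append((step, ok))    # 3. Assess: record the outcome
        if ok:
            state.plan.pop(0)               # 4. Replan: advance the plan
        # On failure the step stays at the head of the plan and is retried.
    return is_done(state)
```

A caller would pass a function that executes one step and a predicate that checks the success criteria, for example `run_loop(LoopState(plan=["search", "write"]), act, lambda s: not s.plan)`.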


Three execution modes

1. Chat mode

For quick questions and short tasks. The agent reasons and responds in a single turn, streaming text back to you like a standard LLM. Limited to read-only tools (no file writes or external actions).

You: What's the current price of RELIANCE.NS?
Agent: ₹2,847.50 as of 11:32 AM IST (source: NSE)

2. Goal mode

For multi-step tasks. Activates the full ReAct loop with task decomposition, self-correction, parallel sub-agents, and success validation. Use the rocket icon or prefix your message with a goal-like phrase.

You: Research the top 5 cloud providers by market share, write a comparison table,
and generate a PowerPoint deck I can use in Monday's board meeting.

The agent runs for 30–120 seconds (depending on complexity), then delivers the .pptx file.

3. Meeting proxy mode

Triggered by a meeting URL. Bypasses the goal executor entirely and starts the container-based meeting attendance pipeline directly.

You: Join https://teams.live.com/meet/... as "Alex", focus on our Q2 roadmap discussion.

The agent boots a container with a virtual desktop, navigates to the meeting, joins, listens, and responds.
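The choice between the three modes can be pictured as a small dispatch function. The sketch below is an assumption about how such a dispatcher might look, not the product's router: `pick_mode`, `MEETING_HOSTS`, and the `goal_requested` flag (standing in for the rocket icon) are hypothetical, and the host list contains only the domain from the example above.

```python
import re
from urllib.parse import urlparse

# Assumption: only the host seen in the example above; a real list would be longer.
MEETING_HOSTS = {"teams.live.com"}
URL_RE = re.compile(r"https?://\S+")


def pick_mode(message: str, goal_requested: bool = False) -> str:
    """Return 'meeting_proxy', 'goal', or 'chat' for an incoming message."""
    for url in URL_RE.findall(message):
        host = urlparse(url).hostname or ""
        if host in MEETING_HOSTS:
            return "meeting_proxy"   # a meeting URL bypasses the goal executor
    return "goal" if goal_requested else "chat"
```

Checking for a meeting URL first mirrors the description above, where meeting proxy mode skips the goal executor entirely.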


How goals are decomposed

When a goal arrives, the engine breaks it into typed sub-tasks first:

| Sub-task type | Example |
|---|---|
| research | "Find the top 5 cloud providers by 2025 market share" |
| compute | "Calculate year-over-year growth from these numbers" |
| generate_file | "Create a PowerPoint with the comparison table" |
| send_message | "Send the file to my Telegram" |
| validate | "Confirm the file was created and has at least 5 slides" |

Sub-tasks have dependencies: generate_file won't start until research is done. Independent sub-tasks can run in parallel (up to 5 concurrent agents).
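One way to picture dependency-aware scheduling is to group sub-tasks into batches, where each batch only depends on earlier batches and holds at most five tasks. The sketch below is a simplification under stated assumptions: `schedule_batches` is a hypothetical helper, and a real scheduler would likely start each task as soon as its dependencies finish rather than waiting for a whole batch.

```python
MAX_PARALLEL = 5  # concurrency cap described above


def schedule_batches(
    deps: dict[str, set[str]], max_parallel: int = MAX_PARALLEL
) -> list[list[str]]:
    """Group sub-tasks into batches; each batch depends only on earlier ones."""
    done: set[str] = set()
    pending = dict(deps)
    batches: list[list[str]] = []
    while pending:
        # A task is ready once all of its dependencies have completed.
        ready = sorted(t for t, d in pending.items() if d <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        batch = ready[:max_parallel]
        batches.append(batch)
        for task in batch:
            done.add(task)
            del pending[task]
    return batches


plan = schedule_batches({
    "research": set(),
    "compute": {"research"},
    "generate_file": {"compute"},
    "send_message": {"generate_file"},
    "validate": {"generate_file"},
})
```

Here `send_message` and `validate` land in the same final batch because both depend only on `generate_file`.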


The tool system (MCP)

AI Partner uses the Model Context Protocol (MCP) — an open standard for connecting LLMs to tools. Every capability is a tool registered on an MCP server.

Tools are grouped into 41 MCP servers. The table below shows a representative selection by category:

| Category | Servers | Example tools |
|---|---|---|
| Web | WebSearch, Browser | web_search, browser_navigate, browser_extract |
| Execution | Shell, Python Sandbox, Node.js Sandbox | execute_command, execute_python, execute_nodejs |
| Files | Shell | read_file, write_file, list_directory |
| Documents | DocumentGen | generate_powerpoint, generate_excel, generate_word, generate_pdf |
| Communication | Gmail, Slack, Messaging, Phone | send_email, slack_send, telegram_send, place_call |
| DevOps | GitHub, Jira, Confluence, Sentry | create_issue, search_repos, list_issues |
| CRM / Finance | HubSpot, Stripe | hubspot_search_contacts, stripe_list_customers |
| Cloud | AWS S3, Outlook | s3_upload, outlook_send_email |
| Knowledge | KnowledgeBase, arXiv, RSS | knowledge_search, search_papers, fetch_feed |
| Media | ImageGen, VideoGen, Transcribe | image_generate, video_generate, transcribe_audio |
| AI utility | Memory, AgentDelegation | memory_search, delegate_task, delegate_parallel |

The agent sees all available tools at the start of each reasoning step and picks the right one. You don't need to tell it which tool to use.


How the agent picks the right LLM

AI Partner supports 18+ LLM providers. A model router automatically assigns each task to a suitable model tier:

| Task type | Model tier | Example providers |
|---|---|---|
| Complex reasoning | Strong / frontier | Claude 4, GPT-4o, Gemini 2.0 |
| Code generation | Coding-specialized | DeepSeek Coder, Claude 3.5 |
| Classification | Fast / cheap | Groq Llama-3.3-70b, Cerebras |
| Vision / computer use | Multimodal | GPT-4o, Gemini 2.0 Flash |
| Search-grounded | Perplexity | sonar-pro (citations built-in) |

You can override routing manually in Settings → Model Routing or by prefixing your message with @model-name.
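Routing with a manual override can be illustrated as a lookup table plus a prefix check. This is a sketch under assumptions, not the product's router: `route`, `TIER_BY_TASK`, the task-type keys, and the fallback to the frontier tier for unknown task types are all hypothetical. Only the `@model-name` prefix convention comes from the text above.

```python
import re

# Hypothetical task-type keys mapped to the tiers listed in the table above.
TIER_BY_TASK = {
    "complex_reasoning": "frontier",
    "code_generation": "coding",
    "classification": "fast",
    "vision": "multimodal",
    "search_grounded": "search",
}

# Matches "@model-name rest of message" at the start of the message.
OVERRIDE_RE = re.compile(r"^@([\w.\-]+)\s+(.*)$", re.DOTALL)


def route(message: str, task_type: str) -> tuple[str, str]:
    """Return (model_or_tier, message_without_prefix)."""
    match = OVERRIDE_RE.match(message)
    if match:
        # An explicit @model-name prefix wins over automatic routing.
        return match.group(1), match.group(2)
    # Assumption: unknown task types fall back to the frontier tier.
    return TIER_BY_TASK.get(task_type, "frontier"), message
```

For example, `route("@sonar-pro latest NVIDIA news", "classification")` returns the explicit model name and strips the prefix from the message.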


Self-correction and stuck detection

The executor monitors its own progress:

  • Self-correction: if a generated script fails, the agent diagnoses the error semantically and rewrites the script — up to 3 retries before escalating to you.
  • Stuck detection: if the agent calls the same tool with the same arguments 3 times in a row without progress, it force-breaks the loop and replans.
  • Timeout: goals have a wall-clock timeout (configurable; default 10 minutes). If the goal hasn't completed by then, the agent delivers whatever it has and escalates.
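Stuck detection as described above amounts to watching a short window of recent tool calls. A minimal sketch, assuming calls are compared by tool name plus JSON-serialized arguments: `StuckDetector` and its `made_progress` flag are hypothetical names, and how the real executor measures progress is not specified here.

```python
import json
from collections import deque

STUCK_THRESHOLD = 3  # identical calls in a row, as described above


class StuckDetector:
    def __init__(self, threshold: int = STUCK_THRESHOLD) -> None:
        self.threshold = threshold
        self._recent = deque(maxlen=threshold)  # keeps only the latest calls

    def record(self, tool: str, args: dict, made_progress: bool = False) -> bool:
        """Record a call; return True when the agent looks stuck."""
        if made_progress:
            self._recent.clear()  # progress resets the streak
            return False
        # Canonical key so that argument order in the dict does not matter.
        key = json.dumps([tool, args], sort_keys=True, default=str)
        self._recent.append(key)
        return (
            len(self._recent) == self.threshold
            and len(set(self._recent)) == 1
        )
```

When `record` returns True, the caller would break out of the loop and replan, matching the force-break behaviour described in the list above.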

When the agent asks for your help (HITL)

The agent can't handle some situations alone. It pauses and notifies you in seven kinds of situation:

| Situation | Example |
|---|---|
| Permission needed | "Should I delete these 50 files?" |
| Credential required | OAuth flow or 2FA code |
| Goal is ambiguous | The goal has two plausible interpretations |
| Budget threshold | Estimated API cost exceeds your limit |
| Destructive action | Writing to a production database |
| CAPTCHA / auth wall | Browser is blocked; agent pauses and shows you the screen |
| Quality gate | Output doesn't meet the confidence threshold |

HITL notifications arrive via:

  • The web UI — an approval card appears in the active chat
  • Telegram — if configured, a message with approve/reject buttons
  • Browser CAPTCHA — a live screenshot streamed to the UI with a "Take Control" button

After you respond, the agent resumes from exactly where it paused.
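Pausing and resuming at the same point maps naturally onto a Python generator: the task yields an escalation request, stays suspended, and continues from that line when a reply is sent in. This is only an illustration of the pause/resume idea, not the product's mechanism. The `Escalation` enum values and `delete_files` are hypothetical names, and the "deletion" is a stand-in that appends to a list.

```python
from enum import Enum


class Escalation(Enum):
    # The seven situations from the table above (identifiers are assumptions).
    PERMISSION = "permission_needed"
    CREDENTIAL = "credential_required"
    AMBIGUOUS_GOAL = "goal_is_ambiguous"
    BUDGET = "budget_threshold"
    DESTRUCTIVE = "destructive_action"
    AUTH_WALL = "captcha_or_auth_wall"
    QUALITY_GATE = "quality_gate"


def delete_files(paths, log):
    """Pause at the yield until a human answers, then continue."""
    approved = yield (
        Escalation.PERMISSION,
        f"Should I delete these {len(paths)} files?",
    )
    if approved:
        log.extend(paths)  # stand-in for the real deletion
        return "deleted"
    return "skipped"


log = []
task = delete_files(["a.txt", "b.txt"], log)
kind, question = next(task)      # runs until the agent needs approval
try:
    task.send(True)              # the human approves; execution resumes
except StopIteration as stop:
    result = stop.value          # the generator's return value
```

Sending `False` instead of `True` would take the other branch and leave `log` empty.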


The full pipeline end-to-end

Here's what actually happens when you type "Research NVIDIA's Q1 2026 earnings and make me a slide deck":

1. The router receives the message
2. It's classified as a "goal" (not simple chat)
3. The goal engine starts
4. The goal is broken into sub-tasks:
   - [research] Find NVIDIA Q1 2026 earnings data
   - [research] Find analyst commentary and reactions
   - [compute] Summarize key metrics
   - [generate_file] Create PowerPoint with charts
   - [validate] Confirm file exists with correct slide count
5. SkillLearner checks: any matching skill in the library? → no match
6. ReAct loop begins:
   Iteration 1: reason → web_search("NVIDIA Q1 2026 earnings") → got results
   Iteration 2: reason → web_fetch(earnings page URL) → extracted numbers
   Iteration 3: reason → web_search("NVIDIA Q1 2026 analyst reactions") → got results
   Iteration 4: reason → execute_python(summarize script) → got structured data
   Iteration 5: reason → generate_powerpoint(data) → created file
   Iteration 6: reason → validate_file(path) → file exists, 8 slides → done
7. GoalValidator confirms: success criteria met
8. SkillLearner saves the script as a reusable skill
9. File appears in Files panel; chat shows download link
10. If Telegram configured: file sent to your Telegram

Total time: approximately 45–90 seconds.

Next time you ask for an earnings slide deck, step 5 finds the saved skill and the loop runs the stored script directly — skipping generation entirely, completing in under 10 seconds.
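The skill lookup in steps 5 and 8 behaves like a cache keyed on the goal. How SkillLearner actually matches goals to skills is not described here, so the sketch below assumes the simplest possible rule: a skill matches when all of its keywords appear in the goal text. `SkillLibrary`, `save`, `find`, and the script name are hypothetical.

```python
import re


class SkillLibrary:
    def __init__(self) -> None:
        self._skills = []  # list of (keyword frozenset, stored script)

    def save(self, keywords, script: str) -> None:
        """Store a script under a set of lowercase keywords."""
        self._skills.append((frozenset(k.lower() for k in keywords), script))

    def find(self, goal: str):
        """Return the first stored script whose keywords all occur in the goal."""
        words = set(re.findall(r"[a-z0-9]+", goal.lower()))
        for keywords, script in self._skills:
            if keywords <= words:
                return script
        return None


library = SkillLibrary()
first = library.find("Research NVIDIA's Q1 2026 earnings and make me a slide deck")
library.save(["earnings", "slide", "deck"], "make_earnings_deck.py")
second = library.find("Make an earnings slide deck for AMD")
```

The first lookup misses, so the full loop would run; the second hits, so the stored script could be reused directly.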