
AI Chat Pipeline

Status: Live

Overview

The AI Chat Pipeline is the primary interface through which users interact with Merlin. It accepts free-form natural language messages, classifies intent via Claude Haiku tool use, and produces either a streamed conversational response or a structured trade confirmation card that the user signs on-chain. All responses are delivered as Server-Sent Events (SSE), giving the frontend real-time streaming text with no polling.

Architecture

User message
  │
  ▼
POST /chat  (FastAPI StreamingResponse, media_type="text/event-stream")
  │
  ▼
chat()  [backend/services/chat.py]
  │
  ├─ Persist user message → Firestore (users/{uid}/conversations/{cid}/messages)
  │
  ├─ Build Claude messages array (system prompt + last 50 messages from Firestore)
  │
  ▼
Claude Haiku  — stream=True, tool_choice="auto"
  │
  ├── stop_reason == "end_turn"
  │     └─ Stream text chunks as {"type": "text", "content": "..."}
  │        Persist final text → Firestore
  │
  └── stop_reason == "tool_use"
        │
        ├── parse_trade_intent(side, asset, amount, amount_type)
        │     │
        │     ├─ xStock resolver  [services/xstock.py]
        │     │     resolve_token(asset_query) → matched_token + confidence
        │     │     Low confidence (<0.8) → emit ambiguous_asset tool result → Claude asks clarification
        │     │
        │     ├─ Guardrails  [services/guardrails.py]
        │     │     validate_trade(user_id, intent) → approved | blocked + reason
        │     │
        │     ├─ Uniswap V3 quote  [services/uniswap.py]
        │     │     uniswap_get_quote(token_in, token_out, amount_in) → estimated_output
        │     │
        │     ├─ save_quoted_trade()  [db/trades.py]  → trade_id in Firestore
        │     │
        │     ├─ Emit {"type": "trade_intent", "data": {...}}  → frontend renders confirmation card
        │     │
        │     └─ Second Claude call with tool result → stream confirmation text
        │
        ├── get_price(asset)
        │     resolve_token() → symbol → get_token_price() → stream price text
        │
        └── get_portfolio()
              get_user_by_id() → wallet address
              get_all_balances() + get_prices_batch() → stream portfolio text

  ▼
Emit {"type": "done", "conversation_id": "..."}

  ▼  (on trade_intent event — frontend side)
User confirms trade confirmation card
  └─ executeSwap() via useAuth hook → wallet signs → on-chain submission

Implementation Details

Claude Tool Use — 3 Tools

Tool                Trigger                                     Parameters
parse_trade_intent  User wants to buy or sell                   side (buy/sell), asset (string), amount (number), amount_type (usd/quantity)
get_price           User asks about a price                     asset (string)
get_portfolio       User asks about portfolio/balance/holdings  none

All three tools go through a second Claude streaming call after the tool result is produced, so the user always receives a natural language follow-up in addition to any structured event.
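
As a rough illustration, the three tools could be declared in the Anthropic Messages API tool format (name, description, input_schema). Only the tool names and parameter names come from the table above; the descriptions and the JSON Schema constraints are assumptions, not the contents of backend/services/chat.py.

```python
# Hypothetical tool declarations. Names and parameters follow the table above;
# descriptions and schema details are assumed for illustration.
TOOLS = [
    {
        "name": "parse_trade_intent",
        "description": "Call when the user wants to buy or sell an asset.",
        "input_schema": {
            "type": "object",
            "properties": {
                "side": {"type": "string", "enum": ["buy", "sell"]},
                "asset": {"type": "string"},
                "amount": {"type": "number"},
                "amount_type": {"type": "string", "enum": ["usd", "quantity"]},
            },
            "required": ["side", "asset", "amount", "amount_type"],
        },
    },
    {
        "name": "get_price",
        "description": "Call when the user asks about the price of an asset.",
        "input_schema": {
            "type": "object",
            "properties": {"asset": {"type": "string"}},
            "required": ["asset"],
        },
    },
    {
        "name": "get_portfolio",
        "description": "Call when the user asks about portfolio, balance or holdings.",
        "input_schema": {"type": "object", "properties": {}},
    },
]


def tool_names(tools: list) -> list:
    """Return the declared tool names in order."""
    return [tool["name"] for tool in tools]
```

A list in this shape is what would be passed as the tools argument of a Messages API request.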

SSE Event Protocol

Every SSE frame is a JSON object on a data: line:

Event type    Payload                                     Purpose
text          {"type": "text", "content": "..."}          Streaming AI text chunk
trade_intent  {"type": "trade_intent", "data": {...}}     Structured trade; frontend renders confirmation card
error         {"type": "error", "content": "..."}         Recoverable error message
done          {"type": "done", "conversation_id": "..."}  Stream complete

The trade_intent data payload includes: trade_id, side, asset, symbol, amount, amount_type, guardrails, and optionally estimated_output + estimated_output_symbol from the live Uniswap V3 quote.
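
The framing described above can be sketched as a pair of helpers: one that serializes an event as a data: line and one that decodes a buffered stream back into events. The helper names sse_frame and parse_sse are hypothetical; the real generator lives in backend/services/chat.py and the real consumer in the frontend.

```python
import json


def sse_frame(event: dict) -> str:
    """Serialize one event as a single SSE frame: a data: line plus a blank line."""
    return f"data: {json.dumps(event)}\n\n"


def parse_sse(stream: str) -> list:
    """Decode a buffered SSE stream back into a list of event dicts."""
    events = []
    for line in stream.splitlines():
        if line.startswith("data: "):
            events.append(json.loads(line[len("data: "):]))
    return events


# Example stream: two text chunks followed by the terminating done event.
frames = "".join(
    sse_frame(event)
    for event in [
        {"type": "text", "content": "Buying "},
        {"type": "text", "content": "xTSLA."},
        {"type": "done", "conversation_id": "c1"},
    ]
)
events = parse_sse(frames)
text = "".join(e["content"] for e in events if e["type"] == "text")
```

A consumer concatenates the content of text events as they arrive and treats done as the signal to stop reading.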

System Prompt

The system prompt (SYSTEM_PROMPT in backend/services/chat.py) defines Merlin's behavior:

  • Establishes identity as a privacy-preserving Ethereum trading assistant
  • Instructs when to call each of the three tools
  • Prohibits financial advice
  • Requires trade confirmation before execution
  • Clarifies that xStocks are tracker certificates, not share ownership
  • Lists the primary xStock symbols (xTSLA, xAAPL, xGOOG, xAMZN, xMSFT, xNVDA, xMETA, xNFLX, xCOIN, xPLTR, xGME, xSPY, xQQQ, xGLD, and 50+ more)
  • Lists crypto assets: ETH, USDC, USDT, WETH
  • Enforces the US-persons / sanctioned-countries compliance block for xStocks
  • Requires clarification when the request is ambiguous or the amount is missing

Intent Parsing — Natural Language to Structured Intent

Claude Haiku extracts the following fields via parse_trade_intent:

Field        Type    Values
side         string  buy | sell
asset        string  raw user input (e.g., "Tesla", "TSLA", "xTSLA")
amount       number  dollar value or token quantity
amount_type  string  usd | quantity

The asset string is then passed to the xStock resolver.
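
Before the asset string reaches the resolver, the tool input can be checked against the field table above. The following is a minimal sketch of such a check; TradeIntent and parse_intent are hypothetical names, and the positive-amount rule is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TradeIntent:
    side: str         # "buy" or "sell"
    asset: str        # raw user input, resolved later by the xStock resolver
    amount: float     # dollar value or token quantity
    amount_type: str  # "usd" or "quantity"


def parse_intent(tool_input: dict) -> TradeIntent:
    """Validate the tool input and return a typed intent, or raise ValueError."""
    side = tool_input.get("side")
    amount_type = tool_input.get("amount_type")
    asset = str(tool_input.get("asset", "")).strip()
    amount = tool_input.get("amount")

    if side not in ("buy", "sell"):
        raise ValueError(f"invalid side: {side!r}")
    if amount_type not in ("usd", "quantity"):
        raise ValueError(f"invalid amount_type: {amount_type!r}")
    if not asset:
        raise ValueError("asset is required")
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount <= 0:
        raise ValueError("amount must be a positive number")

    return TradeIntent(side, asset, float(amount), amount_type)
```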

Asset Resolution — xStock Resolver

resolve_token(asset_query) in backend/services/xstock.py performs fuzzy matching against 61 tokens (xStocks + crypto). It returns:

  • match — the resolved token dict (symbol, name, address, backed_ticker, etc.)
  • confidence — 0.0–1.0 match confidence
  • alternatives — other candidate symbols when confidence is low

If confidence < 0.8 and alternatives exist, the tool result signals ambiguous_asset back to Claude, which then asks the user to clarify — the trade is not queued.
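
The resolver contract can be illustrated with a small fuzzy matcher. This sketch uses difflib.SequenceMatcher from the standard library over a four-token registry; the matching algorithm actually used in backend/services/xstock.py is not documented here, and the registry contents, the tool_result helper and its status key are assumptions.

```python
from difflib import SequenceMatcher

# Tiny stand-in registry; the real one is described as holding 61 tokens.
TOKENS = [
    {"symbol": "xTSLA", "name": "Tesla", "backed_ticker": "TSLA"},
    {"symbol": "xAAPL", "name": "Apple", "backed_ticker": "AAPL"},
    {"symbol": "xAMZN", "name": "Amazon", "backed_ticker": "AMZN"},
    {"symbol": "ETH", "name": "Ethereum", "backed_ticker": None},
]


def _score(query: str, token: dict) -> float:
    """Best similarity between the query and any of the token's labels."""
    q = query.strip().lower()
    labels = [token["symbol"], token["name"], token["backed_ticker"]]
    return max(SequenceMatcher(None, q, label.lower()).ratio() for label in labels if label)


def resolve_token(asset_query: str, threshold: float = 0.8) -> dict:
    """Return the best match, its confidence, and alternatives when confidence is low."""
    ranked = sorted(TOKENS, key=lambda t: _score(asset_query, t), reverse=True)
    best = ranked[0]
    confidence = _score(asset_query, best)
    alternatives = [] if confidence >= threshold else [t["symbol"] for t in ranked[:3]]
    return {"match": best, "confidence": confidence, "alternatives": alternatives}


def tool_result(resolution: dict) -> dict:
    """Hypothetical tool result: signal ambiguity instead of queuing a trade."""
    if resolution["alternatives"]:
        return {"status": "ambiguous_asset", "alternatives": resolution["alternatives"]}
    return {"status": "resolved", "symbol": resolution["match"]["symbol"]}
```

With this registry, an exact name such as "Tesla" resolves with full confidence, while a fragment such as "am" falls below the threshold and produces an ambiguous_asset result.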

Guardrail Validation

Every parsed trade passes through validate_trade(user_id, intent) in backend/services/guardrails.py before a quote is requested or a trade is stored. If approved is False, the tool result carries the blocking reason and the trade is rejected — Claude informs the user in plain language.
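
The ordering described above (guardrails first, quote only on approval) can be sketched as follows. The individual checks inside validate_trade are not listed in this document, so the per-trade USD limit below is purely hypothetical, as are MAX_TRADE_USD, handle_trade and the status values.

```python
MAX_TRADE_USD = 10_000  # hypothetical limit, not taken from the source


def validate_trade(user_id: str, intent: dict) -> dict:
    """Return a verdict dict with an approved flag and an optional blocking reason."""
    if intent["amount_type"] == "usd" and intent["amount"] > MAX_TRADE_USD:
        return {"approved": False, "reason": f"Trade exceeds the ${MAX_TRADE_USD:,} limit"}
    return {"approved": True, "reason": None}


def handle_trade(user_id: str, intent: dict, request_quote) -> dict:
    """Run guardrails first; request a quote only when the trade is approved."""
    verdict = validate_trade(user_id, intent)
    if not verdict["approved"]:
        # Blocked trades never reach the quote or persistence steps.
        return {"status": "blocked", "reason": verdict["reason"]}
    return {"status": "quoted", "quote": request_quote(intent)}
```

The blocked result carries the reason string that the model would then relay to the user.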

Uniswap V3 Quote

After guardrails pass, uniswap_get_quote(token_in, token_out, amount_in) is called. The quote is best-effort: if it fails (e.g., no pool liquidity, placeholder address), the trade is still saved with a quote_note explaining why a quote is unavailable. The quoted output is included in the trade_intent SSE event and in the Claude follow-up text.
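
A best-effort quote of this kind is typically a try/except around the quoting call that records a note instead of propagating the failure. The sketch below assumes hypothetical helper names (quote_best_effort, build_trade_payload) and an assumed wording for quote_note; only the field names estimated_output, estimated_output_symbol and quote_note come from this document.

```python
def quote_best_effort(get_quote, token_in: str, token_out: str, amount_in: float) -> dict:
    """Try to fetch a quote; on any failure return a note instead of raising."""
    try:
        output = get_quote(token_in, token_out, amount_in)
        return {"estimated_output": output, "quote_note": None}
    except Exception as exc:  # best-effort: the trade is still saved without a quote
        return {"estimated_output": None, "quote_note": f"Quote unavailable: {exc}"}


def build_trade_payload(base: dict, quote: dict, output_symbol: str) -> dict:
    """Attach the optional quote fields to a trade_intent payload only when present."""
    payload = dict(base)
    if quote["estimated_output"] is not None:
        payload["estimated_output"] = quote["estimated_output"]
        payload["estimated_output_symbol"] = output_symbol
    return payload
```

Keeping the quote fields optional in the payload lets the confirmation card render whether or not a quote was obtained.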

Conversation Persistence

  • Every user and assistant message is stored in Firestore immediately.
  • The context window sent to Claude is the system prompt plus the last 50 messages (_build_claude_messages).
  • Assistant messages that follow a tool call carry metadata.function_call for auditability.
  • Conversation updated_at is touched on every new message.
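
The context-window rule in the list above amounts to a slice over the stored history. This is a minimal sketch, not the actual _build_claude_messages: it assumes the system prompt is returned separately from the message list (the Anthropic Messages API takes it as a top-level parameter), and the placeholder prompt text is invented.

```python
SYSTEM_PROMPT = "You are Merlin."  # placeholder text, not the real prompt


def build_claude_messages(history: list, limit: int = 50) -> tuple:
    """Return (system_prompt, messages) using only the most recent `limit` messages."""
    recent = history[-limit:]  # older messages are dropped, with no summarization
    messages = [
        {"role": m["role"], "content": m["content"]}
        for m in recent
        if m["role"] in ("user", "assistant")
    ]
    return SYSTEM_PROMPT, messages
```

With 60 stored messages, the first 10 are simply absent from the request, which is the behavior noted under Current Limitations.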

Conversation Sessions

Users can maintain multiple named conversations. Sessions are scoped per user:

  • Created automatically on the first message if no conversation_id is provided.
  • Title is auto-generated from the first 50 characters of the opening message.
  • Listed ordered by updated_at descending.
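
The title and ordering rules above are simple enough to sketch directly. The function names are hypothetical, and the sort relies on the assumption that all updated_at values share one ISO-8601 UTC format, so that string order matches time order.

```python
def make_title(first_message: str, max_len: int = 50) -> str:
    """Derive a session title from the first characters of the opening message."""
    return first_message.strip()[:max_len]


def list_sessions(conversations: list) -> list:
    """Order conversations by updated_at, most recent first."""
    return sorted(conversations, key=lambda c: c["updated_at"], reverse=True)
```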

AI Model Preference

Users can switch between allowed Claude models. The preference is stored as ai_model on the Firestore user document and returned by GET /chat/provider. Allowed values: claude-haiku-4-5-20251001, claude-sonnet-4-20250514, claude-opus-4-20250514. The frontend also keeps a client-side copy of the preference in localStorage under the key merlin_preferred_model.
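
The allow-list check behind the preference update can be sketched without any web framework. InvalidModelError and both functions are hypothetical names; in the router the rejection surfaces as HTTP 400, and falling back to the default model when no preference is stored is an assumption.

```python
ALLOWED_MODELS = {
    "claude-haiku-4-5-20251001",
    "claude-sonnet-4-20250514",
    "claude-opus-4-20250514",
}
DEFAULT_MODEL = "claude-haiku-4-5-20251001"


class InvalidModelError(ValueError):
    """Raised for a model outside the allow-list (mapped to HTTP 400 by the API layer)."""


def set_model_preference(user_doc: dict, model: str) -> dict:
    """Return a copy of the user document with ai_model updated, or raise."""
    if model not in ALLOWED_MODELS:
        raise InvalidModelError(f"model not allowed: {model}")
    return {**user_doc, "ai_model": model}


def get_model_preference(user_doc: dict) -> str:
    """Return the stored preference, falling back to the default (assumed behavior)."""
    return user_doc.get("ai_model", DEFAULT_MODEL)
```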

Code Map

File                            Purpose
backend/services/chat.py        Core chat service: Anthropic client, streaming SSE generator, all three tool handlers, conversation history builder
backend/routers/chat.py         FastAPI router: all 9 chat/market endpoints, request/response models
backend/db/conversations.py     Firestore CRUD for conversations and messages
backend/db/trades.py            Firestore CRUD for trade records; save_quoted_trade() called by chat service
backend/services/xstock.py      xStock token registry (61 tokens), resolve_token() fuzzy matcher, list_all_assets()
backend/services/guardrails.py  validate_trade(): runs all safety checks on a parsed trade intent
backend/services/uniswap.py     get_quote(), resolve_swap_addresses(), get_token_decimals(), WETH constant
backend/services/prices.py      get_token_price(), get_prices_batch(), is_xstock()
backend/services/balances.py    get_all_balances(): on-chain ERC-20 balance scan for portfolio queries
backend/db/users.py             get_user_by_id(): used by portfolio handler to look up wallet address
frontend/app/chat/page.tsx      Full chat UI: SSE consumer, message list, trade confirmation cards, voice input (Web Speech API), TTS (browser speechSynthesis), persona selector, model selector, language selector, session sidebar

API Endpoints

Method  Path            Auth        Description
POST    /chat           Bearer JWT  Stream a chat message; returns SSE. Body: {message, conversation_id?}
GET     /chat/history   Bearer JWT  Fetch messages for a conversation. Query: conversation_id, limit (default 100, max 500)
DELETE  /chat/history   Bearer JWT  Delete all messages in a conversation. Query: conversation_id
GET     /chat/sessions  Bearer JWT  List user's conversations, most recent first. Query: limit (default 50, max 200)
POST    /chat/sessions  Bearer JWT  Create a new empty conversation. Returns conversation doc
DELETE  /chat/sessions  Bearer JWT  Delete a conversation and all its messages. Query: conversation_id
GET     /chat/provider  Bearer JWT  Get user's stored AI model preference
PATCH   /chat/provider  Bearer JWT  Update AI model preference. Body: {model}. Allowed: claude-haiku-4-5-20251001, claude-sonnet-4-20250514, claude-opus-4-20250514
GET     /market/assets  Bearer JWT  List all tradable assets. Query: asset_type (stock, etf, commodity_etf, crypto)

Firestore Schema

users/{userId}/
  conversations/{conversationId}/
    id          string   — document ID
    title       string   — auto-generated from first user message (truncated at 50 chars)
    created_at  string   — ISO-8601 UTC
    updated_at  string   — ISO-8601 UTC (touched on every new message)

    messages/{messageId}/
      id          string   — document ID
      role        string   — "user" | "assistant" | "system"
      content     string   — message text
      created_at  string   — ISO-8601 UTC
      metadata    map      — optional; keys: function_call (tool name), trade_intent (object)

  trades/{tradeId}/
    id              string
    type            string   — "buy" | "sell" | "swap"
    asset_in        string   — symbol sold
    asset_out       string   — symbol bought
    amount_in       number
    amount_out      number
    price_usd       number | null
    tx_hash         string   — on-chain tx hash (empty string when status is "quoted")
    status          string   — "quoted" | "pending" | "confirmed" | "failed"
    privacy_mode    string   — "public" | "shielded" | "compliant"
    created_at      string   — ISO-8601 UTC
    conversation_id string   — conversation that produced this trade

  ai_model          string   — stored on the user document; AI model preference
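
To make the trades schema concrete, the following sketch builds a document in the "quoted" state, where tx_hash is still an empty string. The helper name new_quoted_trade, the UUID-based id and the default privacy_mode of "public" are assumptions; only the field names and allowed values come from the schema above.

```python
import uuid
from datetime import datetime, timezone


def new_quoted_trade(
    side: str,
    asset_in: str,
    asset_out: str,
    amount_in: float,
    amount_out: float,
    conversation_id: str,
    price_usd=None,
    privacy_mode: str = "public",  # assumed default
) -> dict:
    """Build a trade document for a quote that has not been signed yet."""
    return {
        "id": uuid.uuid4().hex,  # assumed id scheme
        "type": side,
        "asset_in": asset_in,
        "asset_out": asset_out,
        "amount_in": amount_in,
        "amount_out": amount_out,
        "price_usd": price_usd,
        "tx_hash": "",  # empty until the trade is submitted on-chain
        "status": "quoted",
        "privacy_mode": privacy_mode,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "conversation_id": conversation_id,
    }
```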

Configuration

Variable              Location                                                       Description
ANTHROPIC_API_KEY     Secret Manager / .env                                          Required. Used to instantiate AsyncAnthropic client. Raises RuntimeError at first request if missing.
Model name            backend/services/chat.py, MODEL = "claude-haiku-4-5-20251001"  Default model for all chat completions.
System prompt         backend/services/chat.py, SYSTEM_PROMPT                        Defines Merlin identity, tool usage rules, xStock compliance rules, and communication style.
Context window limit  backend/services/chat.py, _build_claude_messages(), limit=50   Number of prior messages included in each Claude call.
Allowed models        backend/routers/chat.py, allowed_models set                    claude-haiku-4-5-20251001, claude-sonnet-4-20250514, claude-opus-4-20250514. Requests for other values return HTTP 400.

Current Limitations

  • No persona integration in the LLM call. The frontend renders a persona selector (Elon, Buffett, AI Momentum) and persists the chosen persona ID, but the selected persona is not passed to the backend and does not alter the system prompt or tool behavior. Persona-aware context injection is not yet implemented.
  • Context window is a hard slice of 50 messages. There is no summarization or token-budget management. Long conversations will silently drop the oldest messages. This can cause the model to lose earlier intent signals in extended sessions.
  • No voice input backend. The frontend implements voice input via the browser Web Speech API (webkitSpeechRecognition) and TTS via window.speechSynthesis. Both are entirely client-side. There is no server-side STT or TTS pipeline.
  • Model selector is UI-only for non-Claude providers. The frontend offers Grok options in the model dropdown (MODEL_OPTIONS) but the backend only validates and uses Anthropic Claude models. Selecting Grok from the UI has no effect on the actual model used.
  • No streaming abort. There is no mechanism for the client to cancel an in-flight SSE stream (e.g., AbortController wired to a server-side cancellation). The stream runs to completion even if the user navigates away.
  • No multi-tool fan-out. A single user message can only trigger one tool call per streaming pass. Compound requests (e.g., "buy Tesla and show me my portfolio") are not split into parallel tool invocations.
  • Quoted trades are not automatically expired. Trades written to Firestore with status: "quoted" accumulate indefinitely. There is no TTL or cleanup job to remove stale unconfirmed quotes.

Related

  • specs/project-spec.md — full project specification
  • agents/chat-intent-parser.md — NLU agent definition for this pipeline's Node 1
  • agents/xstock-resolver.md — asset resolution agent
  • agents/guardrails.md — guardrail checks enforced on every trade
  • agents/trade-execution.md — downstream 6-step trade execution pipeline (quote → simulate → policy → execute → confirm → persist)
  • agents/persona-engine.md — persona system (not yet integrated into chat service)
  • sources/futurewallet-docs.md — FutureWallet platform documentation (xStocks trading mechanics reference)