Purpose

Streaming chat endpoint for business research. It runs inference through OpenRouter (Cerebras) for low latency, adds context about the selected competitors to the prompt, and performs real-time web search via Exa.

Architecture

Request Body

{
  "message": "How does competitor X compare to us?",
  "selected_competitor_ids": ["uuid-1", "uuid-2"],
  "conversation_history": [...],
  "system_prompt": null,
  "timezone": "America/New_York"
}
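As a rough illustration, a client could assemble this body as sketched below. The ChatRequest class and the "UTC" default are hypothetical and not part of the API; only the field names come from the example above. The entry shape of conversation_history is not specified here, so the sketch passes entries through unchanged.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class ChatRequest:
    """Hypothetical client-side mirror of the request body shown above."""

    message: str
    selected_competitor_ids: list[str] = field(default_factory=list)
    # Entry shape is not documented on this page, so entries are opaque here.
    conversation_history: list[Any] = field(default_factory=list)
    # None serializes to null, which is what the first message sends.
    system_prompt: Optional[str] = None
    # Assumed default; the example above uses an IANA timezone name.
    timezone: str = "UTC"

    def to_json(self) -> str:
        """Serialize the request to the JSON body the endpoint expects."""
        return json.dumps(asdict(self))


payload = ChatRequest(
    message="How does competitor X compare to us?",
    selected_competitor_ids=["uuid-1", "uuid-2"],
    timezone="America/New_York",
).to_json()
```

Keeping system_prompt as an explicit field makes it easy to fill in once the first response has finished.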

Streaming Response

Returns Server-Sent Events (SSE):
data: {"type": "content", "text": "Based on..."}
data: {"type": "content", "text": " the analysis..."}
data: {"type": "search", "query": "competitor pricing 2024"}
data: {"type": "done", "system_prompt": "..."}
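A minimal sketch of consuming this stream on the client side follows. It assumes each event arrives as a single "data:" line containing one JSON object, as in the sample above; the SSE format also allows multi-line data fields, which this sketch does not handle. The function names iter_events and collect are illustrative only.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any other SSE fields
        yield json.loads(line[len("data:"):].strip())


def collect(lines: Iterable[str]) -> tuple[str, list[str], Optional[str]]:
    """Gather answer text, search queries, and the final system prompt."""
    text_parts: list[str] = []
    searches: list[str] = []
    system_prompt: Optional[str] = None
    for event in iter_events(lines):
        kind = event.get("type")
        if kind == "content":
            text_parts.append(event["text"])
        elif kind == "search":
            searches.append(event["query"])
        elif kind == "done":
            system_prompt = event.get("system_prompt")
            break  # "done" is the last event in the sample stream
    return "".join(text_parts), searches, system_prompt


sample = [
    'data: {"type": "content", "text": "Based on..."}',
    "",
    'data: {"type": "content", "text": " the analysis..."}',
    "",
    'data: {"type": "search", "query": "competitor pricing 2024"}',
    "",
    'data: {"type": "done", "system_prompt": "..."}',
]
text, searches, system_prompt = collect(sample)
```

In a real client the lines would come from the HTTP response body rather than a list; the parsing logic stays the same.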

System Prompt Caching

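  • First message: the request carries "system_prompt": null, so the endpoint builds a system prompt with competitor context and returns it in the final "done" event
  • Subsequent messages: the client sends that system_prompt back in the request, and the endpoint reuses it instead of rebuilding it
  • Skipping the rebuild reduces latency and API costs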
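The round trip can be sketched as follows, assuming the client is the one that holds the prompt between turns: it stores the system_prompt from each "done" event and sends it back in the next request. ChatSession, the transport callable, and fake_transport are all hypothetical stand-ins; the real endpoint and its prompt-building logic are not shown on this page.

```python
from typing import Callable, Iterable, Optional

Transport = Callable[[dict], Iterable[dict]]


class ChatSession:
    """Hypothetical client helper that carries system_prompt across turns."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self.system_prompt: Optional[str] = None  # null until the first reply

    def send(self, message: str) -> str:
        body = {"message": message, "system_prompt": self.system_prompt}
        parts: list[str] = []
        for event in self._transport(body):
            if event["type"] == "content":
                parts.append(event["text"])
            elif event["type"] == "done":
                # Keep the returned prompt so the next request can reuse it.
                self.system_prompt = event.get("system_prompt") or self.system_prompt
        return "".join(parts)


builds: list[str] = []  # records each time the fake server builds a prompt


def fake_transport(body: dict) -> Iterable[dict]:
    """Stand-in for the endpoint: builds a prompt only when none is supplied."""
    prompt = body["system_prompt"]
    if prompt is None:
        prompt = "PROMPT-WITH-COMPETITOR-CONTEXT"
        builds.append(prompt)
    yield {"type": "content", "text": "ok"}
    yield {"type": "done", "system_prompt": prompt}


session = ChatSession(fake_transport)
first = session.send("How does competitor X compare to us?")
second = session.send("And on pricing?")
```

After both calls the fake server has built the prompt once, which is the saving the list above describes.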

Internal Services

Service               Purpose
build_system_prompt   Creates context-aware prompt
stream_chat           Handles streaming and web search

Code Location

src/app/apis/explore/chat/
├── routes.py
├── prompt_builder.py
└── children/
    └── stream_handler.py