Purpose
Streaming chat endpoint for business research. Uses OpenRouter (Cerebras) for fast inference, with competitor context and real-time web search via Exa.
Architecture
Request Body
Streaming Response
Returns Server-Sent Events (SSE).
System Prompt Caching
- First message: Builds system prompt with competitor context
- Subsequent messages: Reuses cached `system_prompt` from response
- Reduces latency and API costs
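
The caching behaviour above could be sketched roughly as follows. This is a minimal illustration, not the endpoint's actual code: the `build_system_prompt` body, the `resolve_system_prompt` helper, and the prompt wording are all assumptions made for the example. The only detail taken from the description is that a prompt built on the first message is handed back and reused afterwards.

```python
from typing import Optional


def build_system_prompt(competitors: list[str]) -> str:
    """Hypothetical stand-in for the real build_system_prompt service."""
    names = ", ".join(competitors) if competitors else "none provided"
    return f"You are a business research assistant. Known competitors: {names}."


def resolve_system_prompt(
    cached_prompt: Optional[str], competitors: list[str]
) -> tuple[str, bool]:
    """Return (prompt, was_built).

    If the client sent back a cached prompt, reuse it unchanged;
    otherwise build a fresh one from the competitor context.
    """
    if cached_prompt:
        return cached_prompt, False
    return build_system_prompt(competitors), True


# First message: no cached prompt yet, so one is built.
first_prompt, built = resolve_system_prompt(None, ["Acme", "Globex"])

# Subsequent message: the client passes the prompt it received earlier.
second_prompt, rebuilt = resolve_system_prompt(first_prompt, ["Acme", "Globex"])
```

In this sketch the cache lives on the client, which echoes `system_prompt` back with each request. That keeps the server stateless, at the cost of a slightly larger request body.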
Internal Services
| Service | Purpose |
|---|---|
| build_system_prompt | Creates context-aware prompt |
| stream_chat | Handles streaming and web search |
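
To illustrate how a streaming service such as `stream_chat` might frame its output as Server-Sent Events, here is a minimal sketch. The `format_sse` helper, the `fake_stream_chat` generator, the JSON payload keys (`content`, `done`), and the `done` event name are all hypothetical. Only the `event:` / `data:` line format and the blank-line terminator come from the SSE specification.

```python
import json
from typing import Iterable, Iterator, Optional


def format_sse(payload: dict, event: Optional[str] = None) -> str:
    """Serialize one SSE frame: optional event line, data line, blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(payload)}")
    return "\n".join(lines) + "\n\n"


def fake_stream_chat(tokens: Iterable[str]) -> Iterator[str]:
    """Hypothetical stand-in for stream_chat: one frame per token, then a final marker."""
    for token in tokens:
        yield format_sse({"content": token})
    yield format_sse({"done": True}, event="done")


frames = list(fake_stream_chat(["Hello", " world"]))
```

Each frame ends with a blank line, which is how an SSE client knows the event is complete.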