Streaming chat endpoint for multi-business research. Uses OpenRouter (Cerebras) for fast inference with competitor context in the system prompt. The response is streamed via Server-Sent Events (SSE), allowing real-time token delivery to the frontend.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| message | string | Yes | The user's message/question |
| selected_competitor_ids | string[] | No | UUIDs of competitors to include in context |
| conversation_history | array | No | Previous messages for context |
| system_prompt | string | No | Cached system prompt from first message (improves performance) |
| timezone | string | No | User's timezone (e.g., "Asia/Singapore") for date formatting |
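
As a sketch, a first-message payload combining these fields might look like the following (the competitor IDs are placeholders; system_prompt is omitted so the backend generates and returns one):

```javascript
// Hypothetical first-message payload; IDs below are placeholders.
// system_prompt is deliberately omitted on the first message.
const payload = {
  message: 'Compare Nike and Adidas marketing strategies',
  selected_competitor_ids: ['uuid1', 'uuid2'],
  conversation_history: [],
  timezone: 'Asia/Singapore',
};
```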

Conversation History Format

[
  {"role": "user", "content": "What makes Nike successful?"},
  {"role": "assistant", "content": "Nike's success comes from..."}
]

System Prompt Caching

For better performance, the frontend should:
  1. On the first message, omit system_prompt and include timezone.
  2. Cache the system_prompt event that the backend returns containing the generated prompt.
  3. On subsequent messages, pass the cached system_prompt back.
This avoids rebuilding the system prompt (which requires database queries) on every message.
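
The caching flow above can be sketched as a small helper that builds the request body; buildChatRequest and its parameter names are illustrative, not part of the API:

```javascript
// Sketch of the caching flow: on the first message there is no cached
// prompt, so send timezone and let the backend generate one; on later
// messages, send the cached system_prompt back to skip the DB rebuild.
function buildChatRequest(message, history, cachedSystemPrompt, timezone) {
  const body = { message, conversation_history: history };
  if (cachedSystemPrompt) {
    body.system_prompt = cachedSystemPrompt; // reuse the cached prompt
  } else {
    body.timezone = timezone;                // first message only
  }
  return body;
}
```

The cached prompt itself arrives in the system_prompt SSE event described under Response Format.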

Example Request

curl -X POST https://searchcompany-main.up.railway.app/api/explore/chat \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Compare Nike and Adidas marketing strategies",
    "selected_competitor_ids": ["uuid1", "uuid2"],
    "conversation_history": []
  }'

Response Format (SSE)

The response is a stream of Server-Sent Events. Each event contains a JSON payload:

System Prompt Event (first message only)

Sent on the first message to allow frontend caching:
{"type": "system_prompt", "content": "You are Abra, a helpful AI assistant..."}

Token Event

Sent for each generated token:
{"type": "token", "content": "Nike"}

Search Event

Sent when the LLM requests a web search:
{"type": "search", "query": "Nike marketing strategy 2024"}

Search Results Event

Sent after search completes:
{"type": "search_results", "count": 5}

Done Event

Sent when generation completes:
{"type": "done", "usage": {"prompt_tokens": 150, "completion_tokens": 200}, "finish_reason": "stop"}

Error Event

Sent on error:
{"type": "error", "error": "Error message"}

Example Response Stream

data: {"type": "token", "content": "Nike"}
data: {"type": "token", "content": " focuses"}
data: {"type": "token", "content": " on"}
data: {"type": "search", "query": "Nike marketing strategy 2024"}
data: {"type": "search_results", "count": 5}
data: {"type": "token", "content": "Based on recent data..."}
data: {"type": "done", "usage": {"prompt_tokens": 500, "completion_tokens": 300}}

Features

Competitor Context

Selected competitors are included in the system prompt, giving the LLM access to:
  • Company name and URL
  • Description
  • Scraped content (key information about the business)
Web Search

When the LLM needs current information, it can request a web search. The search is performed via the Exa API, and the results are injected into the conversation.

Conversation History

Include previous messages to maintain context across multiple turns.
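
A minimal sketch of maintaining history across turns, using the role/content shape shown under Conversation History Format (appendTurn is an illustrative helper, not part of the API):

```javascript
// Append a completed user/assistant exchange to the history array,
// which is then sent as conversation_history on the next request.
function appendTurn(history, userMessage, assistantMessage) {
  return [
    ...history,
    { role: 'user', content: userMessage },
    { role: 'assistant', content: assistantMessage },
  ];
}
```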

Frontend Integration (JavaScript)

const response = await fetch('/api/explore/chat', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json',
    'Accept': 'text/event-stream',
  },
  body: JSON.stringify({
    message: 'Compare Nike and Adidas',
    selected_competitor_ids: ['uuid1', 'uuid2'],
  }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // A network chunk may end mid-line, so buffer partial lines
  // instead of parsing each chunk directly.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep the trailing partial line for the next chunk

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6));

      if (data.type === 'token') {
        // Append token to response
        console.log(data.content);
      } else if (data.type === 'search') {
        // Show search indicator
        console.log('Searching:', data.query);
      } else if (data.type === 'done') {
        // Generation complete
        console.log('Done:', data.usage);
      }
    }
  }
}