# Chat

Streaming chat endpoint for multi-business research. Inference runs through OpenRouter (Cerebras) for low latency, and context about the selected competitors is included in the system prompt.

The response is streamed via Server-Sent Events (SSE), allowing real-time token delivery to the frontend.

## Request Body

| Field                     | Type      | Required | Description                                                    |
| ------------------------- | --------- | -------- | -------------------------------------------------------------- |
| `message`                 | string    | Yes      | The user's message/question                                    |
| `selected_competitor_ids` | string\[] | No       | UUIDs of competitors to include in context                     |
| `conversation_history`    | array     | No       | Previous messages for context                                  |
| `system_prompt`           | string    | No       | Cached system prompt from first message (improves performance) |
| `timezone`                | string    | No       | User's timezone (e.g., "Asia/Singapore") for date formatting   |

### Conversation History Format

```json
[
  {"role": "user", "content": "What makes Nike successful?"},
  {"role": "assistant", "content": "Nike's success comes from..."}
]
```
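One way a client might maintain this array is to append each completed exchange once the reply has finished streaming. The sketch below is only an illustration; `appendExchange` is a hypothetical helper name, not part of the API.

```javascript
// Hypothetical helper: returns a new history array with one more
// user/assistant exchange appended, leaving the input array untouched.
function appendExchange(history, userMessage, assistantReply) {
  return [
    ...history,
    { role: 'user', content: userMessage },
    { role: 'assistant', content: assistantReply },
  ];
}

const history = appendExchange(
  [],
  'What makes Nike successful?',
  "Nike's success comes from..."
);
console.log(JSON.stringify(history, null, 2));
```

The resulting array can be sent as `conversation_history` with the next message.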

### System Prompt Caching

For better performance, the frontend should cache the system prompt:

1. On the **first message**, omit `system_prompt` and include `timezone`.
2. Read the `system_prompt` event that the backend returns and store its `content`.
3. On **subsequent messages**, pass back the cached `system_prompt`.

This avoids rebuilding the system prompt (which queries the database) on every message.
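The caching flow above could be sketched on the client roughly as follows. `createChatSession`, `buildBody`, and `handleEvent` are hypothetical names chosen for this example, and sending `timezone` only on the first message is an assumption based on the steps above.

```javascript
// Hypothetical client-side session object; names are illustrative only.
function createChatSession(timezone) {
  let cachedSystemPrompt = null;

  return {
    // Build the request body for the next message.
    buildBody(message, conversationHistory = []) {
      const body = { message, conversation_history: conversationHistory };
      if (cachedSystemPrompt === null) {
        // First message: no cached prompt yet, so send the timezone
        // and let the backend generate the system prompt.
        body.timezone = timezone;
      } else {
        // Later messages: pass the cached prompt back.
        body.system_prompt = cachedSystemPrompt;
      }
      return body;
    },

    // Call this with every parsed SSE event.
    handleEvent(event) {
      if (event.type === 'system_prompt') {
        cachedSystemPrompt = event.content;
      }
    },
  };
}

const session = createChatSession('Asia/Singapore');
const firstBody = session.buildBody('Compare Nike and Adidas');
session.handleEvent({
  type: 'system_prompt',
  content: 'You are Abra, a helpful AI assistant...',
});
const secondBody = session.buildBody('What about Puma?');
console.log(firstBody, secondBody);
```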

## Example Request

```bash
curl -X POST https://searchcompany-main.up.railway.app/api/explore/chat \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Compare Nike and Adidas marketing strategies",
    "selected_competitor_ids": ["uuid1", "uuid2"],
    "conversation_history": []
  }'
```

## Response Format (SSE)

The response is a stream of Server-Sent Events. Each event's `data` field carries a JSON payload whose `type` field identifies the kind of event.

### System Prompt Event (first message only)

Sent on the first message to allow frontend caching:

```json
{"type": "system_prompt", "content": "You are Abra, a helpful AI assistant..."}
```

### Token Event

Sent for each generated token:

```json
{"type": "token", "content": "Nike"}
```

### Search Event

Sent when the LLM requests a web search:

```json
{"type": "search", "query": "Nike marketing strategy 2024"}
```

### Search Results Event

Sent after search completes:

```json
{"type": "search_results", "count": 5}
```

### Done Event

Sent when generation completes:

```json
{"type": "done", "usage": {"prompt_tokens": 150, "completion_tokens": 200}, "finish_reason": "stop"}
```

### Error Event

Sent on error:

```json
{"type": "error", "error": "Error message"}
```
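A client may want to validate each line before acting on it. The following is a minimal sketch that parses a single SSE line and accepts only the event types listed above; `parseSseLine` and `KNOWN_TYPES` are hypothetical names, and silently ignoring unknown or malformed lines is a design choice of this example rather than documented behavior.

```javascript
// Event types described in this section.
const KNOWN_TYPES = new Set([
  'system_prompt',
  'token',
  'search',
  'search_results',
  'done',
  'error',
]);

// Returns the parsed event object, or null if the line is not a
// `data:` line, is not valid JSON, or has an unrecognized type.
function parseSseLine(line) {
  if (!line.startsWith('data:')) return null;
  let event;
  try {
    event = JSON.parse(line.slice(5).trim());
  } catch {
    return null; // Malformed payload
  }
  const isObject = event !== null && typeof event === 'object';
  return isObject && KNOWN_TYPES.has(event.type) ? event : null;
}

const tokenEvent = parseSseLine('data: {"type": "token", "content": "Nike"}');
const ignored = parseSseLine('data: not json');
console.log(tokenEvent, ignored);
```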

## Example Response Stream

```
data: {"type": "token", "content": "Nike"}
data: {"type": "token", "content": " focuses"}
data: {"type": "token", "content": " on"}
data: {"type": "search", "query": "Nike marketing strategy 2024"}
data: {"type": "search_results", "count": 5}
data: {"type": "token", "content": "Based on recent data..."}
data: {"type": "done", "usage": {"prompt_tokens": 500, "completion_tokens": 300}, "finish_reason": "stop"}
```
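To illustrate how these events combine, the sketch below folds the example stream into the assembled reply text, the list of search queries, and the final usage figures. `collectStream` is a hypothetical helper, and the stream is hard-coded here instead of being read from the network.

```javascript
// The example stream above, one SSE line per array entry.
const exampleStream = [
  'data: {"type": "token", "content": "Nike"}',
  'data: {"type": "token", "content": " focuses"}',
  'data: {"type": "token", "content": " on"}',
  'data: {"type": "search", "query": "Nike marketing strategy 2024"}',
  'data: {"type": "search_results", "count": 5}',
  'data: {"type": "token", "content": "Based on recent data..."}',
  'data: {"type": "done", "usage": {"prompt_tokens": 500, "completion_tokens": 300}}',
];

// Hypothetical helper: accumulates tokens, search queries, and usage.
function collectStream(lines) {
  const result = { text: '', searches: [], usage: null };
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const event = JSON.parse(line.slice(6));
    if (event.type === 'token') {
      result.text += event.content; // Tokens are concatenated as-is
    } else if (event.type === 'search') {
      result.searches.push(event.query);
    } else if (event.type === 'done') {
      result.usage = event.usage;
    }
  }
  return result;
}

const collected = collectStream(exampleStream);
console.log(collected);
```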

## Features

### Competitor Context

Selected competitors are included in the system prompt, giving the LLM access to:

* Company name and URL
* Description
* Scraped content (key information about the business)

### Real-Time Web Search

When the LLM needs current information, it can request a web search. The search is performed via Exa API, and results are injected into the conversation.
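On the frontend, the `search` and `search_results` events can drive a search indicator. The reducer below is one possible sketch; `nextSearchStatus` and the shape of the status object are assumptions made for this example.

```javascript
// Hypothetical UI state for a search indicator.
const initialSearchStatus = { searching: false, query: null, resultCount: null };

// Returns the next status given the current status and a parsed event.
function nextSearchStatus(status, event) {
  switch (event.type) {
    case 'search':
      // The LLM requested a search: show the indicator with the query.
      return { searching: true, query: event.query, resultCount: null };
    case 'search_results':
      // The search finished: hide the indicator and keep the count.
      return { ...status, searching: false, resultCount: event.count };
    default:
      return status; // Other events do not affect the indicator
  }
}

const duringSearch = nextSearchStatus(initialSearchStatus, {
  type: 'search',
  query: 'Nike marketing strategy 2024',
});
const afterSearch = nextSearchStatus(duringSearch, {
  type: 'search_results',
  count: 5,
});
console.log(duringSearch, afterSearch);
```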

### Conversation History

Include previous messages to maintain context across multiple turns.

## Frontend Integration (JavaScript)

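```javascript
const response = await fetch('/api/explore/chat', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json',
    'Accept': 'text/event-stream',
  },
  body: JSON.stringify({
    message: 'Compare Nike and Adidas',
    selected_competitor_ids: ['uuid1', 'uuid2'],
  }),
});

if (!response.ok) {
  throw new Error(`Chat request failed: ${response.status}`);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = ''; // Holds an incomplete line until the next chunk arrives
let systemPrompt; // Cached prompt to send with subsequent messages

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // A chunk can end mid-line (or mid-character), so decode in streaming
  // mode and only process complete lines.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const data = JSON.parse(line.slice(6));

    if (data.type === 'system_prompt') {
      // Cache the prompt for subsequent messages
      systemPrompt = data.content;
    } else if (data.type === 'token') {
      // Append token to response
      console.log(data.content);
    } else if (data.type === 'search') {
      // Show search indicator
      console.log('Searching:', data.query);
    } else if (data.type === 'search_results') {
      // Hide search indicator
      console.log('Search results:', data.count);
    } else if (data.type === 'done') {
      // Generation complete
      console.log('Done:', data.usage);
    } else if (data.type === 'error') {
      // Surface the error to the user
      console.error('Chat error:', data.error);
    }
  }
}
```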
