# Update AI Site

Updates an AI website with the content changes reported by the `detect-changes` API.

The endpoint uses **three parallel Gemini calls** (the same pattern as onboarding):

1. Update `llms.txt`
2. Update homepage + Q\&A pages
3. Update `data.json` (Schema.org)

<Info>
  **Batch 1a**: This endpoint runs as part of Batch 1a along with `detect-changes`.
  Product discovery (Batch 1b) is completely decoupled and runs in parallel.
</Info>

The update flow:

1. `detect-changes` - Find what changed
2. **update-ai-site** (this endpoint) - Update the AI website

## How It Works

```
1. Fetch current AI site files (llms.txt, homepage HTML, data.json)
2. Run 3 PARALLEL Gemini calls:
   - llms.txt: "Here's current + changes, update if needed"
   - Homepage + Q&A pages: "Here's current + changes, update if needed"
   - data.json: "Here's current + changes, update if needed"
3. Check if any product source URLs changed → regenerate affected product llms
4. Generate files via generate_ai_site (only new + changed replica pages)
5. Deploy to Vercel (unless skip_deploy=True)
6. Store updated site_map + page_hashes
7. Return changed URLs for IndexNow submission (+ files if skip_deploy=True)
```
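The parallel step (step 2) can be sketched with `asyncio.gather`. This is a minimal illustration, not the endpoint's actual implementation: `update_artifact` is a hypothetical stand-in for a single Gemini call and simply echoes its input, and the artifact names used as dictionary keys are assumptions.

```python
import asyncio


async def update_artifact(name: str, current: str, changes: dict) -> tuple[str, str]:
    """Hypothetical stand-in for one Gemini call ("here's current + changes")."""
    await asyncio.sleep(0)  # placeholder for the real model request
    # A real implementation would return the model's updated content here.
    return name, current


async def run_parallel_updates(current_files: dict[str, str], changes: dict) -> dict[str, str]:
    """Run one update per artifact concurrently and collect the results by name."""
    results = await asyncio.gather(
        *(update_artifact(name, content, changes) for name, content in current_files.items())
    )
    return dict(results)


# Assumed artifact names and placeholder contents, for illustration only.
current_files = {"llms.txt": "# Example", "homepage": "<html></html>", "data.json": "{}"}
changes = {"new_pages": [], "changed_pages": [], "removed_urls": []}
updated = asyncio.run(run_parallel_updates(current_files, changes))
```

Because the three calls do not depend on each other's output, running them concurrently keeps total latency close to that of the slowest call rather than the sum of all three.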

## Product LLMs Regeneration

When pages change, this endpoint checks whether any **existing products** have `product_source_urls` that overlap with the changed URLs. If so, the llms files for those products are regenerated.

```
Products in DB:
- Sandals: product_source_urls = ["/products/sandals", "/categories/footwear"]
- Boots: product_source_urls = ["/products/boots", "/winter-collection"]

Changed URLs from detect-changes:
- ["/products/sandals", "/about"]

Result:
- Sandals /llms/sandals.txt → REGENERATE (overlap: /products/sandals)
- Boots /llms/boots.txt → SKIP (no overlap)
```
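The overlap check in the example above amounts to a set-membership test. The sketch below assumes products are available as a mapping from product name to `product_source_urls`; the function name and data shape are illustrative, not taken from the actual codebase.

```python
def products_to_regenerate(
    products: dict[str, list[str]], changed_urls: list[str]
) -> dict[str, list[str]]:
    """Return each affected product with the source URLs that changed."""
    changed = set(changed_urls)
    affected = {}
    for name, source_urls in products.items():
        overlap = [url for url in source_urls if url in changed]
        if overlap:  # products with no overlapping URL are skipped
            affected[name] = overlap
    return affected


products = {
    "Sandals": ["/products/sandals", "/categories/footwear"],
    "Boots": ["/products/boots", "/winter-collection"],
}
changed_urls = ["/products/sandals", "/about"]

print(products_to_regenerate(products, changed_urls))
# {'Sandals': ['/products/sandals']}
```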

<Note>
  **Efficient updates**: Product llms files are only regenerated if their source content changed. This scales well for sites with hundreds of products.
</Note>

<Warning>
  **Partial content limitation**: When regenerating product llms, we only use content from the changed pages. If a product has 3 source URLs and only 1 changed, the regenerated file will be based on that 1 page's content. For a complete regeneration with all source content, use `regenerate-fresh-website`.
</Warning>

## Request Body

Takes the output from `detect-changes` directly:

| Field              | Type    | Required | Description                                                 |
| ------------------ | ------- | -------- | ----------------------------------------------------------- |
| `business_id`      | string  | Yes      | Clerk org ID                                                |
| `new_pages`        | array   | Yes      | Pages that didn't exist before (have markdown)              |
| `changed_pages`    | array   | Yes      | Pages with content changes (have markdown)                  |
| `removed_urls`     | array   | Yes      | URLs that no longer exist                                   |
| `unchanged_pages`  | array   | Yes      | Unchanged pages (NO markdown)                               |
| `updated_site_map` | array   | Yes      | Current URL list                                            |
| `updated_hashes`   | object  | Yes      | Current hash map                                            |
| `business_info`    | object  | Yes      | Business entity and AI site info                            |
| `skip_deploy`      | boolean | No       | If true, return files instead of deploying (default: false) |

<Note>
  **Key difference from old API**: `unchanged_pages` replaces `all_pages`.
  Unchanged pages don't have markdown because they weren't batch scraped.
</Note>
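Since the body is the `detect-changes` output plus an optional `skip_deploy` flag, a caller might assemble and validate it along these lines. The helper name is hypothetical, and the check only confirms that the required fields from the table are present; it does not validate their contents.

```python
REQUIRED_FIELDS = (
    "business_id",
    "new_pages",
    "changed_pages",
    "removed_urls",
    "unchanged_pages",
    "updated_site_map",
    "updated_hashes",
    "business_info",
)


def build_update_request(detect_changes_output: dict, skip_deploy: bool = False) -> dict:
    """Copy the required fields from a detect-changes result and add skip_deploy."""
    missing = [field for field in REQUIRED_FIELDS if field not in detect_changes_output]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    body = {field: detect_changes_output[field] for field in REQUIRED_FIELDS}
    body["skip_deploy"] = skip_deploy
    return body
```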

## Response

```json
{
  "status": "success",
  "files_updated": 42,
  "product_llms_regenerated": 2,
  "deployment_url": "https://example.searchcompany.dev",
  "project_name": "example-searchcompany-dev",
  "source_url": "https://example.com",
  "changed_urls": [
    "https://example.com/new-page",
    "https://example.com/about"
  ],
  "files": [...]  // Only present if skip_deploy=true
}
```

The `changed_urls` array should be submitted to IndexNow after deployment.
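A minimal sketch of that submission, using the public IndexNow JSON endpoint and only the standard library. The key value is a placeholder you would replace with your own IndexNow key, and deriving `host` from the first URL is an assumption about how the caller wants to group submissions.

```python
import json
import urllib.request
from urllib.parse import urlparse

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_payload(changed_urls: list[str], key: str) -> dict:
    """Build the IndexNow JSON body for a batch of URLs on a single host."""
    if not changed_urls:
        raise ValueError("no changed URLs to submit")
    host = urlparse(changed_urls[0]).netloc  # assumes all URLs share this host
    return {"host": host, "key": key, "urlList": list(changed_urls)}


def submit_to_indexnow(payload: dict) -> int:
    """POST the payload to IndexNow and return the HTTP status code."""
    request = urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_indexnow_payload(
    ["https://example.com/new-page", "https://example.com/about"],
    key="your-indexnow-key",  # placeholder, not a real key
)
```

Calling `submit_to_indexnow(payload)` performs the actual network request, so it is left out of the example run.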

<Note>
  **skip\_deploy mode**: When `skip_deploy=True`, the endpoint returns generated files
  in the response instead of deploying. This is used by the cron orchestrator to
  collect files from Batch 1a and 1b, then perform a single combined deployment.
</Note>

## Three Parallel Gemini Calls

Each call receives the current file plus the detected changes and decides what, if anything, to update:

### 1. Update llms.txt

```
Current llms.txt: [current content]
Changes: [new/changed/removed pages]
Task: Update if needed, preserve AI Article sections
```

### 2. Update Homepage + Q\&A Pages

```
Current homepage HTML: [current structure]
Changes: [new/changed/removed pages]
Task: Update homepage + Q&A pages if needed, preserve AI Article sections
```

### 3. Update data.json

```
Current Schema.org: [current JSON-LD]
Changes: [new/changed/removed pages]
Task: Update if needed
```

<Warning>
  **AI Article sections are preserved** - The Gemini prompts explicitly instruct the model NOT to modify any AI Article sections, because those are managed separately by the AI articles cron.
</Warning>
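To make the shared prompt shape concrete, here is one way the three prompts could be assembled from a common template. This is purely illustrative: the real prompt wording is not shown in this page, and `build_update_prompt` and its `preserve_ai_articles` flag are hypothetical names.

```python
def build_update_prompt(
    label: str, current: str, changes: dict, preserve_ai_articles: bool = True
) -> str:
    """Assemble a "current + changes + task" prompt for one artifact."""
    lines = [f"Current {label}:", current, "", "Changes:"]
    for kind in ("new_pages", "changed_pages", "removed_urls"):
        # Summarize each change type by count; a real prompt would include details.
        lines.append(f"- {kind}: {len(changes.get(kind, []))} item(s)")
    task = "Task: Update if needed."
    if preserve_ai_articles:
        task += " Do NOT modify any AI Article sections."
    lines.extend(["", task])
    return "\n".join(lines)


changes = {"new_pages": ["/new-page"], "changed_pages": ["/about"], "removed_urls": []}
llms_prompt = build_update_prompt("llms.txt", "# Example", changes)
schema_prompt = build_update_prompt("Schema.org", "{}", changes, preserve_ai_articles=False)
```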

## Database Updates

* `ai_sites.page_hashes` - Updated hash map
* `ai_sites.site_map` - Updated URL list
* `ai_sites.last_content_check_at` - Timestamp

## External API Calls

* Gemini 3 Flash Preview (x3 parallel) - Content updates
* Gemini 3 Flash Preview (per affected product) - Product llms regeneration
* Vercel API - Deployment (unless `skip_deploy=True`)
