# Overview

> Architecture for the Update AI Site endpoint

## Purpose

Applies the content changes detected by `detect-changes` to an AI website. The endpoint uses **three parallel Gemini calls**, the same pattern as onboarding, so that updated files stay consistent with the ones generated at onboarding.

## Architecture

```mermaid
flowchart TD
    Request["POST /api/cron/update-ai-site"]
    
    subgraph fetch [Fetch Current Files]
        FetchLLMs["Fetch llms.txt"]
        FetchHTML["Fetch homepage HTML"]
        FetchData["Fetch data.json"]
    end
    
    subgraph gemini [3 Parallel Gemini Calls]
        UpdateLLMs["Update llms.txt"]
        UpdateQA["Update Q&A pages"]
        UpdateSchema["Update data.json"]
    end
    
    subgraph generate [Generate Files]
        GenFiles["generate_ai_site"]
        GenReplica["Replica pages for new+changed only"]
    end
    
    subgraph deploy [Deploy]
        Vercel["Deploy to Vercel"]
        StoreHashes["Store updated hashes"]
    end
    
    Request --> fetch
    fetch --> gemini
    gemini --> generate
    generate --> deploy
    deploy --> Response
```

## Three Parallel Gemini Calls

| Call | Input                                      | Output                 | Purpose              |
| ---- | ------------------------------------------ | ---------------------- | -------------------- |
| 1    | Current `llms.txt` + changes               | Updated `llms.txt`     | AI-readable content  |
| 2    | Current homepage HTML + changes            | Updated Q\&A structure | Homepage + FAQ pages |
| 3    | Current `data.json` (Schema.org) + changes | Updated JSON-LD        | Structured data      |

All three calls run **in parallel** using `asyncio.gather()` for speed.
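
The fan-out can be sketched as below. This is a minimal illustration, not the code in `mini_orchestrator.py`: the three coroutine names, their signatures, and their placeholder bodies are hypothetical stand-ins for the real Gemini calls.

```python
import asyncio

# Hypothetical stand-ins for the three Gemini calls. The real functions
# are not shown in this document; these bodies only simulate a result.
async def update_llms_txt(current: str, changes: list[str]) -> str:
    await asyncio.sleep(0)  # placeholder for the network round trip
    return current + "\n" + "\n".join(changes)

async def update_qa_pages(current_html: str, changes: list[str]) -> dict:
    await asyncio.sleep(0)
    return {"questions": len(changes)}

async def update_schema(current_schema: dict, changes: list[str]) -> dict:
    await asyncio.sleep(0)
    return {**current_schema, "updated": True}

async def run_gemini_calls(llms_txt, homepage_html, schema, changes):
    # gather() runs the three coroutines concurrently and returns their
    # results in the same order as the arguments.
    return await asyncio.gather(
        update_llms_txt(llms_txt, changes),
        update_qa_pages(homepage_html, changes),
        update_schema(schema, changes),
    )
```

By default `asyncio.gather()` propagates the first exception raised by any call. Passing `return_exceptions=True` returns exceptions alongside the successful results instead, which may suit a pipeline that wants to keep partial output.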

## Request Body

The request body is the output of `detect-changes`:

```json
{
  "business_id": "org-slug-123",
  "new_pages": [...],        // Have markdown
  "changed_pages": [...],    // Have markdown
  "removed_urls": [...],
  "unchanged_pages": [...],  // NO markdown
  "updated_site_map": [...],
  "updated_hashes": {...},
  "business_info": {...}
}
```
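
A caller might check the body's shape before sending it. The sketch below assumes every key shown above is required and has the container type the example suggests. Neither assumption is confirmed here, and `validate_update_request` is a hypothetical helper, not part of the endpoint.

```python
# Expected top-level keys and container types, inferred from the example
# body above (an assumption, not a published schema).
REQUIRED_KEYS = {
    "business_id": str,
    "new_pages": list,
    "changed_pages": list,
    "removed_urls": list,
    "unchanged_pages": list,
    "updated_site_map": list,
    "updated_hashes": dict,
    "business_info": dict,
}

def validate_update_request(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in body:
            problems.append(f"missing key: {key}")
        elif not isinstance(body[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    return problems
```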

## What Gets Updated

| File                    | When Updated             | Generated By  |
| ----------------------- | ------------------------ | ------------- |
| `llms.txt`              | Content changes detected | Gemini call 1 |
| `pages/index.js`        | Content changes detected | Gemini call 2 |
| `data.json`             | Content changes detected | Gemini call 3 |
| `pages/*.js` (replicas) | New or changed pages     | Deterministic |
| `sitemap.xml`           | New or removed pages     | Deterministic |
| `robots.txt`            | Never (static)           | Deterministic |
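
The table can be read as a small decision function. The sketch below is one interpretation, with a hypothetical `files_to_update` helper. It assumes "content changes detected" means any new, changed, or removed page, which the table does not state explicitly.

```python
def files_to_update(body: dict) -> set[str]:
    """Map a detect-changes payload to the files the table says are rewritten."""
    new = body.get("new_pages", [])
    changed = body.get("changed_pages", [])
    removed = body.get("removed_urls", [])

    files: set[str] = set()
    # Assumption: any new, changed, or removed page counts as a content change.
    if new or changed or removed:
        files |= {"llms.txt", "pages/index.js", "data.json"}
    # Replica pages are rebuilt only for new or changed pages.
    if new or changed:
        files.add("pages/*.js")
    # The sitemap changes only when the set of URLs changes.
    if new or removed:
        files.add("sitemap.xml")
    # robots.txt is static and never appears in the result.
    return files
```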

## Efficiency

Only `new_pages` and `changed_pages` carry markdown content (from the Firecrawl batch scrape).
`unchanged_pages` are not scraped; they keep their existing hashes.

This means:

* Gemini processes only new and changed content
* Replica pages are regenerated only for new and changed pages
* Unchanged content is preserved
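
In code, that selection might look like the sketch below. The helper names are hypothetical, and the shape of each page entry (a dict with `url` and `markdown` keys) is an assumption, since the request example elides the page objects.

```python
def pages_needing_work(body: dict) -> list[dict]:
    """New and changed pages carry markdown; unchanged pages are skipped."""
    return body.get("new_pages", []) + body.get("changed_pages", [])

def replica_urls(body: dict) -> list[str]:
    # Assumes each page entry is a dict with a "url" key (hypothetical shape).
    return [page["url"] for page in pages_needing_work(body)]
```

Pages listed under `unchanged_pages` never enter either list, so they are neither sent to Gemini nor regenerated.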

## Response Format

```json
{
  "status": "success",
  "files_updated": 42,
  "changed_urls": ["/about", "/pricing"],
  "deployment_url": "https://..."
}
```

## Code Location

```
src/app/apis/cron/update_ai_site/
├── routes.py           # HTTP endpoint
└── mini_orchestrator.py # Update logic with 3 parallel Gemini calls
```

## Next Step

After the update succeeds, call `submit-indexnow` with `changed_urls` to notify search engines.
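
A caller could gate that hand-off on the response shown under Response Format. The sketch below uses a hypothetical helper, `urls_to_submit`, that only decides which URLs to forward. It does not call `submit-indexnow`, because that endpoint's request format is not described here.

```python
def urls_to_submit(response: dict) -> list[str]:
    """Return the URLs worth forwarding, or [] if the update did not succeed."""
    if response.get("status") != "success":
        return []
    # De-duplicate while preserving the original order.
    return list(dict.fromkeys(response.get("changed_urls", [])))
```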
