# Discover Products

Discovers NEW products from a Shopify store using hash-based change detection.

<Info>
  **Batch 1b**: This endpoint runs in parallel with Batch 1a (update-ai-site).
  It is fully decoupled from Batch 1a and needs no scraped content.
</Info>

## What This Endpoint Does

1. ✅ Fetch products from Shopify `/products.json`
2. ✅ Compare hash to detect changes
3. ✅ Save NEW products to database
4. ✅ Generate prompts for NEW products (calls `generate_prompts_for_product` internally)
5. ✅ Generate `/llms/{slug}.txt` for NEW products (calls `generate_product_llms_txt` internally)
6. ✅ Return files for combined deploy (when `skip_deploy=true`); otherwise deploy to Vercel

## How It Works

```mermaid
flowchart TD
    A[POST /api/cron/discover-products] --> B[Fetch /products.json from Shopify]
    B --> C[Compute MD5 hash of product handles]
    C --> D{Hash changed?}
    D -->|No| E[Return unchanged - skip]
    D -->|Yes| F[Compare snapshots to find NEW products]
    F --> G[Save new products to entities table]
    G --> H[Generate prompts for each new product]
    H --> I[Generate /llms/slug.txt for new products]
    I --> J{skip_deploy?}
    J -->|Yes| K[Return files for combined deploy]
    J -->|No| L[Deploy to Vercel]
    K --> M[Update products_hash + products_snapshot]
    L --> M
```
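
The flowchart can be read as the following control-flow sketch. It is an illustration under assumptions, not the code in `routes.py`: `run_discovery`, the `steps` dictionary of callables, the `"unchanged"` status value, and the newline separator used before hashing are all hypothetical. The point it shows is the ordering: `products_hash` and `products_snapshot` are written last, on both branches of the `skip_deploy` decision.

```python
import hashlib


def run_discovery(products, site, steps, skip_deploy=False):
    """Walk the flowchart once. `site` holds products_hash / products_snapshot."""
    handles = sorted(p["handle"] for p in products)
    # Separator is an assumption; the docs only say "MD5 hash of sorted product handles".
    new_hash = hashlib.md5("\n".join(handles).encode("utf-8")).hexdigest()
    if new_hash == site.get("products_hash"):
        return {"status": "unchanged"}  # hypothetical status value

    # Hash changed: diff against the stored snapshot to find NEW products only.
    known = {p["handle"] for p in site.get("products_snapshot", [])}
    new_products = [p for p in products if p["handle"] not in known]

    files = []
    for product in new_products:
        steps["save"](product)                    # entities table
        steps["prompts"](product)                 # prompts table
        files.append(steps["llms_txt"](product))  # /llms/{slug}.txt content

    if not skip_deploy:
        steps["deploy"](files)

    # State is updated after the deploy decision, whichever branch was taken.
    site["products_hash"] = new_hash
    site["products_snapshot"] = products

    result = {"status": "success", "new_products": new_products}
    if skip_deploy:
        result["llms_files"] = files
    return result
```

Passing the side-effecting steps in as callables keeps the sketch runnable without a database or a Vercel project.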

## Hash-Based Change Detection

Instead of re-processing every product on each run, the endpoint stores two values per site and compares them against the current Shopify catalog:

1. `products_hash` - MD5 hash of sorted product handles
2. `products_snapshot` - Full product list from last sync

```
Hash comparison:
- If hash unchanged → skip entirely (no work needed)
- If hash changed → compare snapshots to find NEW products only
```

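This means only products whose handles are absent from the previous snapshot are processed, not the whole catalog. Because the hash covers handles only, a change to an existing product (for example, a new title) leaves the hash unchanged.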
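
A minimal sketch of the two-stage check, assuming each product is a dict with a `handle` key. The function names and the newline separator used before hashing are illustrative assumptions; only "MD5 of sorted product handles" comes from this page.

```python
import hashlib


def compute_products_hash(handles):
    """MD5 hex digest of the sorted product handles."""
    # Joining with a newline is an assumption; the real separator is not documented here.
    joined = "\n".join(sorted(handles))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def find_new_products(current, snapshot):
    """Return products in `current` whose handle is absent from `snapshot`."""
    known = {p["handle"] for p in snapshot}
    return [p for p in current if p["handle"] not in known]


def detect_changes(current, stored_hash, snapshot):
    """Return (new_hash, new_products); new_products is empty when the hash matches."""
    new_hash = compute_products_hash(p["handle"] for p in current)
    if new_hash == stored_hash:
        # Cheap path: same set of handles, so the snapshot diff is skipped.
        return new_hash, []
    return new_hash, find_new_products(current, snapshot)
```

Sorting before hashing makes the digest independent of the order in which Shopify returns products, so a reordered but otherwise identical catalog still takes the cheap path.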

## Request Body

| Field         | Type    | Required | Description                                                 |
| ------------- | ------- | -------- | ----------------------------------------------------------- |
| `business_id` | string  | Yes      | Clerk org ID                                                |
| `skip_deploy` | boolean | No       | If true, return files instead of deploying (default: false) |
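
As an illustration, the request could be built with the Python standard library as below. The host `https://api.example.com` and the org ID `org_123` are placeholders, and `build_discover_request` is a hypothetical helper. Authentication is not described on this page, so the sketch leaves it out.

```python
import json
import urllib.request


def build_discover_request(base_url, business_id, skip_deploy=False):
    """Build (but do not send) the POST request for the discover-products endpoint."""
    payload = {"business_id": business_id, "skip_deploy": skip_deploy}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/cron/discover-products",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and org ID, for illustration only.
req = build_discover_request("https://api.example.com", "org_123", skip_deploy=True)
# To send it: urllib.request.urlopen(req), after adding whatever auth your deployment requires.
```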

## Response

```json
{
  "status": "success",
  "products_checked": 25,
  "new_products": [
    {
      "name": "Product A",
      "entity_id": "uuid-123"
    }
  ],
  "prompts_generated": 10,
  "llms_files_deployed": 1,
  "project_name": "business-ai-site",
  "ai_site_url": "https://business.searchcompany.dev",
  "source_url": "https://mystore.com",
  "llms_files": [...]  // Only if skip_deploy=true
}
```
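
A caller might condense this response as sketched below. `summarize_discovery` is a hypothetical helper; it assumes the field names shown in the sample above and treats `llms_files` as optional, since it is only returned when `skip_deploy=true`.

```python
def summarize_discovery(response):
    """Condense a discover-products response dict into a one-line summary."""
    names = [p["name"] for p in response.get("new_products", [])]
    # `llms_files` is only present when the request set skip_deploy=true.
    pending_files = response.get("llms_files")
    if pending_files is None:
        deploy_note = "deployed by endpoint"
    else:
        deploy_note = f"{len(pending_files)} file(s) returned for combined deploy"
    return (
        f"{response['status']}: checked {response['products_checked']}, "
        f"{len(names)} new ({', '.join(names) or 'none'}), "
        f"{response.get('prompts_generated', 0)} prompts, {deploy_note}"
    )
```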

## Database Updates

| Table      | Action                                         |
| ---------- | ---------------------------------------------- |
| `entities` | Insert new products with `type: "product"`     |
| `prompts`  | Insert 10 prompts per new product              |
| `ai_sites` | Update `products_hash` and `products_snapshot` |

## Database Columns Used (`ai_sites`)

| Column              | Type  | Purpose                                                  |
| ------------------- | ----- | -------------------------------------------------------- |
| `products_hash`     | TEXT  | MD5 hash of sorted product handles for quick comparison  |
| `products_snapshot` | JSONB | Full product list from last sync `[{handle, name, ...}]` |

## Code Location

```
src/app/apis/cron/discover_products/routes.py
src/app/shared/products/discover.py  # Core Shopify product fetching logic
src/app/shared/products/generate_llms_txt.py  # /llms/{slug}.txt generation
```

## Internal Services Called

This endpoint internally calls these shared services:

| Service                        | Purpose                             |
| ------------------------------ | ----------------------------------- |
| `fetch_shopify_products`       | Fetch all products from Shopify API |
| `save_product`                 | Save new product to entities table  |
| `generate_prompts_for_product` | Generate 10 visibility prompts      |
| `generate_product_llms_txt`    | Generate `/llms/{slug}.txt` content |

<Note>
  **Why all-in-one?** For the cron job, we want a single endpoint that handles
  the entire product discovery flow. The separate endpoints (`generate-product-prompts`,
  `generate-product-llms-txt`) exist for manual triggering and debugging.
</Note>
