Extract products from page content and save them to the database. This is the first step in the product pipeline:
  1. discover-products (this endpoint) - Extract products from content
  2. generate-product-prompts - Generate visibility prompts for products
  3. generate-product-llms-txt - Generate llms files for products

How It Works

Architecture

For large websites, content is processed in five stages:
  1. Chunking - Content split into 15K char chunks
  2. Parallel extraction - All chunks processed simultaneously
  3. Simple prompts - Gemini only extracts name + description (fast)
  4. Deduplication - Final Gemini pass merges similar products
  5. URL mapping - Python string matching finds which pages mention each product
This architecture avoids timeout issues by keeping each Gemini call focused and fast.
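The five stages above can be sketched roughly as follows. This is a minimal sketch, not the actual implementation: `extract_chunk` and `dedupe` stand in for the Gemini calls (steps 3 and 4), which are not shown in this document, and all names here are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 15_000  # step 1: 15K char chunks


def chunk_content(text: str, size: int = CHUNK_SIZE) -> list[str]:
    """Step 1: split content into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def map_source_urls(products: list[dict], pages: list[dict]) -> list[dict]:
    """Step 5: plain string matching to find which pages mention each product."""
    for product in products:
        product["source_urls"] = [
            page["url"] for page in pages
            if product["name"].lower() in page["markdown"].lower()
        ]
    return products


def discover(pages: list[dict], extract_chunk, dedupe) -> list[dict]:
    """Run the pipeline: chunk, extract in parallel, dedupe, map URLs."""
    combined = "\n".join(page["markdown"] for page in pages)
    chunks = chunk_content(combined)
    with ThreadPoolExecutor() as pool:      # step 2: all chunks in parallel
        per_chunk = list(pool.map(extract_chunk, chunks))
    candidates = [p for chunk in per_chunk for p in chunk]
    products = dedupe(candidates)           # step 4: merge similar products
    return map_source_urls(products, pages)
```

Because each `extract_chunk` call sees at most 15K characters and returns only name + description, no single model call has to hold the whole site in context, which is what keeps the calls fast enough to avoid timeouts.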

Request Body

Field          Type    Required  Description
business_id    string  Yes       Clerk org ID
pages          array   Yes       Pages with {url, markdown}
business_info  object  Yes       Business entity info

business_info Object

Field         Type    Description
entity_id     string  Business entity UUID
url           string  Business website URL
name          string  Business name
clerk_org_id  string  Clerk org ID
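A request matching the two tables above could be assembled and sent like this. The endpoint URL is hypothetical (the actual host and path are not given in this document), and the field values are placeholders.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the real deployment host.
ENDPOINT = "https://api.example.com/apis/cron/discover-products"

payload = {
    "business_id": "org_abc123",            # Clerk org ID
    "pages": [
        {
            "url": "https://example.com/products/a",
            "markdown": "# Product A\nOur flagship widget.",
        }
    ],
    "business_info": {
        "entity_id": "uuid-123",            # business entity UUID
        "url": "https://example.com",
        "name": "Example Co",
        "clerk_org_id": "org_abc123",
    },
}

req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# response = urllib.request.urlopen(req)  # uncomment against a live server
```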

Response

{
  "status": "success",
  "existing_products": 3,
  "new_products": [
    {
      "name": "Product A",
      "entity_id": "uuid-123",
      "source_urls": ["https://example.com/products/a"],
      "url": "https://example.com/products/a"
    }
  ]
}

What This Endpoint Does NOT Do

  • ❌ Generate prompts (use generate-product-prompts)
  • ❌ Generate llms files (use generate-product-llms-txt)
  • ❌ Deploy to Vercel
This separation allows for:
  1. Better debugging - Run each step independently
  2. Easier retries - If prompts fail, don’t re-discover products
  3. Parallel execution - Run prompts and llms generation in parallel
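One way to exploit point 3 is to kick off the two follow-up endpoints concurrently once discovery has finished. A minimal sketch with stand-in coroutines: `generate_prompts` and `generate_llms_txt` are illustrative placeholders for the real HTTP calls to generate-product-prompts and generate-product-llms-txt.

```python
import asyncio


async def generate_prompts(products: list[dict]) -> dict:
    """Stand-in for calling generate-product-prompts."""
    await asyncio.sleep(0)  # placeholder for the HTTP call
    return {"step": "prompts", "count": len(products)}


async def generate_llms_txt(products: list[dict]) -> dict:
    """Stand-in for calling generate-product-llms-txt."""
    await asyncio.sleep(0)  # placeholder for the HTTP call
    return {"step": "llms_txt", "count": len(products)}


async def run_followups(products: list[dict]) -> list[dict]:
    # Steps 2 and 3 of the pipeline are independent, so they can run
    # in parallel; if one fails, the other can be retried on its own
    # without re-running discovery.
    return await asyncio.gather(
        generate_prompts(products),
        generate_llms_txt(products),
    )


results = asyncio.run(run_followups([{"name": "Product A"}]))
```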

Database Updates

Table     Action
entities  Insert new products with type: "product" and product_source_urls
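The write can be sketched as below. The schema is illustrative only, inferred from this document (an `entities` table with `type` and `product_source_urls` columns); the real table, column types, and database engine are assumptions, and SQLite is used here purely so the sketch is self-contained.

```python
import json
import sqlite3

# Illustrative schema, not the real migration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entities (
        id TEXT PRIMARY KEY,
        name TEXT,
        type TEXT,
        product_source_urls TEXT  -- JSON array of pages mentioning the product
    )
""")


def insert_product(conn: sqlite3.Connection, product: dict) -> None:
    """Insert a discovered product with type='product' and its source URLs."""
    conn.execute(
        "INSERT INTO entities (id, name, type, product_source_urls)"
        " VALUES (?, ?, ?, ?)",
        (
            product["entity_id"],
            product["name"],
            "product",
            json.dumps(product["source_urls"]),
        ),
    )


insert_product(conn, {
    "entity_id": "uuid-123",
    "name": "Product A",
    "source_urls": ["https://example.com/products/a"],
})
row = conn.execute("SELECT type, product_source_urls FROM entities").fetchone()
```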

Code Location

src/app/apis/cron/discover_products/routes.py  # Route handler
src/app/shared/products/discover.py            # Core extraction logic