Generate All
Backend orchestrator that runs all onboarding tasks in the background
POST
Overview
This is the main onboarding endpoint. It triggers all onboarding tasks and runs them in the background. The frontend can navigate away immediately after calling this endpoint.
This endpoint returns immediately with status "started". All tasks run asynchronously in the backend.
Request Body
- The Shopify store URL (e.g., https://mystore.com)
- The Clerk organization slug (e.g., my-business-abc123)
- The business name for display purposes
Response
- Always "started" on success
- Human-readable status message
Example
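A minimal sketch of building the request in Python. The base URL, the /api/generate-all path, and the three JSON field names are assumptions made for illustration, since this page does not state them; substitute the real values from your deployment.

```python
import json
import urllib.request

# Hypothetical values: neither the host, the path, nor the field names
# are given on this page.
BASE_URL = "https://api.example.com"
GENERATE_ALL_PATH = "/api/generate-all"


def build_generate_all_request(
    store_url: str, org_slug: str, business_name: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request for the orchestrator."""
    payload = {
        "url": store_url,                # assumed field name
        "org_slug": org_slug,            # assumed field name
        "business_name": business_name,  # assumed field name
    }
    return urllib.request.Request(
        BASE_URL + GENERATE_ALL_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_all_request(
    "https://mystore.com", "my-business-abc123", "My Business"
)
# Sending it would be: urllib.request.urlopen(request)
```

Because the endpoint returns as soon as the tasks are queued, the caller only needs to confirm that the response status is "started" and can then move on.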
Internal Services
The orchestrator calls these services directly (not via HTTP):
GROUP 1: All Parallel Tasks
All GROUP 1 tasks run in parallel. None block each other.
| Group | Service | Purpose |
|---|---|---|
| 1a | Discover Business Info | Uses Firecrawl Agent to extract what the company does |
| 1b | Scrape Website | Custom mapper + Firecrawl batch scrape. Returns pages for markdown replicas. |
| 1c | Discover Competitors | Uses Firecrawl Agent API to find up to 10 competitors |
| 1d | Discover Products | Fetches products from Shopify products.json API |
| 1e | Fetch Favicon | Downloads favicon, converts to PNG, uploads to storage |
| 1f | Materialize Score | Copies pre-payment ranking score to history table |
| 1g | Setup CloudFront | Creates CloudFront distribution for domain proxy |
GROUP 2: After 1a + 1b + 1d Complete (All Parallel)
| Group | Service | Purpose |
|---|---|---|
| 2a | Create AI Website | Uses business_info (1a) for llms.txt/Q&A/data.json; pages (1b) for markdown replicas |
| 2b | Product Prompts | Generates 5+ prompts per product (min 50 total) |
| 2c | Generate Product LLMs | Generates /llms/{product-slug}.txt files for each product |
Key Architecture Change:
- GROUP 1a (Discover Business Info) uses Firecrawl Agent to extract business information
- This business info is used by GROUP 2a to generate llms.txt, Q&A pages, and data.json
- Scraped pages (GROUP 1b) are ONLY used for markdown replica generation
- This separation makes LLM content generation more focused and efficient
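The dependency rule between the two groups can be sketched with asyncio. This is an illustration under stated assumptions, not the orchestrator's actual code: run_service is a hypothetical stand-in for the real service calls, and the delays are arbitrary.

```python
import asyncio


async def run_service(name: str, log: list[str], delay: float = 0.0) -> str:
    """Hypothetical stand-in for a service call; records completion order."""
    await asyncio.sleep(delay)
    log.append(name)
    return name


async def orchestrate() -> list[str]:
    log: list[str] = []
    # GROUP 1: start every task at once so none blocks another.
    group1 = {
        name: asyncio.create_task(run_service(name, log, delay))
        for name, delay in [
            ("1a", 0.01), ("1b", 0.02), ("1c", 0.05), ("1d", 0.01),
            ("1e", 0.0), ("1f", 0.0), ("1g", 0.05),
        ]
    }
    # GROUP 2 waits only on 1a, 1b, and 1d, not on the rest of GROUP 1.
    await asyncio.gather(group1["1a"], group1["1b"], group1["1d"])
    group2 = [
        asyncio.create_task(run_service(name, log))
        for name in ("2a", "2b", "2c")
    ]
    # Wait for everything still in flight before returning.
    await asyncio.gather(*group1.values(), *group2)
    return log
```

Calling asyncio.run(orchestrate()) returns the completion order. The GROUP 2 entries always appear after 1a, 1b, and 1d, while slower GROUP 1 tasks such as 1c and 1g are free to finish later.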
Prompt Generation Strategy
All prompts are now tied to products (no business-level prompts):
| Metric | Value |
|---|---|
| Prompts per product | 5 (default) |
| Minimum total prompts | 50 during onboarding |
| New products (via cron) | 5 prompts each |
| Daily sampling | 10 prompts for visibility scoring |
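One way to reconcile the default of 5 prompts per product with the 50-prompt onboarding minimum is to raise the per-product count when a store has few products. The formula below is an assumption for illustration; the page states the two constraints but not how the service combines them.

```python
import math

DEFAULT_PROMPTS_PER_PRODUCT = 5
MIN_TOTAL_PROMPTS = 50


def prompts_per_product(product_count: int) -> int:
    """Return a per-product prompt count that reaches the onboarding minimum.

    Assumed rule: use the default of 5, unless that would leave the
    total under 50, in which case round the per-product count up.
    """
    if product_count <= 0:
        return 0
    needed = math.ceil(MIN_TOTAL_PROMPTS / product_count)
    return max(DEFAULT_PROMPTS_PER_PRODUCT, needed)
```

Under this rule a store with 3 products gets 17 prompts each (51 total), while a store with 10 or more products stays at the default of 5.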
Service Details
Discover Business Info Service (shared/discover_business_info/service.py)
- Calls Firecrawl Agent API with the business URL
- Extracts: description, products_services, target_market, key_features, value_proposition
- Returns structured dict for AI website generation
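Since GROUP 2a depends on this dict, a caller may want to check that all five extracted fields are present before generating content. The helper below is a hypothetical sketch; only the field names come from this page, and the value types are not specified here.

```python
# Field names as listed above; the value types are not stated on this page.
BUSINESS_INFO_FIELDS = (
    "description",
    "products_services",
    "target_market",
    "key_features",
    "value_proposition",
)


def missing_business_info_fields(info: dict) -> list[str]:
    """Return the expected fields that are absent or empty in `info`."""
    return [field for field in BUSINESS_INFO_FIELDS if not info.get(field)]
```

An empty result means the dict carries every field the AI website step expects.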
Discover Products Service (shared/products/discover.py)
- Fetches products from {store_url}/products.json
- Paginates through all pages of products
- Extracts product title, description, URL, handle, variants
- Filters out existing products
- Saves new products to entities table
- Returns products list for GROUP 2
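The fetch, paginate, and filter steps above can be sketched as follows. This is an assumption-laden illustration rather than the code in shared/products/discover.py: the limit and page query parameters, the discover_products name, and the injectable fetch argument are all choices made here, and saving to the entities table is left out.

```python
import json
import urllib.request
from typing import Callable


def fetch_json(url: str) -> dict:
    """Default fetcher: GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def discover_products(
    store_url: str,
    existing_handles: set[str],
    fetch: Callable[[str], dict] = fetch_json,
    page_size: int = 250,
) -> list[dict]:
    """Page through products.json and return products not already known."""
    base = store_url.rstrip("/")
    new_products: list[dict] = []
    page = 1
    while True:
        # Assumed pagination scheme: ?limit=N&page=M, stopping on an empty page.
        url = f"{base}/products.json?limit={page_size}&page={page}"
        batch = fetch(url).get("products", [])
        if not batch:
            break
        for product in batch:
            if product["handle"] in existing_handles:
                continue  # filter out existing products
            new_products.append({
                "title": product.get("title"),
                "description": product.get("body_html"),
                "handle": product["handle"],
                "url": f"{base}/products/{product['handle']}",
                "variants": product.get("variants", []),
            })
        page += 1
    return new_products
```

Passing the fetcher as an argument keeps the pagination logic testable without network access; a test can supply a function that returns canned pages.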
Product Prompts Service (tasks/prompts.py)
- Calculates prompts per product to ensure minimum 50 total
- Generates prompts for each product using Gemini 3 Flash
- Saves to entity_prompts_tracker table
- Returns count of generated/saved prompts
AI Website Service (shared/ai_website/)
- Uses business_info from GROUP 1a for LLM content generation
- Runs 3 parallel Gemini calls for llms.txt, Q&A pages, data.json
- Uses scraped pages from GROUP 1b for markdown replica generation only
- Deploys to Vercel
- Assigns *.searchcompany.dev subdomain
Prerequisites
Before calling this endpoint, you must:
- Create a Clerk organization
- Call POST /api/business to create the entity
Both steps must be complete before generate-all runs.