The Onboarding API handles everything needed when a new business signs up. The backend orchestrates every task behind a single endpoint; the frontend only triggers it and can then navigate away.

Flow

When a user completes payment and onboarding:
FRONTEND:
1. POST /api/business                     → Create org metadata + entity
2. POST /api/onboarding/generate-all      → Trigger backend orchestrator (returns immediately)

BACKEND (runs in background):
├── GROUP 1 - All Parallel:
│   ├── 1a: Scrape website (pages used by GROUP 2)
│   ├── 1b: Discover Competitors (Firecrawl agent)
│   ├── 1c: Discover Products (Shopify products.json API)
│   ├── 1d: Fetch Favicon
│   ├── 1e: Materialize Score
│   └── 1f: Setup CloudFront

├── GROUP 2 - After Scrape (1a):
│   └── 2a: Create AI Website (uses scraped pages)

└── GROUP 3 - After Products (1c) - Parallel:
    ├── 3a: Product Prompts (5+ prompts per product, min 50 total)
    └── 3b: Generate Product LLMs (/llms/{product-slug}.txt files)
Optimized Flow: All GROUP 1 tasks run in parallel. GROUP 2 starts as soon as the scrape (1a) completes. GROUP 3 starts as soon as products are discovered (1c), so it can begin before GROUP 2. Tasks 3a and 3b run in parallel with each other.
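The dependency rules above can be sketched with asyncio. This is a minimal illustration, not the orchestrator's actual code: `run_task`, `generate_all`, `run_demo`, the task labels, and the delays are all hypothetical stand-ins for the real services.

```python
import asyncio


# Hypothetical stand-in for a real onboarding task: it only records
# when it starts and finishes so the ordering can be inspected.
async def run_task(name: str, log: list[str], delay: float = 0.0) -> str:
    log.append(f"start {name}")
    await asyncio.sleep(delay)
    log.append(f"done {name}")
    return name


async def generate_all(log: list[str]) -> None:
    # GROUP 1: every task is started immediately, in parallel.
    scrape = asyncio.create_task(run_task("1a scrape", log, 0.2))
    products = asyncio.create_task(run_task("1c products", log, 0.05))
    others = [
        asyncio.create_task(run_task(name, log))
        for name in ("1b competitors", "1d favicon", "1e score", "1f cloudfront")
    ]

    async def group2() -> None:
        # GROUP 2 waits only for the scrape (1a).
        await scrape
        await run_task("2a ai_website", log)

    async def group3() -> None:
        # GROUP 3 waits only for product discovery (1c), then runs 3a and 3b together.
        await products
        await asyncio.gather(
            run_task("3a product_prompts", log),
            run_task("3b product_llms", log),
        )

    await asyncio.gather(group2(), group3(), *others)


def run_demo() -> list[str]:
    log: list[str] = []
    asyncio.run(generate_all(log))
    return log
```

Because `group2` and `group3` each await only their own prerequisite, neither blocks the other: with the delays chosen here, product discovery finishes before the scrape, so GROUP 3 would normally begin first.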

Endpoints

The frontend calls these endpoints:
| Endpoint | Purpose | Auth Required |
| --- | --- | --- |
| Product Names | Fetch product names for scanning UI | No |
| Create Business | Create org metadata + entity | Yes (JWT) |
| Generate All | Backend orchestrator - runs all onboarding tasks | Yes (JWT) |
All other onboarding tasks are internal services called by generate-all. They are not exposed as HTTP endpoints.
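As a rough sketch of the two authenticated calls in the order the flow describes, the snippet below only builds the requests and does not send them. Several details are assumptions not stated in this document: the host (borrowed from the cron example), the `Bearer` scheme for the JWT, reuse of one payload for both calls, and the payload fields themselves.

```python
import json
from urllib.request import Request

# Assumption: onboarding routes live on the same host as the cron example.
BASE_URL = "https://searchcompany-main.up.railway.app"


def build_onboarding_requests(jwt: str, payload: dict) -> list[Request]:
    """Build the two frontend calls in order: create business, then generate-all."""
    headers = {
        # Assumption: the JWT is sent as a Bearer token.
        "Authorization": f"Bearer {jwt}",
        "Content-Type": "application/json",
    }
    # Assumption: both endpoints accept the same JSON body; the real schemas may differ.
    body = json.dumps(payload).encode("utf-8")
    paths = ["/api/business", "/api/onboarding/generate-all"]
    return [
        Request(f"{BASE_URL}{path}", data=body, headers=headers, method="POST")
        for path in paths
    ]
```

Since generate-all returns immediately, a caller would send the second request only after the first succeeds and would not wait for the background tasks.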

Internal Services

The generate-all orchestrator calls these shared services directly (not via HTTP):
| Service | Purpose | Location |
| --- | --- | --- |
| Scraping | Custom mapper + Firecrawl batch scrape | shared/scraping/, shared/mapping/ |
| AI Website | Deploy AI-optimized site to Vercel | shared/ai_website/ |
| Prompts | Generate visibility prompts with Gemini | shared/prompts/ |
| Products | Discover products via Shopify API | shared/products/ |
| CloudFront | Create CloudFront distribution | shared/cloudfront/ |
| Content Hasher | Store page hashes for change detection | shared/content_hasher/ |
| Favicon | Fetch & store favicon (onboarding-only) | onboarding/generate_all/tasks/favicon.py |
| Scoring | Copy pre-payment ranking score (onboarding-only) | onboarding/generate_all/tasks/scoring.py |
| Competitors | Discover up to 10 competitors using Firecrawl agent | onboarding/services/discover_competitors/ |
Services in shared/ are used by multiple modules (onboarding, cron, domain). Services in onboarding/generate_all/ are only used during onboarding.

What Gets Created

After onboarding completes, the business has:
| Asset | Description | Created By |
| --- | --- | --- |
| Org Metadata | Clerk org details in database | POST /api/business |
| Business Entity | Entity record in entities table | POST /api/business |
| Favicon | Stored favicon URL | Favicon Service |
| AI Site | AI-optimized website at *.searchcompany.dev | AI Website Service |
| Markdown Replica Pages | 1:1 markdown copies of source website pages | AI Website Service |
| Product Entities | Auto-discovered products from Shopify | Products Service |
| 50+ Product Prompts | 5+ prompts per product (min 50 total) | Product Prompts Service |
| Product LLMs Files | /llms/{product-slug}.txt for each product | Product LLMs Service |
| Visibility Score | Initial pre-payment score | Scoring Service |
| CloudFront Distribution | Pre-created proxy for DNS propagation | CloudFront Service |
| Competitors | Up to 10 auto-discovered competitors | Competitors Service |
The business entity is created by POST /api/business before generate-all is called. All other assets are created by the backend orchestrator running in the background.
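To make the Product Entities and Product LLMs Files rows more concrete, here is a small sketch that maps a Shopify `products.json` payload to `/llms/{product-slug}.txt` paths. The sample payload is made up, `llms_paths` is a hypothetical helper, and using the Shopify `handle` as the product slug is an assumption; the real services may derive slugs differently.

```python
import json

# Made-up sample shaped like a Shopify storefront products.json response.
SAMPLE_PRODUCTS_JSON = json.dumps({
    "products": [
        {"id": 1, "title": "Blue Mug", "handle": "blue-mug"},
        {"id": 2, "title": "Red Mug", "handle": "red-mug"},
    ]
})


def llms_paths(products_json: str) -> dict[str, str]:
    """Map each product title to its LLMs file path."""
    products = json.loads(products_json).get("products", [])
    # Assumption: the Shopify handle doubles as the product slug.
    return {p["title"]: f"/llms/{p['handle']}.txt" for p in products}
```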

Prompt Generation Strategy

All prompts are tied to products:
| Metric | Value |
| --- | --- |
| Prompts per product | 5 (default) |
| Minimum total prompts | 50 during onboarding |
| New products (via cron) | 5 prompts each |
| Daily sampling | 10 prompts for visibility scoring |
If a store has fewer than 10 products, prompts per product is increased to ensure at least 50 total.
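The scaling rule can be sketched as a small calculation. This assumes prompts are spread evenly across products and rounded up, which the document does not specify; `prompts_per_product` and the constant names are illustrative.

```python
import math

DEFAULT_PROMPTS_PER_PRODUCT = 5  # default from the table above
MIN_TOTAL_PROMPTS = 50           # onboarding minimum from the table above


def prompts_per_product(product_count: int) -> int:
    """Return how many prompts each product needs to reach the onboarding minimum."""
    if product_count <= 0:
        return 0  # nothing to generate prompts for
    # Assumption: an even split, rounded up so the total never falls below the minimum.
    needed = math.ceil(MIN_TOTAL_PROMPTS / product_count)
    return max(DEFAULT_PROMPTS_PER_PRODUCT, needed)
```

With 10 or more products the default of 5 already reaches 50 in total. Under this assumption a store with 4 products would get 13 prompts each, for 52 in total.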

Regenerating Prompts

If prompts need to be regenerated for a product:
curl -X POST https://searchcompany-main.up.railway.app/api/cron/generate-prompts \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "business_id": "my-business-abc123",
    "url": "https://mystore.com/products/my-product",
    "product_id": "product-entity-uuid",
    "product_name": "My Product"
  }'
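The same request can be built from Python with only the standard library. This sketch mirrors the curl command above field for field; the function name `build_regenerate_request` is illustrative, and the response format is not documented here, so none is assumed.

```python
import json
from urllib.request import Request

GENERATE_PROMPTS_URL = (
    "https://searchcompany-main.up.railway.app/api/cron/generate-prompts"
)


def build_regenerate_request(
    api_key: str,
    business_id: str,
    url: str,
    product_id: str,
    product_name: str,
) -> Request:
    """Build the POST request that regenerates prompts for one product."""
    payload = {
        "business_id": business_id,
        "url": url,
        "product_id": product_id,
        "product_name": product_name,
    }
    return Request(
        GENERATE_PROMPTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=30)`; building the request separately keeps the payload easy to inspect in tests.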

Testing

Run all onboarding tests:
cd Backend
uv run pytest src/pytests/onboarding/test_generate_all.py -v -s