POST /api/cron/regenerate-fresh-website

Performs a full rebuild of an existing AI site using the new onboarding flow, in which the Firecrawl agent extracts the business info.

Purpose

Use this endpoint when you want to:

Regenerate content with fresh LLM output
Fix issues with an existing AI site
Test changes to the generation pipeline

This differs from update-site, which only performs incremental updates when the source website's content changes.

Architecture

Key Change: Business info from the Firecrawl agent is the input for LLM content generation. Scraped pages are used only for the markdown replicas.

Request Body

Field	Type	Required	Description
`business_id`	string	Yes	Org slug (e.g., "the-ai-teddy-bear-company-1767082986")
`url`	string	No	Source URL (fetched from DB if not provided)
`max_pages`	integer	No	Max pages to scrape (default: 5000)
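
The request body above can be modeled on the client side as a small dataclass. This is a minimal sketch, not the server's own schema: `RegenerateRequest` and `to_payload` are hypothetical names, and the positive-integer check on `max_pages` is an assumption. Optional fields are omitted when unset so that the server applies its documented fallbacks (URL from the DB, `max_pages` of 5000).

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class RegenerateRequest:
    """Hypothetical client-side model of the documented request body."""

    business_id: str                  # required org slug
    url: Optional[str] = None         # optional; the server reads it from the DB if absent
    max_pages: Optional[int] = None   # optional; the documented server default is 5000

    def to_payload(self) -> dict:
        if not self.business_id:
            raise ValueError("business_id is required")
        # Assumption: a page limit below 1 is never meaningful.
        if self.max_pages is not None and self.max_pages < 1:
            raise ValueError("max_pages must be a positive integer")
        # Drop unset optional fields so the server-side defaults apply.
        return {key: value for key, value in asdict(self).items() if value is not None}
```

Calling `RegenerateRequest(business_id="the-ai-teddy-bear-company-1767082986").to_payload()` yields a body containing only `business_id`.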

Pipeline Steps

Step	Service	Input	Output
1a	`discover_business_info`	url, business_name	business_info dict
1b	`scrape_website`	url	pages[]
2	`fetch_and_hash_batch`	page_urls	page_hashes
3	`organize_with_llm_from_business_info`	business_info	organized_data
4	`generate_ai_site`	organized_data, pages (for replicas)	files[]
5	`deploy_to_vercel`	files	deployment_url
6	`assign_domain`	deployment	ai_site_url
7	`store_page_hashes`	page_hashes	-
8	`submit_urls_to_indexnow`	urls	-

Steps 1a and 1b run in parallel using asyncio.gather().
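
To illustrate how the two step-1 services can run concurrently, here is a minimal sketch using `asyncio.gather()`. The function bodies are stand-ins written for this example (the real services call Firecrawl and the scraper), and `run_step_one` is a hypothetical wrapper name; only the service names and their inputs come from the table above.

```python
import asyncio


async def discover_business_info(url: str, business_name: str) -> dict:
    # Stand-in for step 1a; the real service queries the Firecrawl agent.
    await asyncio.sleep(0.01)
    return {"name": business_name, "source_url": url}


async def scrape_website(url: str) -> list:
    # Stand-in for step 1b; the real service returns the scraped pages.
    await asyncio.sleep(0.01)
    return [{"url": url, "markdown": "# Home"}]


async def run_step_one(url: str, business_name: str):
    # gather() runs both coroutines concurrently and returns their
    # results in the same order as the arguments.
    business_info, pages = await asyncio.gather(
        discover_business_info(url, business_name),
        scrape_website(url),
    )
    return business_info, pages


if __name__ == "__main__":
    info, pages = asyncio.run(run_step_one("https://example.com", "Example Co"))
    print(info["name"], len(pages))
```

Because neither step depends on the other's output, the combined wait is roughly that of the slower call rather than the sum of both.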

Content Sources

Content Type	Source	Why
llms.txt	business_info (1a)	Focused, structured business context
Q&A Pages	business_info (1a)	Clean Q&A from business understanding
data.json	business_info (1a)	Accurate Schema.org from business context
Markdown Replicas	pages (1b)	1:1 copy of original website content
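
The routing in the table above can be expressed as a simple lookup. This is an illustrative sketch only: `CONTENT_SOURCES`, `select_input`, and the key spellings are hypothetical and are not taken from the actual generator code.

```python
# Which step-1 output feeds each generated content type (per the table above).
CONTENT_SOURCES = {
    "llms.txt": "business_info",
    "qa_pages": "business_info",
    "data.json": "business_info",
    "markdown_replicas": "pages",
}


def select_input(content_type: str, business_info: dict, pages: list):
    """Return the step-1 output that the given content type is built from."""
    source = CONTENT_SOURCES[content_type]  # raises KeyError for unknown types
    return business_info if source == "business_info" else pages
```

Keeping the mapping in one place makes the rule explicit: only the markdown replicas ever read the scraped pages.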

Response Fields

Field	Type	Description
`status`	string	"success" or error
`ai_site_url`	string	Deployed AI site URL
`source_url`	string	Source website URL
`business_id`	string	Business identifier
`business_name`	string	Business name
`pages_scraped`	integer	Number of pages scraped
`files_generated`	integer	Number of files generated
`pages_hashed`	integer	Number of page hashes stored
`qa_pages`	integer	Number of Q&A pages generated
`replica_pages`	integer	Number of markdown replica pages
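
A client might parse a successful response into a typed object as sketched below. `RegenerateResponse` and `parse_response` are hypothetical names, and the sketch assumes the body is a flat JSON object containing every field listed above; the shape of an error response is not documented here, so it is not handled.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class RegenerateResponse:
    status: str
    ai_site_url: str
    source_url: str
    business_id: str
    business_name: str
    pages_scraped: int
    files_generated: int
    pages_hashed: int
    qa_pages: int
    replica_pages: int


def parse_response(body: str) -> RegenerateResponse:
    data = json.loads(body)
    known = {f.name for f in fields(RegenerateResponse)}
    # Ignore unknown keys so that extra server fields do not break the client.
    # A missing documented field raises TypeError.
    return RegenerateResponse(**{k: v for k, v in data.items() if k in known})
```

A quick sanity check on the counts, such as comparing `qa_pages + replica_pages` against `files_generated`, can help spot a partial generation, though the exact relationship between those numbers is not specified in this document.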

Example Request

curl -X POST http://localhost:8000/api/cron/regenerate-fresh-website \
  -H "Content-Type: application/json" \
  -H "X-API-Key: search-company" \
  -d '{
    "business_id": "the-ai-teddy-bear-company-1767082986",
    "url": "https://new-supreme-3.myshopify.com"
  }'
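
The same call can be made from Python with only the standard library. This is a sketch mirroring the curl example above: `build_request` is a hypothetical helper, and the base URL and API key are whatever your environment uses.

```python
import json
import urllib.request
from typing import Optional


def build_request(
    base_url: str,
    api_key: str,
    business_id: str,
    url: Optional[str] = None,
) -> urllib.request.Request:
    payload = {"business_id": business_id}
    if url is not None:
        payload["url"] = url  # omit to let the server read the URL from the DB
    return urllib.request.Request(
        f"{base_url}/api/cron/regenerate-fresh-website",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


# To send it (requires a running server):
#   with urllib.request.urlopen(build_request(...)) as resp:
#       result = json.load(resp)
```

Since this endpoint performs a full rebuild, the call may take a while; passing a generous `timeout` to `urlopen` is a reasonable precaution.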

Code Location

src/app/apis/cron/regenerate_fresh_website/routes.py

Key Imports

from src.app.shared.discover_business_info import discover_business_info
from src.app.shared.ai_website import (
    organize_with_llm_from_business_info,
    generate_ai_site,
    deploy_to_vercel,
    assign_domain,
)
