POST /api/cron/regenerate-fresh-website
Performs a FULL rebuild of an existing AI site using the new onboarding flow, in which the Firecrawl agent extracts the business info.
Purpose
Use this endpoint when you want to:
- Regenerate content with fresh LLM output
- Fix issues with an existing AI site
- Test changes to the generation pipeline
This differs from update-site, which only performs incremental updates when the source website's content changes.
Architecture
Key Change: Business info from the Firecrawl agent is used for LLM content generation. Scraped pages are used ONLY for markdown replicas.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| business_id | string | Yes | Org slug (e.g., "the-ai-teddy-bear-company-1767082986") |
| url | string | No | Source URL (fetched from DB if not provided) |
| max_pages | integer | No | Max pages to scrape (default: 5000) |
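
The request body rules above can be sketched as a small validator. This is only an illustration: `RegenerateRequest` and `parse_request` are hypothetical names that do not come from the codebase, and the requirement that `max_pages` be positive is an assumption. Only the field names, the required flag, and the 5000 default are taken from the table.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class RegenerateRequest:
    """Hypothetical container mirroring the request body table."""
    business_id: str               # required org slug
    url: Optional[str] = None      # optional; the endpoint falls back to the DB value
    max_pages: int = 5000          # documented default


def parse_request(body: Mapping[str, Any]) -> RegenerateRequest:
    """Validate a decoded JSON body and apply the documented defaults."""
    business_id = body.get("business_id")
    if not isinstance(business_id, str) or not business_id:
        raise ValueError("business_id is required and must be a non-empty string")

    url = body.get("url")
    if url is not None and not isinstance(url, str):
        raise ValueError("url must be a string when provided")

    max_pages = body.get("max_pages", 5000)
    # bool is a subclass of int, so reject it explicitly.
    # Assumption: the endpoint expects a positive page limit.
    if isinstance(max_pages, bool) or not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("max_pages must be a positive integer")

    return RegenerateRequest(business_id=business_id, url=url, max_pages=max_pages)
```

Calling `parse_request({"business_id": "the-ai-teddy-bear-company-1767082986"})` yields a request with `url=None` and `max_pages=5000`, matching the "fetched from DB if not provided" and default behaviour described in the table.
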
Pipeline Steps
| Step | Service | Input | Output |
|---|---|---|---|
| 1a | discover_business_info | url, business_name | business_info dict |
| 1b | scrape_website | url | pages[] |
| 2 | fetch_and_hash_batch | page_urls | page_hashes |
| 3 | organize_with_llm_from_business_info | business_info | organized_data |
| 4 | generate_ai_site | organized_data, pages (for replicas) | files[] |
| 5 | deploy_to_vercel | files | deployment_url |
| 6 | assign_domain | deployment | ai_site_url |
| 7 | store_page_hashes | page_hashes | - |
| 8 | submit_urls_to_indexnow | urls | - |
Steps 1a and 1b run in parallel using asyncio.gather().
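
A minimal sketch of how steps 1a and 1b could be awaited together with `asyncio.gather()`. The two `*_stub` coroutines are placeholders invented for this example; the real `discover_business_info` and `scrape_website` signatures and return shapes are not shown in this document and may differ.

```python
import asyncio
from typing import Any, Dict, List, Tuple


async def discover_business_info_stub(url: str, business_name: str) -> Dict[str, Any]:
    """Placeholder for step 1a (the real service calls the Firecrawl agent)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"name": business_name, "source_url": url}


async def scrape_website_stub(url: str) -> List[Dict[str, str]]:
    """Placeholder for step 1b (the real service scrapes the source site)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [{"url": url, "markdown": "# Home"}]


async def run_step_one(
    url: str, business_name: str
) -> Tuple[Dict[str, Any], List[Dict[str, str]]]:
    # gather() runs both coroutines concurrently and returns their results
    # in the order the awaitables were passed, not in completion order.
    business_info, pages = await asyncio.gather(
        discover_business_info_stub(url, business_name),
        scrape_website_stub(url),
    )
    return business_info, pages
```

With the default `return_exceptions=False`, the first exception raised by either coroutine propagates to the caller. Whether the actual route tolerates a failed scrape (for example, by skipping the replicas) is not stated here.
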
Content Sources
| Content Type | Source | Why |
|---|---|---|
| llms.txt | business_info (1a) | Focused, structured business context |
| Q&A Pages | business_info (1a) | Clean Q&A from business understanding |
| data.json | business_info (1a) | Accurate Schema.org from business context |
| Markdown Replicas | pages (1b) | 1:1 copy of original website content |
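
The table above amounts to a routing rule: every generated artifact reads from `business_info`, except the markdown replicas, which read from the scraped `pages`. A small sketch of that rule follows; `CONTENT_SOURCES`, its keys, and `select_source` are hypothetical names for illustration, not part of `generate_ai_site`.

```python
from typing import Any, Dict, List, Union

# Mapping taken from the Content Sources table: output type -> pipeline input.
CONTENT_SOURCES: Dict[str, str] = {
    "llms.txt": "business_info",
    "qa_pages": "business_info",
    "data.json": "business_info",
    "markdown_replicas": "pages",
}


def select_source(
    content_type: str,
    business_info: Dict[str, Any],
    pages: List[Dict[str, Any]],
) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
    """Return the input that a generator for `content_type` should read from."""
    source = CONTENT_SOURCES[content_type]  # raises KeyError for unknown types
    return business_info if source == "business_info" else pages
```
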
Response Fields
| Field | Type | Description |
|---|---|---|
| status | string | "success" or error |
| ai_site_url | string | Deployed AI site URL |
| source_url | string | Source website URL |
| business_id | string | Business identifier |
| business_name | string | Business name |
| pages_scraped | integer | Number of pages scraped |
| files_generated | integer | Number of files generated |
| pages_hashed | integer | Number of page hashes stored |
| qa_pages | integer | Number of Q&A pages generated |
| replica_pages | integer | Number of markdown replica pages |
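
A caller might check the response along these lines. This sketch assumes the fields above arrive as a flat JSON object and that any `status` other than `"success"` signals failure. The helper name `summarize_response` is hypothetical.

```python
from typing import Any, Mapping


def summarize_response(payload: Mapping[str, Any]) -> str:
    """Return a one-line summary, or raise if the rebuild did not succeed."""
    status = payload.get("status")
    if status != "success":
        raise RuntimeError(f"regeneration failed with status: {status!r}")
    return (
        f"{payload['business_name']} -> {payload['ai_site_url']} "
        f"({payload['pages_scraped']} pages scraped, "
        f"{payload['files_generated']} files, "
        f"{payload['qa_pages']} Q&A, {payload['replica_pages']} replicas)"
    )
```
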
Example Request
curl -X POST http://localhost:8000/api/cron/regenerate-fresh-website \
  -H "Content-Type: application/json" \
  -H "X-API-Key: search-company" \
  -d '{
    "business_id": "the-ai-teddy-bear-company-1767082986",
    "url": "https://new-supreme-3.myshopify.com"
  }'
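
For callers scripting this from Python, the same request can be assembled with the standard library. The sketch below only builds the request object and does not send it. `build_regenerate_request` is a hypothetical helper, and the API key shown in the curl example is assumed to be a local development placeholder.

```python
import json
import urllib.request
from typing import Optional


def build_regenerate_request(
    base_url: str,
    api_key: str,
    business_id: str,
    url: Optional[str] = None,
    max_pages: Optional[int] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {"business_id": business_id}
    # Optional fields are omitted so the server applies its own defaults.
    if url is not None:
        body["url"] = url
    if max_pages is not None:
        body["max_pages"] = max_pages

    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/cron/regenerate-fresh-website",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen(req, timeout=...)`. Because the endpoint performs a full rebuild and deployment, a generous timeout is probably needed, though no expected duration is documented here.
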
Code Location
src/app/apis/cron/regenerate_fresh_website/routes.py
Key Imports
from src.app.shared.discover_business_info import discover_business_info
from src.app.shared.ai_website import (
    organize_with_llm_from_business_info,
    generate_ai_site,
    deploy_to_vercel,
    assign_domain,
)