
POST /api/cron/regenerate-fresh-website

Performs a FULL rebuild of an existing AI site using the new onboarding flow, in which a Firecrawl agent extracts the business info.

Purpose

Use this endpoint when you want to:
  • Regenerate content with fresh LLM output
  • Fix issues with an existing AI site
  • Test changes to the generation pipeline

This differs from update-site, which only performs incremental updates when the source website's content changes.

Architecture

Key Change: Business info from the Firecrawl agent is used for LLM content generation. Scraped pages are ONLY used for markdown replicas.

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| business_id | string | Yes | Org slug (e.g., "the-ai-teddy-bear-company-1767082986") |
| url | string | No | Source URL (fetched from DB if not provided) |
| max_pages | integer | No | Max pages to scrape (default: 5000) |
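
The request body above could be modelled on the client side roughly as follows. This is a minimal sketch, not the service's own schema: the class name `RegenerateRequest`, the `to_payload` helper, and the rule that `max_pages` must be positive are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegenerateRequest:
    """Hypothetical client-side model mirroring the request-body table."""

    business_id: str            # required: org slug
    url: Optional[str] = None   # optional: the server reads it from the DB when omitted
    max_pages: int = 5000       # optional: documented default

    def __post_init__(self) -> None:
        if not self.business_id:
            raise ValueError("business_id is required")
        # Assumption: a non-positive page limit is never meaningful.
        if self.max_pages < 1:
            raise ValueError("max_pages must be a positive integer")

    def to_payload(self) -> dict:
        """Build the JSON body, leaving out url when it was not supplied."""
        payload = {"business_id": self.business_id, "max_pages": self.max_pages}
        if self.url is not None:
            payload["url"] = self.url
        return payload
```

Leaving `url` out of the payload, instead of sending `null`, keeps the request in line with the "fetched from DB if not provided" behaviour described in the table.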

Pipeline Steps

| Step | Service | Input | Output |
|---|---|---|---|
| 1a | discover_business_info | url, business_name | business_info dict |
| 1b | scrape_website | url | pages[] |
| 2 | fetch_and_hash_batch | page_urls | page_hashes |
| 3 | organize_with_llm_from_business_info | business_info | organized_data |
| 4 | generate_ai_site | organized_data, pages (for replicas) | files[] |
| 5 | deploy_to_vercel | files | deployment_url |
| 6 | assign_domain | deployment | ai_site_url |
| 7 | store_page_hashes | page_hashes | - |
| 8 | submit_urls_to_indexnow | urls | - |

Steps 1a and 1b run in parallel using asyncio.gather().
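
The parallel start of the pipeline might look like the sketch below. The two coroutines are stubs that only borrow the step names from the table; their real signatures and return shapes are not shown on this page, so the parameters and the returned dictionaries here are assumptions.

```python
import asyncio


async def discover_business_info(url: str, business_name: str) -> dict:
    # Stub for step 1a; the real service calls the Firecrawl agent.
    await asyncio.sleep(0)
    return {"name": business_name, "url": url}


async def scrape_website(url: str) -> list:
    # Stub for step 1b; the real service crawls up to max_pages pages.
    await asyncio.sleep(0)
    return [{"url": url, "markdown": "# Home"}]


async def run_step_one(url: str, business_name: str) -> tuple:
    # gather() schedules both coroutines concurrently and returns their
    # results in the order the awaitables were passed in.
    business_info, pages = await asyncio.gather(
        discover_business_info(url, business_name),
        scrape_website(url),
    )
    return business_info, pages
```

Because `asyncio.gather()` preserves argument order, the tuple can be unpacked directly into `business_info` and `pages` regardless of which step finishes first.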

Content Sources

| Content Type | Source | Why |
|---|---|---|
| llms.txt | business_info (1a) | Focused, structured business context |
| Q&A Pages | business_info (1a) | Clean Q&A from business understanding |
| data.json | business_info (1a) | Accurate Schema.org from business context |
| Markdown Replicas | pages (1b) | 1:1 copy of original website content |
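
The split between the two inputs can be pictured as a small lookup. This is an illustrative sketch only: `CONTENT_SOURCES`, its key spellings, and `select_source` are hypothetical names, not part of the codebase.

```python
from typing import Any

# Mapping copied from the Content Sources table; key spellings are illustrative.
CONTENT_SOURCES = {
    "llms.txt": "business_info",
    "qa_pages": "business_info",
    "data.json": "business_info",
    "markdown_replicas": "pages",
}


def select_source(content_type: str, business_info: dict, pages: list) -> Any:
    """Return the input that a generator for content_type should read from."""
    source = CONTENT_SOURCES[content_type]  # raises KeyError for unknown types
    return business_info if source == "business_info" else pages
```

Only the markdown replicas read from the scraped pages; every LLM-generated artifact reads from the business info produced in step 1a.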

Response Fields

| Field | Type | Description |
|---|---|---|
| status | string | "success" or error |
| ai_site_url | string | Deployed AI site URL |
| source_url | string | Source website URL |
| business_id | string | Business identifier |
| business_name | string | Business name |
| pages_scraped | integer | Number of pages scraped |
| files_generated | integer | Number of files generated |
| pages_hashed | integer | Number of page hashes stored |
| qa_pages | integer | Number of Q&A pages generated |
| replica_pages | integer | Number of markdown replica pages |
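
A caller might check and summarise the response along these lines. The helper `summarize_response` is hypothetical, and treating any status other than "success" as a failure is an assumption, since the exact shape of an error response is not described here. The values in `SAMPLE` are made up purely for illustration.

```python
def summarize_response(body: dict) -> str:
    """Check the status field and return a one-line summary of the run."""
    # Assumption: anything other than "success" signals a failed run.
    if body.get("status") != "success":
        raise RuntimeError(f"regeneration failed: {body.get('status')!r}")
    return (
        f"{body['business_name']}: {body['files_generated']} files "
        f"({body['qa_pages']} Q&A, {body['replica_pages']} replicas) "
        f"from {body['pages_scraped']} scraped pages -> {body['ai_site_url']}"
    )


# Made-up values, not output from a real run.
SAMPLE = {
    "status": "success",
    "ai_site_url": "https://ai.example.com",
    "source_url": "https://example.com",
    "business_id": "example-1",
    "business_name": "Example Co",
    "pages_scraped": 12,
    "files_generated": 30,
    "pages_hashed": 12,
    "qa_pages": 10,
    "replica_pages": 12,
}
```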

Example Request

curl -X POST http://localhost:8000/api/cron/regenerate-fresh-website \
  -H "Content-Type: application/json" \
  -H "X-API-Key: search-company" \
  -d '{
    "business_id": "the-ai-teddy-bear-company-1767082986",
    "url": "https://new-supreme-3.myshopify.com"
  }'
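
The same request can be built from Python with only the standard library. This is a sketch under stated assumptions: `build_regenerate_request` is a hypothetical helper, and the base URL and API key are the placeholder values from the curl example above.

```python
import json
import urllib.request
from typing import Optional


def build_regenerate_request(
    base_url: str,
    api_key: str,
    business_id: str,
    url: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"business_id": business_id}
    if url is not None:
        payload["url"] = url  # omitted when the server should read it from the DB
    return urllib.request.Request(
        f"{base_url}/api/cron/regenerate-fresh-website",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen()` sends the request. A full rebuild runs scraping, generation, and deployment, so a generous `timeout` argument is probably sensible.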

Code Location

src/app/apis/cron/regenerate_fresh_website/routes.py

Key Imports

from src.app.shared.discover_business_info import discover_business_info
from src.app.shared.ai_website import (
    organize_with_llm_from_business_info,
    generate_ai_site,
    deploy_to_vercel,
    assign_domain,
)