🔵 POST - Regenerate Fresh Website
Fully regenerate an AI site from scratch using the new onboarding flow
POST
Performs a complete regeneration of an existing AI site using the new onboarding flow, with the Firecrawl agent handling business info extraction.
When to Use
- Regenerate content with fresh LLM output
- Fix issues with an existing AI site
- Test changes to the generation pipeline
Unlike update-site, which only performs incremental updates when the source website content changes, this endpoint regenerates the whole site.
New Architecture
This endpoint now uses the same flow as onboarding.
Key Change: Business info from the Firecrawl agent is used for LLM content generation (llms.txt, Q&A pages, data.json). Scraped pages are ONLY used for markdown replica generation.
Request Body
- The org_slug / business ID (e.g. “the-ai-teddy-bear-company-1767082986”)
- Source website URL. Optional; the URL stored in the database is used if none is provided.
- Maximum number of pages to scrape. Default: 5000
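As an illustration, a request body could be assembled as in the sketch below. The key names `org_slug`, `url`, and `max_pages` are assumptions inferred from the descriptions above; this page does not show the exact field names, so verify them against the live API before relying on them.

```python
import json
from typing import Optional


def build_regenerate_body(org_slug: str,
                          url: Optional[str] = None,
                          max_pages: Optional[int] = None) -> str:
    """Serialize a request body. All key names here are hypothetical."""
    if not org_slug:
        raise ValueError("org_slug is required")
    body = {"org_slug": org_slug}
    # Optional fields are omitted so the server can apply its own defaults
    # (stored URL from the database, 5000 pages).
    if url is not None:
        body["url"] = url
    if max_pages is not None:
        if max_pages < 1:
            raise ValueError("max_pages must be a positive integer")
        body["max_pages"] = max_pages
    return json.dumps(body)
```

Leaving out the optional fields keeps the request minimal and lets the documented defaults apply.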
Response
- “success” or “error”
- The AI site URL (unchanged from before)
- The business name from database
- Number of pages scraped from source
- Number of files generated
- Number of Q&A pages generated
- Number of markdown replica pages generated
Process Steps
| Step | Action | Details |
|---|---|---|
| 1a | Discover Business Info | Firecrawl agent extracts: description, products_services, target_market, key_features, value_proposition |
| 1b | Scrape Website | Custom mapper + Firecrawl batch scrape for markdown replicas |
| 2 | Hash Pages | Raw HTML hashing for future change detection |
| 3 | LLM Organize | Three parallel Gemini calls using business_info (not pages) |
| 4 | Generate Files | Create all files including markdown replicas from pages |
| 5 | Deploy | Push to existing Vercel project |
| 6 | Assign Domain | Update domain records if needed |
| 7 | Store Hashes | Save page hashes for change detection |
| 8 | IndexNow | Submit all URLs for instant indexing |
Steps 1a and 1b run in parallel for faster execution. The remaining steps run sequentially.
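The ordering described above (1a and 1b concurrently, then everything else in sequence) can be sketched with `asyncio`. This is a minimal illustration, not the service's actual code: the function names are hypothetical, the two coroutines are stand-ins that return placeholder data instead of calling Firecrawl, and SHA-256 is an assumed choice of hash for step 2.

```python
import asyncio
import hashlib


async def discover_business_info(url: str) -> dict:
    # Stand-in for the Firecrawl agent call (step 1a).
    await asyncio.sleep(0)
    return {"description": f"placeholder description for {url}"}


async def scrape_website(url: str, max_pages: int) -> dict:
    # Stand-in for the mapper + batch scrape (step 1b): maps URL -> raw HTML.
    await asyncio.sleep(0)
    return {url: "<html><body>placeholder</body></html>"}


def hash_pages(pages: dict) -> dict:
    # Step 2: hash the raw HTML of each page (SHA-256 is an assumption).
    return {
        page_url: hashlib.sha256(html.encode("utf-8")).hexdigest()
        for page_url, html in pages.items()
    }


async def regenerate(url: str, max_pages: int = 5000) -> dict:
    # Steps 1a and 1b run concurrently; gather returns results in call order.
    business_info, pages = await asyncio.gather(
        discover_business_info(url),
        scrape_website(url, max_pages),
    )
    hashes = hash_pages(pages)
    # Steps 3 to 8 (organize, generate, deploy, assign domain, store hashes,
    # IndexNow) would follow here, one after another.
    return {
        "business_info": business_info,
        "pages_scraped": len(pages),
        "hashes": hashes,
    }
```

Calling `asyncio.run(regenerate("https://example.com"))` runs the sketch end to end with the placeholder data.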
Content Generation Sources
| Content | Source | Why |
|---|---|---|
| llms.txt | Business Info (Firecrawl) | Focused, structured business context |
| Q&A Pages | Business Info (Firecrawl) | Clean Q&A from business understanding |
| data.json | Business Info (Firecrawl) | Accurate Schema.org from business context |
| Markdown Replicas | Scraped Pages | 1:1 copy of original website content |
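The source table above amounts to a small routing rule: three content types read from business info, and only markdown replicas read from scraped pages. A hedged sketch of that rule follows; the dictionary keys and the `select_input` helper are hypothetical names chosen for illustration.

```python
from enum import Enum


class Source(Enum):
    BUSINESS_INFO = "business_info"
    SCRAPED_PAGES = "scraped_pages"


# Mirrors the table: which input each kind of generated content uses.
CONTENT_SOURCES = {
    "llms.txt": Source.BUSINESS_INFO,
    "qa_pages": Source.BUSINESS_INFO,
    "data.json": Source.BUSINESS_INFO,
    "markdown_replicas": Source.SCRAPED_PAGES,
}


def select_input(content_type: str, business_info: dict, pages: dict) -> dict:
    """Return the input a generator should receive for this content type."""
    source = CONTENT_SOURCES[content_type]  # KeyError for unknown types
    return business_info if source is Source.BUSINESS_INFO else pages
```

Keeping the mapping in one place makes it hard for an LLM-generated file to pick up scraped page content by accident.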
Files Generated
LLM-Generated Content (from business_info)
| File | Source | Purpose |
|---|---|---|
| public/llms.txt | Gemini + business_info | Primary AI-readable content |
| pages/index.js | Gemini + business_info | Homepage with Q&A structure |
| public/data.json | Gemini + business_info | Schema.org structured data |
Q&A Pages (from business_info)
| File Pattern | Purpose |
|---|---|
| pages/{slug}.js | AI-generated Q&A pages (e.g., /what-is-teddy-bear-ai) |
Markdown Replica Pages (from scraped pages)
| File Pattern | Purpose |
|---|---|
| pages/{path}.js | Next.js page for each scraped URL |
| public/markdown/{path}.md | Markdown content for each scraped URL |
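To make the two file patterns concrete, the sketch below derives both output paths from a scraped URL. It is an assumption-laden illustration: the `replica_paths` name is hypothetical, and the handling of the site root (mapped here to a placeholder `home` slug so it does not collide with the generated pages/index.js) is a guess, since this page does not say how the root URL is treated.

```python
from typing import Tuple
from urllib.parse import urlparse


def replica_paths(url: str) -> Tuple[str, str]:
    """Map a scraped URL to (Next.js page path, markdown file path)."""
    # Drop leading and trailing slashes so "/about/team/" becomes "about/team".
    path = urlparse(url).path.strip("/")
    if not path:
        # Assumption: the root URL needs some slug other than "index",
        # because pages/index.js is the generated Q&A homepage.
        path = "home"
    return f"pages/{path}.js", f"public/markdown/{path}.md"
```

For example, `replica_paths("https://example.com/about/team/")` yields `pages/about/team.js` and `public/markdown/about/team.md`.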
Static Templates
| File | Purpose |
|---|---|
| public/robots.txt | Crawler permissions (allows all bots) |
| public/sitemap.xml | Site structure for search engines |
| middleware.js | Edge middleware for tracking AI bot visits |
| package.json | Next.js dependencies |
| next.config.js | Next.js configuration |
| public/search-company.txt | IndexNow key verification file |
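The key verification file ties into the IndexNow submission in step 8. Under the public IndexNow protocol, a bulk submission is a JSON body with `host`, `key`, `keyLocation`, and `urlList`. The sketch below only builds that body; the `build_indexnow_payload` name is hypothetical, and it assumes the key file is served from the site root at /search-company.txt, as Next.js does for files under public/.

```python
from typing import Dict, List, Union
from urllib.parse import urlparse


def build_indexnow_payload(site_url: str, key: str,
                           paths: List[str]) -> Dict[str, Union[str, List[str]]]:
    """Build an IndexNow bulk-submission body for the deployed AI site."""
    host = urlparse(site_url).netloc
    base = f"https://{host}"
    return {
        "host": host,
        "key": key,
        # Assumption: public/search-company.txt is reachable at the site root.
        "keyLocation": f"{base}/search-company.txt",
        "urlList": [f"{base}/{p.lstrip('/')}" for p in paths],
    }
```

The resulting dictionary would be sent as JSON in a POST to an IndexNow endpoint; how this service actually submits it is not described on this page.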