Internal Service: organize_with_llm_from_business_info
Uses an LLM to generate AI website content from business_info (Firecrawl agent output).
Key Change: This service now uses business_info from the Firecrawl agent, NOT scraped pages. Scraped pages are used only for markdown replica generation.
Function Signature
async def organize_with_llm_from_business_info(
    url: str,
    business_info: dict,
    business_id: str
) -> dict
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | str | Source website URL |
| business_info | dict | Business info from Firecrawl agent |
| business_id | str | Business identifier |
business_info Schema
{
  "description": "2-3 sentence description of what the company does",
  "products_services": "Overview of main products or services",
  "target_market": "Who their target customers are",
  "key_features": "Key features, capabilities, or differentiators",
  "value_proposition": "The core value they provide to customers",
  "business_name": "The AI Teddy Bear Company",
  "url": "https://new-supreme-3.myshopify.com"
}
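Before handing a business_info payload to the service, it can be useful to check that the Firecrawl agent produced every field the prompt builders expect. A minimal sketch, assuming only the schema above (the helper name and the check itself are illustrative, not part of the service):

```python
# Hypothetical pre-flight check: verify a business_info dict carries the
# fields the prompt builders expect before invoking the service.
REQUIRED_FIELDS = {
    "description",
    "products_services",
    "target_market",
    "key_features",
    "value_proposition",
    "business_name",
    "url",
}

def validate_business_info(business_info: dict) -> list[str]:
    """Return the sorted list of missing or empty required fields."""
    return sorted(
        field for field in REQUIRED_FIELDS
        if not business_info.get(field)
    )

info = {
    "description": "Sells AI-powered teddy bears.",
    "business_name": "The AI Teddy Bear Company",
    "url": "https://new-supreme-3.myshopify.com",
}
print(validate_business_info(info))
# → ['key_features', 'products_services', 'target_market', 'value_proposition']
```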
Three Parallel Gemini Calls
| Call | Prompt Builder | Output |
|---|---|---|
| 1 | build_llms_txt_prompt_from_business_info() | Markdown llms.txt content |
| 2 | build_index_html_prompt_from_business_info() | JSON with homepage + Q&A pages |
| 3 | build_data_json_prompt_from_business_info() | Schema.org JSON-LD |
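Because all three prompts take the same business_info input, the calls are independent and can be awaited together. A sketch of the fan-out with asyncio.gather, using stub coroutines in place of the real Gemini calls (the function names and return shapes here are illustrative, not the actual API):

```python
import asyncio

# Stub coroutines standing in for the three Gemini calls; the real service
# builds a prompt per call and sends it to the model.
async def generate_llms_txt(business_info: dict) -> str:
    return f"# {business_info['business_name']}\n"

async def generate_index_html(business_info: dict) -> dict:
    return {"html_body": "<div>...</div>", "qa_pages": []}

async def generate_schema_org(business_info: dict) -> dict:
    return {"@context": "https://schema.org", "@type": "Organization"}

async def organize(business_info: dict) -> dict:
    # The three calls share one input and do not depend on each other,
    # so they run concurrently rather than sequentially.
    llms_txt, index, schema = await asyncio.gather(
        generate_llms_txt(business_info),
        generate_index_html(business_info),
        generate_schema_org(business_info),
    )
    return {"llms_txt": llms_txt, **index, "schema_org": schema}

result = asyncio.run(organize({"business_name": "The AI Teddy Bear Company"}))
```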
Returns
{
  "status": "success",
  "organized_data": {
    "business_name": "The AI Teddy Bear Company",
    "meta_title": "The AI Teddy Bear Company - AI Profile",
    "meta_description": "...",
    "llms_txt": "# The AI Teddy Bear Company\n\n...",
    "html_body": "<div>...</div>",
    "qa_pages": [
      {"slug": "what-is-teddy-bear-ai", "title": "...", "content": "..."},
      ...
    ],
    "schema_org": {"@context": "https://schema.org", ...}
  },
  "business_id": "the-ai-teddy-bear-company-1767082986",
  "url": "https://new-supreme-3.myshopify.com",
  "ai_url": "https://the-ai-teddy-bear-company-1767082986.searchcompany.dev"
}
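A sketch of consuming this response on the caller side, assuming Q&A pages are served at ai_url + "/" + slug (that URL layout is an assumption, not confirmed by this document):

```python
# Sketch of a caller consuming the service response. The Q&A page URL
# layout (ai_url + "/" + slug) is an assumption for illustration.
def qa_page_urls(response: dict) -> list[str]:
    """Check the status and derive the public URL of each Q&A page."""
    if response.get("status") != "success":
        raise ValueError(f"organize failed: {response.get('status')}")
    base = response["ai_url"].rstrip("/")
    return [f"{base}/{page['slug']}"
            for page in response["organized_data"]["qa_pages"]]

response = {
    "status": "success",
    "ai_url": "https://the-ai-teddy-bear-company-1767082986.searchcompany.dev",
    "organized_data": {"qa_pages": [{"slug": "what-is-teddy-bear-ai"}]},
}
print(qa_page_urls(response))
# → ['https://the-ai-teddy-bear-company-1767082986.searchcompany.dev/what-is-teddy-bear-ai']
```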
Why business_info Instead of Pages?
| Approach | Pros | Cons |
|---|---|---|
| From business_info | Focused, structured input; fewer tokens; cleaner output | Requires Firecrawl agent call |
| From scraped pages | No extra API call | Noisy input; more tokens; less focused output |
The new approach produces more consistent, higher-quality LLM content because:
- Input is already structured (description, products, target market)
- No HTML/CSS noise to parse
- LLM can focus on content generation, not extraction
Code Location
src/app/shared/ai_website/llm_organize.py
Key Functions
# New onboarding path (uses business_info)
async def organize_with_llm_from_business_info(url, business_info, business_id)
async def generate_ai_content_from_business_info(url, ai_url, business_info, business_id)
# Legacy cron path (uses scraped pages)
async def organize_with_llm(url, scraped, business_id)
async def generate_ai_content(url, ai_url, scraped, business_id)
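A caller choosing between the two paths might dispatch on which input is available. This is a hypothetical wrapper, not a function in llm_organize.py, and the two inner functions are stubbed here so the sketch is self-contained:

```python
import asyncio

# Stubs standing in for the real functions in llm_organize.py.
async def organize_with_llm_from_business_info(url, business_info, business_id):
    return {"status": "success", "path": "business_info"}

async def organize_with_llm(url, scraped, business_id):
    return {"status": "success", "path": "scraped"}

async def organize(url, business_id, business_info=None, scraped=None):
    """Hypothetical dispatcher: use the onboarding path when Firecrawl
    agent output is available, else fall back to the legacy cron path."""
    if business_info is not None:
        return await organize_with_llm_from_business_info(url, business_info, business_id)
    if scraped is not None:
        return await organize_with_llm(url, scraped, business_id)
    raise ValueError("either business_info or scraped must be provided")

result = asyncio.run(organize("https://example.com", "biz-1", business_info={}))
print(result["path"])
# → business_info
```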