Internal Service: organize_with_llm_from_business_info

Uses an LLM to generate AI website content from business_info (the Firecrawl agent's output).
Key Change: this service now uses business_info from the Firecrawl agent, NOT scraped pages. Scraped pages are used only for markdown replica generation.

Function Signature

async def organize_with_llm_from_business_info(
    url: str,
    business_info: dict,
    business_id: str
) -> dict

Parameters

| Parameter | Type | Description |
|---|---|---|
| url | str | Source website URL |
| business_info | dict | Business info from the Firecrawl agent |
| business_id | str | Business identifier |

business_info Schema

{
  "description": "2-3 sentence description of what the company does",
  "products_services": "Overview of main products or services",
  "target_market": "Who their target customers are",
  "key_features": "Key features, capabilities, or differentiators",
  "value_proposition": "The core value they provide to customers",
  "business_name": "The AI Teddy Bear Company",
  "url": "https://new-supreme-3.myshopify.com"
}
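
Before handing business_info to the LLM calls, it can be useful to confirm the Firecrawl agent returned every field the prompts expect. The helper below is a hypothetical sketch (not part of llm_organize.py); the key set matches the schema shown above.

```python
# Hypothetical validation helper; the required keys mirror the
# business_info schema documented above.
REQUIRED_KEYS = {
    "description", "products_services", "target_market",
    "key_features", "value_proposition", "business_name", "url",
}

def missing_business_info_keys(business_info: dict) -> list[str]:
    """Return the required keys absent from business_info, sorted."""
    return sorted(REQUIRED_KEYS - business_info.keys())

example = {
    "description": "Sells AI-powered teddy bears.",
    "products_services": "Smart plush toys",
    "target_market": "Parents of young children",
    "key_features": "Voice interaction",
    "value_proposition": "A companion that grows with the child",
    "business_name": "The AI Teddy Bear Company",
    "url": "https://new-supreme-3.myshopify.com",
}
print(missing_business_info_keys(example))  # []
```

Rejecting incomplete input early keeps malformed agent output from reaching the three Gemini calls.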

Three Parallel Gemini Calls

| Call | Prompt Builder | Output |
|---|---|---|
| 1 | build_llms_txt_prompt_from_business_info() | Markdown llms.txt content |
| 2 | build_index_html_prompt_from_business_info() | JSON with homepage + Q&A pages |
| 3 | build_data_json_prompt_from_business_info() | Schema.org JSON-LD |
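
The fan-out pattern behind the table can be sketched with asyncio.gather, which runs the three coroutines concurrently and returns results in call order. The generator functions below are stubs standing in for the real Gemini calls; the actual prompt builders and model client are internal to llm_organize.py.

```python
import asyncio

# Stubs standing in for the three real Gemini calls (assumptions,
# not the actual implementation in llm_organize.py).
async def generate_llms_txt(business_info: dict) -> str:
    return f"# {business_info['business_name']}\n"

async def generate_index_html(business_info: dict) -> dict:
    return {"html_body": "<div>...</div>", "qa_pages": []}

async def generate_schema_org(business_info: dict) -> dict:
    return {"@context": "https://schema.org", "@type": "Organization"}

async def run_parallel(business_info: dict) -> list:
    # gather() starts all three coroutines concurrently and preserves
    # the call order in the returned list.
    return await asyncio.gather(
        generate_llms_txt(business_info),
        generate_index_html(business_info),
        generate_schema_org(business_info),
    )

llms_txt, index_html, schema_org = asyncio.run(
    run_parallel({"business_name": "The AI Teddy Bear Company"})
)
```

Because the three outputs are independent, running them in parallel cuts wall-clock latency to roughly the slowest single call.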

Returns

{
  "status": "success",
  "organized_data": {
    "business_name": "The AI Teddy Bear Company",
    "meta_title": "The AI Teddy Bear Company - AI Profile",
    "meta_description": "...",
    "llms_txt": "# The AI Teddy Bear Company\n\n...",
    "html_body": "<div>...</div>",
    "qa_pages": [
      {"slug": "what-is-teddy-bear-ai", "title": "...", "content": "..."},
      ...
    ],
    "schema_org": {"@context": "https://schema.org", ...}
  },
  "business_id": "the-ai-teddy-bear-company-1767082986",
  "url": "https://new-supreme-3.myshopify.com",
  "ai_url": "https://the-ai-teddy-bear-company-1767082986.searchcompany.dev"
}
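
A caller consuming this return value typically checks status before touching organized_data. The helper below is a hypothetical sketch of that pattern, pulling out the Q&A page slugs from a successful result.

```python
# Hypothetical consumer helper; field names follow the documented
# return shape above.
def extract_qa_slugs(result: dict) -> list[str]:
    """Collect the slugs of all Q&A pages from a successful result."""
    if result.get("status") != "success":
        raise ValueError(f"organize failed: {result.get('status')!r}")
    return [p["slug"] for p in result["organized_data"].get("qa_pages", [])]

sample = {
    "status": "success",
    "organized_data": {
        "qa_pages": [
            {"slug": "what-is-teddy-bear-ai", "title": "...", "content": "..."}
        ]
    },
}
print(extract_qa_slugs(sample))  # ['what-is-teddy-bear-ai']
```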

Why business_info Instead of Pages?

| Approach | Pros | Cons |
|---|---|---|
| From business_info | Focused, structured input; fewer tokens; cleaner output | Requires a Firecrawl agent call |
| From scraped pages | No extra API call | Noisy input; more tokens; less focused output |

The new approach produces more consistent, higher-quality LLM content because:
  • Input is already structured (description, products, target market)
  • No HTML/CSS noise to parse
  • LLM can focus on content generation, not extraction

Code Location

src/app/shared/ai_website/llm_organize.py

Key Functions

# New onboarding path (uses business_info)
async def organize_with_llm_from_business_info(url, business_info, business_id)
async def generate_ai_content_from_business_info(url, ai_url, business_info, business_id)

# Legacy cron path (uses scraped pages)
async def organize_with_llm(url, scraped, business_id)
async def generate_ai_content(url, ai_url, scraped, business_id)
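
Since both paths coexist, a caller has to pick one based on what input it holds. The dispatcher below is a hypothetical sketch of that choice (not a function in llm_organize.py); the two organize functions are replaced with stubs so the example is self-contained.

```python
import asyncio

# Stubs standing in for the real functions in llm_organize.py.
async def organize_with_llm_from_business_info(url, business_info, business_id):
    return {"status": "success", "source": "business_info"}

async def organize_with_llm(url, scraped, business_id):
    return {"status": "success", "source": "scraped"}

async def generate_site(url, business_id, business_info=None, scraped=None):
    # Prefer the structured business_info path (new onboarding);
    # fall back to the legacy scraped-pages path (cron).
    if business_info is not None:
        return await organize_with_llm_from_business_info(
            url, business_info, business_id
        )
    return await organize_with_llm(url, scraped, business_id)

result = asyncio.run(
    generate_site("https://example.com", "biz-1",
                  business_info={"business_name": "X"})
)
print(result["source"])  # business_info
```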