Internal Service: organize_with_llm_from_business_info
Uses an LLM to generate AI website content from business_info (Firecrawl agent output).
Key Change: This service now uses business_info from the Firecrawl agent, NOT scraped pages. Scraped pages are used only for markdown replica generation.
Function Signature
async def organize_with_llm_from_business_info(
    url: str,
    business_info: dict,
    business_id: str
) -> dict
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | str | Source website URL |
| business_info | dict | Business info from Firecrawl agent |
| business_id | str | Business identifier |
business_info Schema
{
  "description": "2-3 sentence description of what the company does",
  "products_services": "Overview of main products or services",
  "target_market": "Who their target customers are",
  "key_features": "Key features, capabilities, or differentiators",
  "value_proposition": "The core value they provide to customers",
  "business_name": "The AI Teddy Bear Company",
  "url": "https://new-supreme-3.myshopify.com"
}
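Before handing a business_info payload to the service, it can be useful to check that the Firecrawl agent produced every field the prompt builders expect. A minimal sketch, assuming only the schema above (the helper name and the check itself are illustrative, not part of the service):

```python
# Hypothetical pre-flight check: verify a business_info dict carries the
# fields the prompt builders expect before invoking the service.
REQUIRED_FIELDS = {
    "description",
    "products_services",
    "target_market",
    "key_features",
    "value_proposition",
    "business_name",
    "url",
}

def validate_business_info(business_info: dict) -> list[str]:
    """Return the sorted list of missing or empty required fields."""
    return sorted(
        field for field in REQUIRED_FIELDS
        if not business_info.get(field)
    )

info = {
    "description": "Sells AI-powered teddy bears.",
    "business_name": "The AI Teddy Bear Company",
    "url": "https://new-supreme-3.myshopify.com",
}
print(validate_business_info(info))
# → ['key_features', 'products_services', 'target_market', 'value_proposition']
```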
Three Parallel Gemini Calls
| Call | Prompt Builder | Output |
|---|---|---|
| 1 | build_llms_txt_prompt_from_business_info() | Markdown llms.txt content |
| 2 | build_index_html_prompt_from_business_info() | JSON with homepage + Q&A pages |
| 3 | build_data_json_prompt_from_business_info() | Schema.org JSON-LD |
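Because all three prompts take the same business_info input, the calls are independent and can be awaited together. A sketch of the fan-out with asyncio.gather, using stub coroutines in place of the real Gemini calls (the function names and return shapes here are illustrative, not the actual API):

```python
import asyncio

# Stub coroutines standing in for the three Gemini calls; the real service
# builds a prompt per call and sends it to the model.
async def generate_llms_txt(business_info: dict) -> str:
    return f"# {business_info['business_name']}\n"

async def generate_index_html(business_info: dict) -> dict:
    return {"html_body": "<div>...</div>", "qa_pages": []}

async def generate_schema_org(business_info: dict) -> dict:
    return {"@context": "https://schema.org", "@type": "Organization"}

async def organize(business_info: dict) -> dict:
    # The three calls share one input and do not depend on each other,
    # so they run concurrently rather than sequentially.
    llms_txt, index, schema = await asyncio.gather(
        generate_llms_txt(business_info),
        generate_index_html(business_info),
        generate_schema_org(business_info),
    )
    return {"llms_txt": llms_txt, **index, "schema_org": schema}

result = asyncio.run(organize({"business_name": "The AI Teddy Bear Company"}))
```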
Returns
{
  "status": "success",
  "organized_data": {
    "business_name": "The AI Teddy Bear Company",
    "meta_title": "The AI Teddy Bear Company - AI Profile",
    "meta_description": "...",
    "llms_txt": "# The AI Teddy Bear Company\n\n...",
    "html_body": "<div>...</div>",
    "qa_pages": [
      {"slug": "what-is-teddy-bear-ai", "title": "...", "content": "..."},
      ...
    ],
    "schema_org": {"@context": "https://schema.org", ...}
  },
  "business_id": "the-ai-teddy-bear-company-1767082986",
  "url": "https://new-supreme-3.myshopify.com",
  "ai_url": "https://the-ai-teddy-bear-company-1767082986.searchcompany.dev"
}
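A sketch of consuming this response on the caller side, assuming Q&A pages are served at ai_url + "/" + slug (that URL layout is an assumption, not confirmed by this document):

```python
# Sketch of a caller consuming the service response. The Q&A page URL
# layout (ai_url + "/" + slug) is an assumption for illustration.
def qa_page_urls(response: dict) -> list[str]:
    """Check the status and derive the public URL of each Q&A page."""
    if response.get("status") != "success":
        raise ValueError(f"organize failed: {response.get('status')}")
    base = response["ai_url"].rstrip("/")
    return [f"{base}/{page['slug']}"
            for page in response["organized_data"]["qa_pages"]]

response = {
    "status": "success",
    "ai_url": "https://the-ai-teddy-bear-company-1767082986.searchcompany.dev",
    "organized_data": {"qa_pages": [{"slug": "what-is-teddy-bear-ai"}]},
}
print(qa_page_urls(response))
# → ['https://the-ai-teddy-bear-company-1767082986.searchcompany.dev/what-is-teddy-bear-ai']
```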
Why business_info Instead of Pages?
| Approach | Pros | Cons |
|---|---|---|
| From business_info | Focused, structured input; fewer tokens; cleaner output | Requires Firecrawl agent call |
| From scraped pages | No extra API call | Noisy input; more tokens; less focused output |
The new approach produces more consistent, higher-quality LLM content because:
- Input is already structured (description, products, target market)
- No HTML/CSS noise to parse
- LLM can focus on content generation, not extraction
Code Location
src/app/shared/ai_website/llm_organize.py
Key Functions
# New onboarding path (uses business_info)
async def organize_with_llm_from_business_info(url, business_info, business_id)
async def generate_ai_content_from_business_info(url, ai_url, business_info, business_id)
# Legacy cron path (uses scraped pages)
async def organize_with_llm(url, scraped, business_id)
async def generate_ai_content(url, ai_url, scraped, business_id)
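A caller choosing between the two paths might dispatch on which input is available. This is a hypothetical wrapper, not a function in llm_organize.py, and the two inner functions are stubbed here so the sketch is self-contained:

```python
import asyncio

# Stubs standing in for the real functions in llm_organize.py.
async def organize_with_llm_from_business_info(url, business_info, business_id):
    return {"status": "success", "path": "business_info"}

async def organize_with_llm(url, scraped, business_id):
    return {"status": "success", "path": "scraped"}

async def organize(url, business_id, business_info=None, scraped=None):
    """Hypothetical dispatcher: use the onboarding path when Firecrawl
    agent output is available, else fall back to the legacy cron path."""
    if business_info is not None:
        return await organize_with_llm_from_business_info(url, business_info, business_id)
    if scraped is not None:
        return await organize_with_llm(url, scraped, business_id)
    raise ValueError("either business_info or scraped must be provided")

result = asyncio.run(organize("https://example.com", "biz-1", business_info={}))
print(result["path"])
# → business_info
```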