# Organize with LLM

> Internal service that uses an LLM to organize `business_info` into AI website content

# Internal Service: organize\_with\_llm\_from\_business\_info

Uses an LLM to generate AI website content from **business\_info** (Firecrawl agent output).

<Info>
  **Key Change**: This service now uses `business_info` from the Firecrawl agent, not scraped pages. Scraped pages are used only for markdown replica generation.
</Info>

## Function Signature

```python theme={null}
async def organize_with_llm_from_business_info(
    url: str,
    business_info: dict,
    business_id: str,
) -> dict: ...
```

## Parameters

| Parameter       | Type | Description                        |
| --------------- | ---- | ---------------------------------- |
| `url`           | str  | Source website URL                 |
| `business_info` | dict | Business info from Firecrawl agent |
| `business_id`   | str  | Business identifier                |

## business\_info Schema

```json theme={null}
{
  "description": "2-3 sentence description of what the company does",
  "products_services": "Overview of main products or services",
  "target_market": "Who their target customers are",
  "key_features": "Key features, capabilities, or differentiators",
  "value_proposition": "The core value they provide to customers",
  "business_name": "The AI Teddy Bear Company",
  "url": "https://new-supreme-3.myshopify.com"
}
```
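
As a rough illustration of the schema above, the sketch below checks a `business_info` dict for absent or empty fields before it is passed on. The `BusinessInfo` type and the `missing_business_info_keys` helper are hypothetical names introduced here, and treating every field as required is an assumption. The service may validate differently or not at all.

```python
from typing import TypedDict


class BusinessInfo(TypedDict):
    """Typed view of the schema documented above (illustrative only)."""
    description: str
    products_services: str
    target_market: str
    key_features: str
    value_proposition: str
    business_name: str
    url: str


# Assumption: every documented field is treated as required.
REQUIRED_KEYS = (
    "description",
    "products_services",
    "target_market",
    "key_features",
    "value_proposition",
    "business_name",
    "url",
)


def missing_business_info_keys(business_info: dict) -> list[str]:
    """Return the schema keys that are absent or blank, sorted by name."""
    return sorted(
        key
        for key in REQUIRED_KEYS
        if not str(business_info.get(key) or "").strip()
    )
```

A caller could run this check first and skip the LLM calls when the returned list is non-empty.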

## Three Parallel Gemini Calls

```mermaid theme={null}
flowchart LR
    BI[business_info] --> G1["Call 1: llms.txt"]
    BI --> G2["Call 2: Homepage + Q&A"]
    BI --> G3["Call 3: data.json"]
    
    G1 --> Result
    G2 --> Result
    G3 --> Result
```

| Call | Prompt Builder                                 | Output                          |
| ---- | ---------------------------------------------- | ------------------------------- |
| 1    | `build_llms_txt_prompt_from_business_info()`   | Markdown llms.txt content       |
| 2    | `build_index_html_prompt_from_business_info()` | JSON with homepage + Q\&A pages |
| 3    | `build_data_json_prompt_from_business_info()`  | Schema.org JSON-LD              |
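
The fan-out in the table above can be sketched with `asyncio.gather`, which runs the three coroutines concurrently and returns their results in input order. This is a minimal sketch, not the service's code: `fake_llm` stands in for the real Gemini client, the inline prompt strings stand in for the three prompt builders (whose signatures are not documented here), and `run_three_calls` is a hypothetical name.

```python
import asyncio


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the real Gemini call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to: {prompt}"


async def run_three_calls(business_info: dict) -> dict:
    """Issue three LLM calls concurrently and collect results by output name."""
    name = business_info["business_name"]
    # Placeholder prompts; the real service uses its own prompt builders.
    prompts = {
        "llms_txt": f"Write llms.txt for {name}",
        "homepage_and_qa": f"Write homepage and Q&A JSON for {name}",
        "data_json": f"Write Schema.org JSON-LD for {name}",
    }
    # gather() schedules all coroutines at once and preserves input order.
    results = await asyncio.gather(*(fake_llm(p) for p in prompts.values()))
    return dict(zip(prompts.keys(), results))
```

Because the calls do not depend on each other, total latency is roughly that of the slowest call rather than the sum of all three.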

## Returns

```json theme={null}
{
  "status": "success",
  "organized_data": {
    "business_name": "The AI Teddy Bear Company",
    "meta_title": "The AI Teddy Bear Company - AI Profile",
    "meta_description": "...",
    "llms_txt": "# The AI Teddy Bear Company\n\n...",
    "html_body": "<div>...</div>",
    "qa_pages": [
      {"slug": "what-is-teddy-bear-ai", "title": "...", "content": "..."},
      ...
    ],
    "schema_org": {"@context": "https://schema.org", ...}
  },
  "business_id": "the-ai-teddy-bear-company-1767082986",
  "url": "https://new-supreme-3.myshopify.com",
  "ai_url": "https://the-ai-teddy-bear-company-1767082986.searchcompany.dev"
}
```
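
A caller might consume the return value along these lines. This is a sketch under assumptions: `summarize_result` is a hypothetical helper, and the handling of a non-`"success"` status is a guess, since only the success shape is documented above.

```python
def summarize_result(result: dict) -> dict:
    """Pull the commonly needed fields out of the documented return shape."""
    # Assumption: any status other than "success" indicates a failure.
    if result.get("status") != "success":
        raise ValueError(f"organize step failed: status={result.get('status')!r}")

    data = result["organized_data"]
    return {
        "ai_url": result["ai_url"],
        "business_name": data["business_name"],
        # Slugs of the generated Q&A pages, in the order returned.
        "qa_slugs": [page["slug"] for page in data.get("qa_pages", [])],
        # True when the JSON-LD payload carries a Schema.org context.
        "has_schema_org": "@context" in data.get("schema_org", {}),
    }
```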

## Why business\_info Instead of Pages?

| Approach                | Pros                                                    | Cons                                          |
| ----------------------- | ------------------------------------------------------- | --------------------------------------------- |
| **From business\_info** | Focused, structured input; fewer tokens; cleaner output | Requires Firecrawl agent call                 |
| **From scraped pages**  | No extra API call                                       | Noisy input; more tokens; less focused output |

The new approach produces more consistent, higher-quality LLM content because:

* Input is already structured (description, products, target market)
* No HTML/CSS noise to parse
* LLM can focus on content generation, not extraction

## Code Location

```
src/app/shared/ai_website/llm_organize.py
```

### Key Functions

```python theme={null}
# New onboarding path (uses business_info)
async def organize_with_llm_from_business_info(url, business_info, business_id): ...
async def generate_ai_content_from_business_info(url, ai_url, business_info, business_id): ...

# Legacy cron path (uses scraped pages)
async def organize_with_llm(url, scraped, business_id): ...
async def generate_ai_content(url, ai_url, scraped, business_id): ...
```
