When we create an AI website for a business, we generate a complete site optimized for AI search engines like ChatGPT, Perplexity, Claude, and Gemini. This page explains what each file and section does.

Site Structure

ai-{business-slug}.searchcompany.dev/
├── llms.txt                    # Primary AI-readable content
├── data.json                   # Schema.org structured data
├── robots.txt                  # Crawler permissions
├── sitemap.xml                 # Site structure for crawlers
├── {indexnow-key}.txt          # IndexNow verification
│
├── /                           # Homepage with FAQ links
├── /what-is-{business}/        # Q&A page (example)
├── /how-to-contact-{business}/ # Q&A page (example)
│
├── /about/                     # Markdown replica of real /about page
├── /products/teddy-bear/       # Markdown replica of real product page
│
├── /llms/teddy-bear.txt        # Product-specific AI content for AI Article Context
├── /llms/premium-widget.txt    # Product-specific AI content for AI Article Context
│
└── /expert-review-of-{biz}/    # AI Article (added by cron)

Core Files

/llms.txt - Primary AI Content

Purpose: The main file AI search engines read to understand your business. Think of it as a comprehensive "about us" document written specifically for AI.
What it contains:
  • Business name and one-sentence description
  • Comprehensive overview (2-3 paragraphs)
  • Products/services with brief descriptions
  • Key details (location, contact, hours, etc.)
  • 5-10 FAQs about the business with detailed answers
  • Links to product-specific llms files
Format: Plain text markdown, optimized for LLM parsing. Example structure:
# Acme Corp

> Acme Corp is a leading provider of innovative widgets for enterprise customers.

## Overview

[2-3 paragraphs about the business...]

## Products & Services

### Enterprise Solutions

- **Widget Pro:** Enterprise-grade widget with advanced features
- **Widget Lite:** Lightweight solution for small teams

## Key Details

- Location: San Francisco, CA
- Founded: 2015
- Contact: hello@acme.com

## Frequently Asked Questions - Acme Corp - About

**Q: What is Acme Corp?**
A: [Detailed 2-3 paragraph answer...]
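To make the layout concrete, here is a minimal sketch of how a file in this shape could be rendered from a business profile. The `render_llms_txt` function and the dictionary keys are hypothetical names chosen for illustration; they are not the generator we actually run.

```python
def render_llms_txt(business: dict) -> str:
    """Render a business profile into the llms.txt layout shown above.

    The dictionary keys used here are illustrative assumptions.
    """
    lines = [f"# {business['name']}", "", f"> {business['summary']}", ""]
    lines += ["## Overview", "", business["overview"], ""]
    lines += ["## Products & Services", ""]
    for product in business["products"]:
        lines.append(f"- **{product['name']}:** {product['description']}")
    lines += ["", "## Key Details", ""]
    for label, value in business["details"].items():
        lines.append(f"- {label}: {value}")
    lines += ["", f"## Frequently Asked Questions - {business['name']} - About", ""]
    for faq in business["faqs"]:
        lines += [f"**Q: {faq['question']}**", f"A: {faq['answer']}", ""]
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


acme = {
    "name": "Acme Corp",
    "summary": "Acme Corp is a leading provider of innovative widgets for enterprise customers.",
    "overview": "[2-3 paragraphs about the business...]",
    "products": [
        {"name": "Widget Pro", "description": "Enterprise-grade widget with advanced features"},
        {"name": "Widget Lite", "description": "Lightweight solution for small teams"},
    ],
    "details": {"Location": "San Francisco, CA", "Founded": "2015", "Contact": "hello@acme.com"},
    "faqs": [{"question": "What is Acme Corp?", "answer": "[Detailed 2-3 paragraph answer...]"}],
}

llms_txt = render_llms_txt(acme)
```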

/data.json - Schema.org Structured Data

Purpose: Machine-readable structured data that helps AI and search engines understand the business type, offerings, and key information.
What it contains:
  • @type - The most specific Schema.org type (Organization, LocalBusiness, SoftwareApplication, etc.)
  • Business name, description, URL
  • Contact information
  • Location/address (if applicable)
  • Products/services catalog
  • Social media links
Example:
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Acme Corp",
  "description": "Enterprise widget solutions",
  "url": "https://acme.com",
  "applicationCategory": "BusinessApplication",
  "offers": {
    "@type": "Offer",
    "price": "99.00",
    "priceCurrency": "USD"
  }
}
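A document like the one above can be assembled with nothing more than the standard `json` module. This is a sketch only: `build_data_json` is a hypothetical helper, and passing extra Schema.org properties as keyword arguments is an assumption made to keep the example short.

```python
import json


def build_data_json(schema_type: str, name: str, description: str, url: str, **extra) -> str:
    """Serialize a Schema.org document; `extra` carries type-specific properties."""
    doc = {
        "@context": "https://schema.org",
        "@type": schema_type,
        "name": name,
        "description": description,
        "url": url,
    }
    doc.update(extra)  # e.g. applicationCategory, offers, address
    return json.dumps(doc, indent=2)


data_json = build_data_json(
    "SoftwareApplication",
    "Acme Corp",
    "Enterprise widget solutions",
    "https://acme.com",
    applicationCategory="BusinessApplication",
    offers={"@type": "Offer", "price": "99.00", "priceCurrency": "USD"},
)
```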

/robots.txt - Crawler Permissions

Purpose: Tells web crawlers (including AI bots) they're allowed to access all content, and points them to the sitemap.
Contents:
User-agent: *
Allow: /

Sitemap: https://customer-domain.com/sitemap.xml
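If you want to confirm what these rules permit, Python's standard `urllib.robotparser` can evaluate them offline. The bot names below are examples of AI crawler user agents picked for this sketch, not a list taken from our configuration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://customer-domain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the text directly, no network request

# Example AI crawler user agents (illustrative only).
ai_bots = ["GPTBot", "PerplexityBot", "ClaudeBot"]
allowed = {
    bot: parser.can_fetch(bot, "https://customer-domain.com/llms.txt")
    for bot in ai_bots
}
sitemaps = parser.site_maps()  # available in Python 3.8+
```

Because the only rule group is `User-agent: *`, every crawler falls through to the same `Allow: /` rule.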

/sitemap.xml - Site Structure

Purpose: Lists all pages on the AI site so crawlers can discover and index everything. Updated whenever new pages are added.
Includes:
  • Homepage
  • All Q&A pages
  • All markdown replica pages
  • All product llms files (/llms/{slug}.txt)
  • All AI articles (added by cron)
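As a rough illustration of the output, the sketch below builds a sitemap in the standard sitemaps.org format with `xml.etree.ElementTree`. The `build_sitemap` helper, the example paths, and the date are assumptions for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url: str, paths: list[str], lastmod: str) -> str:
    """Return sitemap XML listing every path under base_url."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + path
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap(
    "https://customer-domain.com",
    ["/", "/what-is-acme-corp/", "/about/", "/llms/widget-pro.txt"],
    lastmod="2025-01-01",  # placeholder date
)
```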

/{indexnow-key}.txt - IndexNow Verification

Purpose: Verification file for IndexNow protocol, which enables instant notification to Bing, Yandex, and other search engines when content changes.
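Under the IndexNow protocol, the key file contains the key itself, and a change notification is a JSON POST that names the host, the key, the key file location, and the changed URLs. The sketch below only constructs such a request and does not send it; the host, key, and `build_indexnow_request` helper are placeholders for illustration.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) an IndexNow bulk submission request."""
    payload = {
        "host": host,
        "key": key,
        # Location of the verification file, which holds the key as plain text.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_indexnow_request(
    "customer-domain.com",
    "0123456789abcdef0123456789abcdef",  # placeholder key
    ["https://customer-domain.com/llms.txt"],
)
# urllib.request.urlopen(request) would submit it.
```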

Page Types

Homepage (/)

The homepage serves as a navigation hub with:
  1. Business name as H1
  2. Resources section - Links to llms.txt, data.json, robots.txt, sitemap.xml
  3. FAQ sections - Organized by category, each question links to its dedicated page
Example:
<h1>Acme Corp</h1>

<h2>Resources</h2>
<ul>
  <li><a href="/llms.txt">llms.txt</a> - Primary AI-readable content</li>
  <li><a href="/data.json">data.json</a> - Schema.org structured data</li>
</ul>

<h2>Frequently Asked Questions</h2>
<h3>Acme Corp - About</h3>
<ul>
  <li><a href="/what-is-acme-corp">What is Acme Corp?</a></li>
  <li><a href="/how-to-contact-acme">How do I contact Acme?</a></li>
</ul>

Q&A Pages (/{slug}/)

Purpose: Each FAQ gets its own dedicated page at the root level for maximum SEO authority. AI search engines can link directly to specific answers.
Structure:
  • Meta title: "What is Acme Corp? | Acme Corp"
  • Meta description: Direct answer summary
  • Full detailed answer (2-3 paragraphs)
  • Proper Schema.org FAQPage markup
Why separate pages? AI search engines often cite specific URLs. Having dedicated pages for each question means they can link directly to the authoritative answer.
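For reference, FAQPage markup for a single-question page could look like the output of the sketch below. The `faq_page_jsonld` helper is hypothetical; the property names (`mainEntity`, `Question`, `acceptedAnswer`, `Answer`) come from Schema.org.

```python
import json


def faq_page_jsonld(question: str, answer: str) -> str:
    """Return a JSON-LD <script> tag describing one question and its answer."""
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
        ],
    }
    # Escape "</" so answer text can never close the script tag early.
    body = json.dumps(doc).replace("</", "<\\/")
    return f'<script type="application/ld+json">{body}</script>'


snippet = faq_page_jsonld("What is Acme Corp?", "Acme Corp makes enterprise widgets.")
```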

Markdown Replica Pages (/{original-path}/)

Purpose: Exact copies of the real website's pages, converted to clean markdown HTML. This gives AI crawlers easy access to all your content.
Example:
  • Real site: https://acme.com/about → AI site: /about/
  • Real site: https://acme.com/products/widget-pro → AI site: /products/widget-pro/
What they contain:
  • The scraped markdown content from the original page
  • Proper meta tags and canonical URLs pointing to the real site
  • Clean, AI-readable formatting
Why replicas? AI search engines can struggle with complex JavaScript sites. The markdown replicas provide clean, easily-parsed versions of all your content.
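The path mapping and canonical tag described in this section can be sketched in a few lines. Both helper names are hypothetical, and the trailing-slash rule is inferred from the two examples above rather than from a documented specification.

```python
from html import escape
from urllib.parse import urlparse


def replica_path(real_url: str) -> str:
    """Map a real-site URL to its replica path on the AI site."""
    path = urlparse(real_url).path or "/"
    # Assumption: replica paths always end with a trailing slash.
    return path if path.endswith("/") else path + "/"


def canonical_tag(real_url: str) -> str:
    """Build the canonical link tag that points back at the real page."""
    return f'<link rel="canonical" href="{escape(real_url, quote=True)}">'


about_path = replica_path("https://acme.com/about")
product_path = replica_path("https://acme.com/products/widget-pro")
about_canonical = canonical_tag("https://acme.com/about")
```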

Product LLMs Files (/llms/{product-slug}.txt)

Purpose: Dedicated AI content files for each product. Keeps the main llms.txt small while providing detailed product information for AI queries.
What they contain:
  • Product name and one-sentence description
  • Detailed overview (what it does, who it's for)
  • Key features list
  • Pricing information
  • Best use cases
  • Product-specific FAQs
Example structure:
# Widget Pro

> Enterprise-grade widget solution with advanced analytics and integrations.

_A product by Acme Corp_

## Overview

Widget Pro is designed for enterprise teams who need...

## Key Features

- **Real-time Analytics:** Track widget performance...
- **API Integrations:** Connect with 50+ tools...

## Pricing

Starting at $99/month. Enterprise plans available.

## Frequently Asked Questions - Widget Pro

**Q: What is Widget Pro?**
A: Widget Pro is Acme Corp's flagship product...
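
Each product file lives at /llms/{product-slug}.txt. One plausible way to derive that path from a product name is sketched below; the slug rule (lowercase, with runs of non-alphanumeric characters collapsed to hyphens) is an assumption, not a documented guarantee.

```python
import re


def product_llms_path(product_name: str) -> str:
    """Derive the /llms/{slug}.txt path for a product name (assumed slug rule)."""
    slug = re.sub(r"[^a-z0-9]+", "-", product_name.lower()).strip("-")
    return f"/llms/{slug}.txt"


widget_pro_path = product_llms_path("Widget Pro")
teddy_bear_path = product_llms_path("Teddy Bear")
```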

AI Articles (/{slug}/)

Purpose: SEO-optimized content pages generated by the daily cron job to improve AI discoverability. These are NOT replicas - they're new content.
Added by: Daily cron job (Batch 2a)
Weekly target: 100 pages per week
  • 50 pages about the business
  • 50 pages distributed across products
Example titles:
  • "Expert Review of Acme Corp's Widget Solutions"
  • "Deep Dive into Widget Pro Features"
  • "How Acme Corp Compares to Competitors"
Why AI articles? They provide additional entry points for AI search engines to discover and recommend your business for relevant queries.
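To illustrate the weekly split, the sketch below spreads the 50 product pages across a catalog. An even split with the remainder going to the first products is an assumption; this page only states that the pages are distributed across products.

```python
def split_product_pages(products: list[str], total: int = 50) -> dict[str, int]:
    """Spread `total` article slots across products as evenly as possible."""
    if not products:
        return {}
    base, extra = divmod(total, len(products))
    # The first `extra` products each receive one additional page.
    return {
        name: base + (1 if index < extra else 0)
        for index, name in enumerate(products)
    }


allocation = split_product_pages(["Widget Pro", "Widget Lite", "Teddy Bear"])
```

With three products, 50 pages divide into 17, 17, and 16.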

How Files Are Generated

| File              | Generated By    | When                 |
|-------------------|-----------------|----------------------|
| llms.txt          | Gemini 3 Flash  | Onboarding           |
| data.json         | Gemini 3 Flash  | Onboarding           |
| robots.txt        | Static template | Onboarding           |
| sitemap.xml       | Static template | Onboarding + Updates |
| Homepage          | Gemini 3 Flash  | Onboarding           |
| Q&A pages         | Gemini 3 Flash  | Onboarding           |
| Markdown replicas | Scraped content | Onboarding           |
| Product llms      | Gemini 3 Flash  | Onboarding + Cron    |
| AI articles       | Gemini 3 Flash  | Daily cron           |

Update Flow

  1. Onboarding: All core files are generated and deployed
  2. Daily Cron (Batch 1a): Detects changes on real site, updates llms.txt, Q&A pages, and replicas
  3. Daily Cron (Batch 1b): Discovers new products, generates product llms files
  4. Daily Cron (Batch 2a): Creates new AI articles, updates timestamps on all pages
  5. Daily Cron (Batch 3): Notifies search engines of all changes via IndexNow
All files use the customer's real domain as the canonical URL, so search engines attribute the content to the original site, not the AI subdomain.
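
This page does not describe how Batch 1a detects changes on the real site. One common approach, shown here purely as an assumption, is to compare a hash of each page's scraped markdown against the hash stored on the previous run. Both function names are hypothetical.

```python
import hashlib


def content_fingerprint(markdown: str) -> str:
    """Hash page content, ignoring trailing whitespace so cosmetic edits don't count."""
    normalized = "\n".join(line.rstrip() for line in markdown.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_pages(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return paths whose content is new or differs from the stored fingerprint.

    previous maps path -> fingerprint from the last run;
    current maps path -> freshly scraped markdown.
    """
    return sorted(
        path
        for path, markdown in current.items()
        if previous.get(path) != content_fingerprint(markdown)
    )


previous_run = {"/about/": content_fingerprint("# About\nWe make widgets.")}
todays_scrape = {
    "/about/": "# About\nWe make widgets.  \n",  # whitespace-only difference
    "/products/widget-pro/": "# Widget Pro",     # page not seen before
}
to_regenerate = changed_pages(previous_run, todays_scrape)
```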