These endpoints are called by the Cron service (Railway scheduled jobs) to perform background tasks.
Schedule: The cron runs daily at 10:00 PM Singapore Time (SGT) / 14:00 UTC.
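For reference, 22:00 SGT (UTC+8) is 14:00 UTC, so the equivalent standard five-field cron schedule (assuming the scheduler runs in UTC) would be:

```
# 14:00 UTC daily = 22:00 SGT (UTC+8)
0 14 * * *
```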

3-Batch Architecture (Custom Mapper + Batch Scrape)

The cron uses a custom website mapper + Firecrawl Batch Scrape API for efficient change detection:
For each customer:

   BATCH 1: Update AI Site + Discover Products
   β”œβ”€β”€ detect_changes (Custom Mapper + Hashing + Batch Scrape)
   β”œβ”€β”€ update_ai_site (parallel with discover)
   β”œβ”€β”€ discover_products β†’ generate_product_prompts + generate_product_llms_txt (parallel)
   β”œβ”€β”€ Deploy to Vercel
   └── Wait 5s for edge propagation

   BATCH 2: Create Content & Analyze (parallel jobs)
   β”œβ”€β”€ Update Timestamps β†’ collect files
   β”œβ”€β”€ Create Boosted Pages β†’ collect files + new page URLs
   β”œβ”€β”€ Analyze Visibility (sampling) β†’ DB only
   └── Deploy to Vercel (combined files)

   BATCH 3: Notify Search Engines
   β”œβ”€β”€ Aggregate all changed URLs from BATCH 1 + BATCH 2
   β”œβ”€β”€ Submit to IndexNow (one call per business)
   └── Resubmit sitemap to Google Search Console (one call per business)
Why this architecture is efficient:
  • Custom Mapper combines sitemap, robots.txt, and HTML link extraction for comprehensive URL discovery [FREE]
  • Hashing Service fetches raw HTML and hashes it to detect changes [FREE]
  • Batch Scrape only scrapes NEW + CHANGED pages, not all pages [PAID]
  • Product discovery reuses the same scraped content (no re-scrape)
  • Visibility sampling reduces API calls by ~50%
  • Only 2 deploys per customer (one each for BATCH 1 and BATCH 2)
  • BATCH 3 notifies search engines only AFTER all content is deployed
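The three batches above can be sketched as a per-customer coroutine. The function names below are hypothetical stubs standing in for the real cron endpoints; return values are placeholders for illustration:

```python
import asyncio

# Hypothetical stubs standing in for the real cron endpoints.
async def detect_changes(c):       return {"changed_urls": [f"{c}/about/"]}
async def update_ai_site(c, ch):   return ch["changed_urls"]
async def discover_products(c, ch): return []
async def update_timestamps(c):    return [f"{c}/"]
async def create_boosted_pages(c): return [f"{c}/new-page/"]
async def analyze_visibility(c):   return None
async def deploy_to_vercel(c):     pass
async def submit_indexnow(c, urls): return urls
async def resubmit_sitemap(c):     pass

async def run_customer(customer: str) -> list[str]:
    # BATCH 1: detect changes, then update site + discover products in parallel
    changes = await detect_changes(customer)
    site_urls, _ = await asyncio.gather(
        update_ai_site(customer, changes),
        discover_products(customer, changes),
    )
    await deploy_to_vercel(customer)
    await asyncio.sleep(0)  # stands in for the 5 s edge-propagation wait

    # BATCH 2: timestamps, boosted pages, visibility in parallel; one combined deploy
    ts_urls, boosted_urls, _ = await asyncio.gather(
        update_timestamps(customer),
        create_boosted_pages(customer),
        analyze_visibility(customer),  # DB only, contributes no URLs
    )
    await deploy_to_vercel(customer)

    # BATCH 3: notify search engines only after all content is deployed
    all_urls = sorted(set(site_urls + ts_urls + boosted_urls))
    await submit_indexnow(customer, all_urls)
    await resubmit_sitemap(customer)
    return all_urls

print(asyncio.run(run_customer("https://example.com")))
```

The key ordering constraint is that BATCH 3 only starts after both deploys have completed, which is what the sequential `await`s enforce.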

The Five Jobs

| Job | Name | Purpose | Entities | Deploys? |
| --- | --- | --- | --- | --- |
| 1 | Update AI Sites + Discover Products | Refresh AI websites, find new products | Business only | Yes (Batch 1) |
| 2 | Update Timestamps | Refresh timestamps on all pages for freshness signals | Business only | Yes (Batch 2) |
| 3 | Create Boosted Pages | Generate AI-specific content pages (100/week target) | Business + Products | Yes (Batch 2) |
| 4 | Analyze Visibility | Check visibility across 8 AI platforms (sampled) | Business + Products | No (DB only) |
| 5 | Notify Search Engines | Submit URLs to IndexNow + Google Search Console | Business only | No (API calls) |

Job 1: Update AI Sites + Discover Products

Uses the split API flow internally:
  1. detect_changes: Custom mapper + Hashing Service finds changes, Batch Scrape only for changed pages
  2. update_ai_site: Send changes to Gemini (3 parallel calls), regenerate files, deploy
  3. discover_products: Extract products from new/changed pages (no prompts/llms)
  4. generate_product_prompts: Generate 10 prompts per new product
  5. generate_product_llms_txt: Generate /llms/.txt for new products
Steps 2-5 run in parallel where possible via asyncio.gather.
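The dependency structure (step 2 concurrent with the step 3 → steps 4-5 chain) can be sketched as follows; the stub functions and their return values are hypothetical placeholders for the real endpoints:

```python
import asyncio

# Hypothetical stubs for the split endpoints described above.
async def update_ai_site(changes):             return "site-updated"
async def discover_products(changes):          return ["p1", "p2"]
async def generate_product_prompts(products):  return {p: 10 for p in products}
async def generate_product_llms_txt(products): return len(products)

async def run_job1(changes):
    async def product_chain():
        # Step 3 must finish before steps 4 and 5 can start...
        products = await discover_products(changes)
        # ...but steps 4 and 5 are independent of each other.
        return await asyncio.gather(
            generate_product_prompts(products),
            generate_product_llms_txt(products),
        )
    # Step 2 runs concurrently with the whole product chain.
    _, (prompts, llms_count) = await asyncio.gather(update_ai_site(changes),
                                                    product_chain())
    return prompts, llms_count
```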
Update Site Flow:
1. Custom Mapper β†’ Get current URL list (up to 5000 pages) [FREE]
2. Compare URLs vs stored site_map β†’ find new/removed pages
3. Hashing Service β†’ Fetch raw HTML + hash existing pages [FREE]
4. Hash comparison β†’ Find changed pages
5. Batch Scrape β†’ Get markdown for NEW + CHANGED pages only [PAID]
6. If changes detected:
   β”œβ”€β”€ update_ai_site: 3 parallel Gemini calls, deploy to Vercel
   └── discover_products β†’ generate_product_prompts + generate_product_llms_txt
7. Collect changed_urls for BATCH 3 submission
Products do not get their own AI website. They are tracked for visibility analysis and boosted page generation only.
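Steps 2-4 of the flow amount to a set/hash diff. A minimal sketch, assuming `current_pages` maps URL → raw HTML and `stored_hashes` maps URL → the previously stored SHA-256 digest (the helper name and hash choice are illustrative):

```python
import hashlib

def diff_pages(current_pages: dict[str, str], stored_hashes: dict[str, str]):
    """Return (new, changed, removed) URL lists.

    Only new + changed pages would then be sent to Batch Scrape,
    which is where the cost saving comes from.
    """
    new, changed = [], []
    for url, html in current_pages.items():
        digest = hashlib.sha256(html.encode()).hexdigest()
        if url not in stored_hashes:
            new.append(url)
        elif stored_hashes[url] != digest:
            changed.append(url)
    removed = [u for u in stored_hashes if u not in current_pages]
    return new, changed, removed
```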

Job 2: Update Timestamps

Refreshes timestamps on ALL pages (AI site core files + boosted pages) to signal freshness to AI search engines.
Updates on every page:
  • Meta tags: article:modified_time
  • Year in titles: β€œ2025” β†’ β€œ2026” (if year changed)
  • Footer: β€œLast updated: December 24, 2025”
This helps because Bing and other AI search engines favor fresh content and may include dates in citations.
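A minimal sketch of the three replacements above, assuming simple well-formed markup (the regex patterns are illustrative; the production service may use a proper HTML parser):

```python
import re
from datetime import date

def refresh_timestamps(html: str, today: date) -> str:
    """Apply the three freshness updates described above to one page."""
    # 1. Meta tag: article:modified_time
    html = re.sub(r'(<meta property="article:modified_time" content=")[^"]*',
                  rf'\g<1>{today.isoformat()}', html)
    # 2. Year in <title>: e.g. 2025 -> 2026 when the year has changed
    html = re.sub(r'(<title>[^<]*?)\b20\d\d\b',
                  lambda m: m.group(1) + str(today.year), html)
    # 3. Footer "Last updated" line
    html = re.sub(r'Last updated: [^<]+',
                  f'Last updated: {today.strftime("%B %d, %Y")}', html)
    return html
```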

Job 3: Create Boosted Pages

Generates AI-specific content pages at the root level (/{slug}/) to improve discoverability.

Weekly Target (Per Customer)

  • 100 pages per week (Monday-Sunday)
  • 50 pages for the business (50%)
  • 50 pages distributed across products (50%)
Special cases:
  • No products: Business gets all 100 pages
  • 1-50 products: All products included, 50 pages split evenly
  • 51+ products: Round-robin rotation selects 50 products per week
When there are more than 50 products, the system uses a rotating selection each week so all products eventually get coverage.
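The 51+ case can be sketched as a rotating window. The `select_products` helper below is hypothetical (in practice `week_index` would be derived from the calendar week):

```python
def select_products(products: list[str], week_index: int, cap: int = 50) -> list[str]:
    """Pick this week's products for boosted-page generation.

    With `cap` or fewer products, all are included; beyond that, a window
    of `cap` products advances each week so every product eventually
    gets coverage.
    """
    if len(products) <= cap:
        return list(products)
    start = (week_index * cap) % len(products)
    return [products[(start + i) % len(products)] for i in range(cap)]
```

With 120 products, for example, three consecutive weeks cover the full catalog.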

URL Structure

Boosted pages are deployed at the root level for maximum SEO authority:
| Entity Type | URL Pattern | Example |
| --- | --- | --- |
| Business | /{slug}/ | /expert-review-of-website-arena/ |
| Product | /{slug}/ | /deep-dive-into-remix-tool/ |

Job 4: Analyze Visibility (Sampling Architecture)

Cost Optimization: We sample 10 prompts per day (prioritizing untested ones) instead of checking all prompts. This reduces API costs by ~50% while ensuring all prompts eventually get tested.

How It Works

  1. Sample 10 prompts from the org’s total pool (untested first, then random)
  2. Analyze each prompt across 8 AI platforms (80 API calls total)
  3. Store results with pass/fail per platform and update last_tested_at
  4. Update overall score with floor protection (never dips below previous high)
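Step 1 (untested-first sampling) can be sketched like this; the dict shape with a `last_tested_at` field is an illustrative assumption:

```python
import random

def sample_prompts(prompts: list[dict], k: int = 10, seed=None) -> list[dict]:
    """Sample k prompts, prioritising never-tested ones.

    Prompts with last_tested_at == None are taken first; any remaining
    slots are filled with a random subset of already-tested prompts.
    """
    rng = random.Random(seed)
    untested = [p for p in prompts if p["last_tested_at"] is None]
    tested = [p for p in prompts if p["last_tested_at"] is not None]
    rng.shuffle(untested)
    rng.shuffle(tested)
    return (untested + tested)[:k]
```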

Prompt Limits

  • Business: 50 prompts (10 via Exa during onboarding + 40 via regular generation)
  • Products: 10 prompts each (unlimited products)

Pass/Fail Paradigm

Each prompt shows visibility status per platform:
  • true (βœ“): Entity was mentioned/recommended by this AI platform
  • false (βœ—): Entity was not found in the AI platform’s response
  • null (-): Not yet tested

The 8 AI Platforms

Each platform uses its native search capabilities, then Gemini 3 Flash provides unified evaluation:
  • ChatGPT - OpenAI Direct w/ Search
  • Claude - Anthropic Direct w/ Search
  • Gemini - GCP AI Studio Direct w/ Search
  • Perplexity - Sonar API
  • Copilot - Parallel Search API
  • DeepSeek - Firecrawl Search API
  • Grok - X.AI Direct w/ Search
  • Google AI - Serp API (AI Overview)

Job 5: Notify Search Engines (BATCH 3)

BATCH 3 runs AFTER all content is deployed (BATCH 1 + BATCH 2) to ensure search engines see the latest content.

How It Works

  1. Aggregate URLs from BATCH 1 (changed pages) and BATCH 2 (new boosted pages)
  2. Submit to IndexNow - Instant notification to Bing, Yandex, and other IndexNow-compatible engines
  3. Resubmit sitemap to Google Search Console - Signals Google to re-crawl the sitemap

URL Sources

| Source | URLs Submitted |
| --- | --- |
| Job 1 (update_ai_site) | changed_urls: new + changed pages from detect-changes |
| Job 3 (create_boosted_pages) | New boosted page slugs (e.g., /{slug}/) |

APIs Called

```shell
# IndexNow - one call per business
POST /api/cron/submit-indexnow
{
  "urls": ["/about-us/", "/new-boosted-page/", ...],
  "source_url": "https://customer-domain.com"
}

# Google Search Console - one call per business
POST /api/domain/resubmit-sitemap/{org_id}
```
Why BATCH 3 is separate: Search engines should only be notified AFTER content is deployed. If we submitted URLs before deployment, crawlers might hit 404s or stale content.
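Building the request body shown above can be sketched as follows (the helper name is illustrative; URLs are deduplicated because BATCH 1 and BATCH 2 can both report the same page):

```python
import json

def build_indexnow_payload(urls: list[str], source_url: str) -> bytes:
    """JSON body for POST /api/cron/submit-indexnow, deduplicated
    while preserving first-seen order."""
    seen: set[str] = set()
    deduped = []
    for u in urls:
        if u not in seen:
            seen.add(u)
            deduped.append(u)
    return json.dumps({"urls": deduped, "source_url": source_url}).encode()
```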

All Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/cron/entities | GET | Fetch all businesses/products to process |
| /api/cron/detect-changes | POST | Detect content changes using Mapper + Hashing + Batch Scrape |
| /api/cron/update-ai-site | POST | Update AI website with changes |
| /api/cron/discover-products | POST | Extract products from content (no prompts/llms) |
| /api/cron/generate-product-prompts | POST | Generate visibility prompts for products |
| /api/cron/generate-product-llms-txt | POST | Generate product llms.txt files |
| /api/cron/update-all-timestamps | POST | Refresh timestamps on all AI website pages |
| /api/cron/boosted-pages-quota | GET | Calculate today's boosted pages quota |
| /api/cron/create-boosted-page | POST | Generate boosted page content (no deploy) |
| /api/cron/deploy-to-vercel | POST | Deploy all files to Vercel (single deployment) |
| /api/cron/submit-indexnow | POST | Notify search engines of new URLs |
| /api/cron/sample-prompts | GET | Randomly sample prompts for visibility check |
| /api/cron/analyze-visibility | POST | Check visibility across 8 AI platforms |
| /api/cron/store-visibility-report | POST | Store daily visibility report |
| /api/cron/store-visibility-score | POST | Calculate and store visibility score |
Looking for prompt regeneration? See Regenerate Prompts in the Manual Trigger section.

Manual Trigger

Run all cron jobs immediately for testing or recovery:
```shell
curl -X POST https://searchcompany-main.up.railway.app/api/cron/trigger-all
```
⚠️ Warning: This can take up to 10 minutes depending on customer count.