The Domain API handles connecting customer domains to our CloudFront proxy. This enables path-based routing where AI content lives at /ai/* while the rest of the site remains unchanged.

Services Used

| Service | Purpose |
| --- | --- |
| AWS CloudFront | CDN and edge proxy - routes traffic based on URL path |
| AWS Lambda@Edge | Runs at CloudFront edge to route paths to correct origin |
| AWS ACM | SSL certificate management (stores Let's Encrypt + native certs) |
| AWS Global Accelerator | Static IPs for apex domain support (A records) |
| Let's Encrypt | Initial SSL certificates (bypasses CAA restrictions) |
| Entri | One-click DNS configuration for customers (no manual record editing) |

Architecture

                    ┌─────────────────────────────────────────────┐
                    │              CloudFront Edge                │
                    │         (Lambda@Edge at origin-request)     │
                    └─────────────────────────────────────────────┘
                                        │
                    ┌───────────────────┴───────────────────┐
                    │                                       │
              AI Paths                              Everything Else
         /ai/*, /llms.txt,                         /, /products/*,
      /robots.txt, /sitemap.xml                   /collections/*, etc.
                    │                                       │
                    ▼                                       ▼
        ┌───────────────────┐                   ┌───────────────────┐
        │   AI Site (Vercel)│                   │   Shopify Store   │
        │ org-slug.search   │                   │ (customer's host) │
        │ company.dev       │                   │                   │
        └───────────────────┘                   └───────────────────┘

Path-Based Routing (No UA Cloaking)

We use path-based routing, NOT User-Agent-based cloaking. This is safer and more transparent:

| Path | Destination | Content |
| --- | --- | --- |
| /ai/* | AI Site (Vercel) | Markdown mirrors + FAQ pages |
| /llms.txt | AI Site (Vercel) | AI discovery file |
| /robots.txt | AI Site (Vercel) | Our robots.txt (overrides Shopify) |
| /sitemap.xml | AI Site (Vercel) | Unified sitemap (Shopify + AI pages) |
| /search-company.txt | AI Site (Vercel) | IndexNow verification key |
| Everything else | Shopify | Original store content |

Why Path-Based?

  • No cloaking risk: Same content for bots and humans on same URLs
  • Transparent: Humans can access /ai/* pages too (they just won't find them)
  • SEO safe: Clear separation between Shopify pages and AI pages
  • Canonical tags: /ai/* pages point canonical to Shopify URLs

Control File Takeover

We override these Shopify files at the edge:

| File | What We Serve | Why |
| --- | --- | --- |
| /robots.txt | Our robots.txt with Sitemap: directive | Single source of crawler directives |
| /sitemap.xml | Unified sitemap with ALL URLs | One sitemap containing Shopify + AI pages |
| /llms.txt | AI discovery file | Entry point for AI crawlers |
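
As a rough illustration of the /robots.txt override, the sketch below assembles a body that ends with a Sitemap: directive pointing at the unified sitemap. The function name `buildRobotsTxt` and the allow-all rule are assumptions made for the example; the policy actually served may be different.

```javascript
// Hypothetical helper: builds a robots.txt body for a customer domain.
// The allow-all rule is an assumption; only the Sitemap: directive is
// taken from the table above.
function buildRobotsTxt(customDomain) {
  const lines = [
    "User-agent: *",
    "Allow: /",
    "",
    // Point every crawler at the single unified sitemap.
    `Sitemap: https://${customDomain}/sitemap.xml`,
  ];
  return lines.join("\n") + "\n";
}

console.log(buildRobotsTxt("www.example.com"));
```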

Unified Sitemap Strategy

Our /sitemap.xml contains all URLs in a single file:
  1. Shopify URLs - Fetched during onboarding and stored in ai_sites.shopify_sitemap_urls
  2. AI pages - /ai/products/*, /ai/collections/*, /ai/<faq-slug>
  3. AI files - /llms.txt, /llms/*.txt
This ensures:
  • No confusion from multiple sitemaps
  • Shopify indexing is preserved
  • AI pages are discoverable
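
The merge described above could look roughly like the following sketch. Here `shopifyUrls` stands in for the contents of ai_sites.shopify_sitemap_urls and `aiPaths` is an assumed list of AI page and file paths; both names, and the function names, are hypothetical.

```javascript
// Escape the five XML special characters so URLs are safe inside <loc>.
function escapeXml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: merges Shopify URLs and AI paths into one sitemap.
function buildUnifiedSitemap(customDomain, shopifyUrls, aiPaths) {
  const aiUrls = aiPaths.map((p) => `https://${customDomain}${p}`);
  // A Set drops duplicates while keeping first-seen order (Shopify first).
  const all = [...new Set([...shopifyUrls, ...aiUrls])];
  const entries = all.map((u) => `  <url><loc>${escapeXml(u)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

console.log(
  buildUnifiedSitemap(
    "www.example.com",
    ["https://www.example.com/products/blue-shirt"],
    ["/ai/products/blue-shirt", "/llms.txt"]
  )
);
```

Note that the sitemap protocol caps a single file at 50,000 URLs, so a single-file approach like this only works for stores below that size.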

Connection Flow

The domain connection is a two-part process designed for zero downtime:

Part 1: SSL Certificate + Google TXT

  1. User clicks "Start Secure Connection"
  2. We request a Let's Encrypt certificate via DNS-01 challenge
  3. We also request a Google verification token
  4. User adds BOTH TXT records via Entri (one-click DNS):
    • SSL validation TXT record
    • Google verification TXT record
  5. Let's Encrypt validates and issues certificate
  6. Certificate is imported to ACM and attached to CloudFront
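
To make step 4 concrete, here is a minimal sketch of the two TXT records. The `{ type, host, value }` shape and the function name are assumptions for illustration and are not Entri's actual schema. The `_acme-challenge.` prefix is the standard record name for an ACME DNS-01 challenge.

```javascript
// Hypothetical helper: the two TXT records a user adds in Part 1.
// The record shape is an assumption, not Entri's real payload format.
function buildPart1TxtRecords(domain, acmeChallengeValue, googleToken) {
  return [
    // DNS-01: the CA looks up a TXT record at _acme-challenge.<domain>.
    { type: "TXT", host: `_acme-challenge.${domain}`, value: acmeChallengeValue },
    // Google verification: a TXT record on the domain being verified.
    { type: "TXT", host: domain, value: googleToken },
  ];
}

console.log(
  buildPart1TxtRecords("www.example.com", "<acme-challenge-value>", "<google-token>")
);
```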

Part 2: Connect Domain + Verify Google

  1. SSL is ready on CloudFront
  2. User clicks "Point Domain"
  3. User configures DNS via Entri (one-click):
    • CNAME www → CloudFront distribution
    • A @ → Our gateway IPs (for naked domain)
  4. Traffic now flows through our proxy (both www and naked domain)
  5. Background tasks trigger:
    • ACM-native certificate upgrade (auto-renewal)
    • Google verification polling (10s × 12 = 2 min)
    • Once Google verified → IndexNow + GSC sitemap submission
    • If Google fails → IndexNow only (GSC skipped)
Both www and naked domain are configured together. Whether the user enters www.example.com or example.com as their business URL, Part 2 sets up DNS records for both to ensure the full domain works.
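
The background polling and its fallback in step 5 can be sketched as below. The callbacks `checkGoogleVerified`, `submitIndexNow`, and `submitGscSitemap` are placeholders and not real function names from this service. Waiting before each check is one way to arrive at the 10s × 12 = 2 min budget; the real task may order things differently.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll up to `attempts` times, waiting `intervalMs` before each check.
async function pollGoogleVerification(checkGoogleVerified, opts = {}) {
  const { attempts = 12, intervalMs = 10000, wait = sleep } = opts;
  for (let i = 0; i < attempts; i++) {
    await wait(intervalMs);
    if (await checkGoogleVerified()) return true;
  }
  return false;
}

// Hypothetical orchestration of the post-connect tasks.
async function runPostConnectTasks(tasks) {
  const { checkGoogleVerified, submitIndexNow, submitGscSitemap, pollOptions } = tasks;
  const verified = await pollGoogleVerification(checkGoogleVerified, pollOptions);
  await submitIndexNow(); // runs whether or not Google verified
  if (verified) await submitGscSitemap(); // skipped when verification fails
  return verified ? "VERIFIED" : "FAILED";
}
```

Passing `wait` through the options keeps the loop testable without real delays.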

Status Flow

PENDING_VALIDATION  →  SSL_VALIDATING  →  SSL_VALIDATED  →  DEPLOYED
      │                      │                  │              │
      │                      │                  │              │
  CloudFront             TXT record          Certificate    www CNAME
   created              added, waiting       attached to    points to
                        for Let's Encrypt    CloudFront     CloudFront
                                                                │
                                                                ▼
                                                          DISCONNECTED
                                                                │
                                                          (can reconnect
                                                           instantly)
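
One way to enforce this flow in code is a small transition table, sketched below. The forward edges are read off the diagram. The two edges out of DISCONNECTED are an assumption based on the reconnect steps described later in this document, where the certificate step runs only if the certificate has expired.

```javascript
// Allowed proxy_status transitions. The DISCONNECTED edges are assumed.
const TRANSITIONS = {
  PENDING_VALIDATION: ["SSL_VALIDATING"],
  SSL_VALIDATING: ["SSL_VALIDATED"],
  SSL_VALIDATED: ["DEPLOYED"],
  DEPLOYED: ["DISCONNECTED"],
  DISCONNECTED: ["SSL_VALIDATING", "DEPLOYED"],
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}

// Returns the new status, or throws if the move is not in the table.
function nextStatus(from, to) {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal proxy_status transition: ${from} -> ${to}`);
  }
  return to;
}

console.log(canTransition("SSL_VALIDATED", "DEPLOYED"));
console.log(canTransition("PENDING_VALIDATION", "DEPLOYED"));
```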

Endpoints

| Endpoint | Purpose |
| --- | --- |
| Get Proxy | Get current proxy status and DNS records |
| Start Certificate | Begin Let's Encrypt certificate request |
| Complete Certificate | Finish certificate after TXT record added |
| Start Google Verification | Get Google TXT verification token (Step 1) |
| Complete Google Verification | Verify domain + add to Search Console (now backend-driven in Step 2) |
| Resubmit Sitemap | Resubmit sitemap after new AI pages (cron) |
| Mark Step Complete | Update status after Entri success + trigger Google verification polling |
| Verify DNS | Live DNS lookup verification |
| Disconnect Proxy | Get DNS records to restore original config |
| Mark Disconnect Complete | Update status after disconnect completes |
| Update Lambda@Edge | Deploy new routing rules to all distributions |

Setup CloudFront Proxy is now part of the Onboarding flow - it runs automatically during generate-all.

Why Let's Encrypt + ACM?

We use a hybrid approach for SSL certificates:
Initial: Let's Encrypt
  • Works with ANY DNS provider
  • Bypasses CAA restrictions (Vercel/Netlify block ACM)
  • User adds one TXT record, done
After Connection: ACM-Native
  • Once www points to CloudFront, ACM can validate
  • Background upgrade happens automatically
  • ACM-native certs auto-renew forever
  • User never has to touch DNS again
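
The CAA point can be illustrated with a simplified check in the spirit of RFC 8659. The record shape and the function name are assumptions, and the sketch ignores issuewild, critical flags, and parent-domain lookups. One plausible reading of the restriction above is that, while www is still a CNAME to another host, a CAA lookup returns that host's policy, which may list Let's Encrypt but not Amazon's CA domains.

```javascript
// CA identifiers used in CAA "issue" records.
const LETS_ENCRYPT_CA = ["letsencrypt.org"];
const AMAZON_CAS = ["amazon.com", "amazontrust.com", "awstrust.com", "amazonaws.com"];

// Simplified CAA check. `caaRecords` is an assumed shape:
//   [{ tag: "issue", value: "letsencrypt.org" }]
function caaAllows(caaRecords, caDomains) {
  const issue = caaRecords.filter((r) => r.tag === "issue");
  // With no "issue" records, CAA does not restrict issuance.
  if (issue.length === 0) return true;
  return issue.some((r) => {
    // Drop optional parameters after ";" before comparing the CA domain.
    const domain = r.value.split(";")[0].trim().toLowerCase();
    return caDomains.includes(domain);
  });
}

const exampleZone = [{ tag: "issue", value: "letsencrypt.org" }];
console.log(caaAllows(exampleZone, LETS_ENCRYPT_CA));
console.log(caaAllows(exampleZone, AMAZON_CAS));
```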

Lambda@Edge Path-Based Routing

The Lambda@Edge function at origin-request routes based on URL path:
// Paths that route to AI origin (Vercel)
const AI_PATHS = [
  "/ai/",              // All AI mirror pages
  "/llms.txt",         // AI discovery file
  "/robots.txt",       // We control this (overrides Shopify)
  "/sitemap.xml",      // Unified sitemap (overrides Shopify)
  "/search-company.txt" // IndexNow verification key
];

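// Route to AI origin if path matches, else Shopify.
// Entries ending in "/" match as prefixes; file entries must match exactly,
// so a path like /robots.txt.bak is not sent to the AI origin.
const isAIPath = AI_PATHS.some(p => (p.endsWith("/") ? uri.startsWith(p) : uri === p));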

| Path | Destination | Host Header |
| --- | --- | --- |
| /ai/*, /llms.txt, etc. | org-slug.searchcompany.dev | AI site host |
| Everything else | Customer's Shopify origin | Original domain |
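
For context, a fuller origin-request handler might look like the sketch below. It is not the deployed lambda_code.js. `AI_ORIGIN_HOST` is a placeholder (the real org slug would be resolved per distribution, which is not shown), and file paths are treated as exact matches, which is slightly stricter than a plain startsWith check. The event and origin field names follow the standard CloudFront Lambda@Edge request structure.

```javascript
// Placeholder AI origin; the real value depends on the customer's org slug.
const AI_ORIGIN_HOST = "org-slug.searchcompany.dev";
const AI_PREFIXES = ["/ai/"];
const AI_FILES = ["/llms.txt", "/robots.txt", "/sitemap.xml", "/search-company.txt"];

function isAIPath(uri) {
  return AI_FILES.includes(uri) || AI_PREFIXES.some((p) => uri.startsWith(p));
}

const handler = async (event) => {
  const request = event.Records[0].cf.request;
  if (isAIPath(request.uri)) {
    // Swap the origin to the AI site and send its hostname as Host.
    request.origin = {
      custom: {
        domainName: AI_ORIGIN_HOST,
        port: 443,
        protocol: "https",
        path: "",
        sslProtocols: ["TLSv1.2"],
        readTimeout: 30,
        keepaliveTimeout: 5,
        customHeaders: {},
      },
    };
    request.headers.host = [{ key: "Host", value: AI_ORIGIN_HOST }];
  }
  // Any other path continues to the distribution's default origin untouched.
  return request;
};

// In a Lambda deployment the function would be exported as the handler,
// e.g. exports.handler = handler;
```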

Updating Routing Rules

When routing rules need to change:
# 1. Edit the Lambda code
vim src/app/apis/domain/setup_proxy/step_0_lambda_setup/lambda_code.js

# 2. Deploy to all distributions
curl -X POST https://api.searchcompany.ai/api/domain/update-lambda-edge \
  -H "Authorization: Bearer $TOKEN"
All customer distributions update automatically (~15 min for 1,000 customers). There is zero downtime: the old version keeps serving until the new one propagates.

Database Schema

The ai_sites table stores all proxy configuration:

| Column | Purpose |
| --- | --- |
| entity_id | Links to business entity |
| cloudfront_distribution_id | CloudFront distribution ID |
| cloudfront_domain | e.g., d123abc.cloudfront.net |
| custom_domain | e.g., www.example.com |
| origin_cname | Where www originally pointed |
| original_www_cname | Preserved for disconnect |
| certificate_arn | ACM certificate ARN |
| certificate_type | LETS_ENCRYPT or ACM_NATIVE |
| proxy_status | Current status |
| shopify_sitemap_urls | Shopify URLs for unified sitemap |
| le_* | Let's Encrypt temp data |
| google_verification_token | TXT record value from Google Site Verification API |
| google_verification_status | PENDING, VERIFIED, or FAILED |
| google_sitemap_submitted_at | Last sitemap submission timestamp |

Users can disconnect their domain and reconnect later:

Disconnect Flow

  1. User clicks "Disconnect Domain"
  2. Frontend calls /disconnect-proxy → gets restore DNS records
  3. Entri restores www CNAME to original
  4. Frontend calls /mark-disconnect-complete → status becomes DISCONNECTED
  5. CloudFront + ACM stay intact (for fast relink)

Reconnect Flow

  1. User clicks "Reconnect"
  2. No CloudFront setup needed - distribution already exists
  3. Start certificate (if expired) or skip
  4. 2-step DNS flow: SSL CNAME → www CNAME
  5. Status becomes DEPLOYED
This is much faster than initial setup because CloudFront and ACM are preserved.
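
The disconnect and reconnect decisions can be sketched as two small helpers. `certExpiresAt` is an assumed input, since the schema above does not name an expiry column. The record shape and the "point-www-cname" label are also illustrative; the other step names mirror the endpoint names used in this document.

```javascript
// Hypothetical: the DNS record that restores www to its original target.
function buildRestoreRecord(site) {
  return { type: "CNAME", host: "www", value: site.original_www_cname };
}

// Hypothetical planner for the reconnect flow.
function planReconnect(site, certExpiresAt, now = new Date()) {
  if (site.proxy_status !== "DISCONNECTED") {
    throw new Error(`Cannot reconnect from status ${site.proxy_status}`);
  }
  const steps = [];
  // The distribution already exists, so CloudFront setup is never repeated.
  if (certExpiresAt <= now) {
    steps.push("start-certificate", "complete-certificate");
  }
  steps.push("point-www-cname", "mark-step-complete");
  return steps;
}

const exampleSite = {
  proxy_status: "DISCONNECTED",
  original_www_cname: "shops.myshopify.com",
};
console.log(buildRestoreRecord(exampleSite));
console.log(planReconnect(exampleSite, new Date("2099-01-01T00:00:00Z")));
```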

Testing

The domain connection can be tested end-to-end:
# Check current status
curl https://api.searchcompany.ai/api/domain/get-proxy/org_xxx \
  -H "Authorization: Bearer $TOKEN"

# Start certificate (Part 1)
curl -X POST https://api.searchcompany.ai/api/domain/start-certificate/org_xxx \
  -H "Authorization: Bearer $TOKEN"

# After adding TXT record, complete certificate
curl -X POST https://api.searchcompany.ai/api/domain/complete-certificate/org_xxx \
  -H "Authorization: Bearer $TOKEN"

# After switching www CNAME (Part 2)
curl -X POST https://api.searchcompany.ai/api/domain/mark-step-complete/org_xxx \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"step": 2}'

# Disconnect domain
curl -X POST https://api.searchcompany.ai/api/domain/disconnect-proxy/org_xxx \
  -H "Authorization: Bearer $TOKEN"

# After Entri restores DNS
curl -X POST https://api.searchcompany.ai/api/domain/mark-disconnect-complete/org_xxx \
  -H "Authorization: Bearer $TOKEN"
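
The same calls can be scripted from JavaScript. The sketch below only builds the arguments for fetch(). The base URL and paths are copied from the curl commands above, while the helper `domainRequest` is illustrative and not part of any published client.

```javascript
const BASE_URL = "https://api.searchcompany.ai/api/domain";

// Builds { url, options } for fetch(); get-proxy is the only GET shown above.
function domainRequest(action, orgId, token, body) {
  const options = {
    method: action === "get-proxy" ? "GET" : "POST",
    headers: { Authorization: `Bearer ${token}` },
  };
  if (body !== undefined) {
    options.headers["Content-Type"] = "application/json";
    options.body = JSON.stringify(body);
  }
  return { url: `${BASE_URL}/${action}/${encodeURIComponent(orgId)}`, options };
}

// Example: mark Part 2 complete.
const { url, options } = domainRequest("mark-step-complete", "org_xxx", "TOKEN", { step: 2 });
console.log(options.method, url);
// A caller would then run: const res = await fetch(url, options);
```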