Services Used
| Service | Purpose |
|---|---|
| AWS CloudFront | CDN and edge proxy - routes traffic based on bot detection |
| AWS Lambda@Edge | Runs at CloudFront edge to detect AI bots and rewrite origins |
| AWS ACM | SSL certificate management (stores Let's Encrypt + native certs) |
| AWS Global Accelerator | Static IPs for apex domain support (A records) |
| Let's Encrypt | Initial SSL certificates (bypasses CAA restrictions) |
| Entri | One-click DNS configuration for customers (no manual record editing) |
Architecture
Connection Flow
The domain connection is a two-part process designed for zero downtime:
Part 1: SSL Certificate + Google TXT
- User clicks "Start Secure Connection"
- We request a Let's Encrypt certificate via DNS-01 challenge
- We also request a Google verification token
- User adds BOTH TXT records via Entri (one-click DNS):
  - SSL validation TXT record
  - Google verification TXT record
- Let's Encrypt validates and issues certificate
- Certificate is imported to ACM and attached to CloudFront
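As a rough illustration of the SSL validation TXT record in Part 1, the sketch below computes the record name and value for an ACME DNS-01 challenge as defined in RFC 8555. The function name and the way the token and account thumbprint are passed in are assumptions for this example, not the service's actual code.

```python
import base64
import hashlib


def dns01_txt_record(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return the (name, value) of the TXT record for an ACME DNS-01 challenge."""
    # The key authorization is the challenge token joined to the account key thumbprint.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # ACME uses URL-safe base64 without padding for the record value.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value
```

For `www.example.com` the record name would be `_acme-challenge.www.example.com`; this is the kind of record the user adds through Entri.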
Part 2: Connect Domain + Verify Google
- SSL is ready on CloudFront
- User clicks "Point Domain"
- User switches www CNAME from origin to CloudFront via Entri
- Traffic now flows through our proxy
- Background tasks trigger:
  - ACM-native certificate upgrade (auto-renewal)
  - Google verification polling (10s × 12 = 2 min)
  - Once Google verified → IndexNow + GSC sitemap submission
  - If Google fails → IndexNow only (GSC skipped)
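The polling and fallback behaviour in the background tasks can be sketched as follows. This is a minimal sketch, assuming the check against Google is available as a callable; `poll_google_verification`, `submissions_after_polling`, and the step names are hypothetical.

```python
import time
from typing import Callable


def poll_google_verification(
    is_verified: Callable[[], bool],
    interval_seconds: float = 10.0,
    max_attempts: int = 12,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Check up to max_attempts times, waiting interval_seconds after each miss."""
    for _ in range(max_attempts):
        if is_verified():
            return True
        sleep(interval_seconds)
    return False


def submissions_after_polling(google_verified: bool) -> list[str]:
    """IndexNow always runs; the GSC sitemap submission needs a verified domain."""
    return ["indexnow", "gsc_sitemap"] if google_verified else ["indexnow"]
```

Passing `sleep` as a parameter keeps the loop testable without waiting the full two minutes.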
Status Flow
Endpoints
| Endpoint | Purpose |
|---|---|
| Get Proxy | Get current proxy status and DNS records |
| Start Certificate | Begin Let's Encrypt certificate request |
| Complete Certificate | Finish certificate after TXT record added |
| Start Google Verification | Get Google TXT verification token (Step 1) |
| Complete Google Verification | Verify domain + add to Search Console (now backend-driven in Step 2) |
| Resubmit Sitemap | Resubmit sitemap after new boosted pages (cron) |
| Mark Step Complete | Update status after Entri success + trigger Google verification polling |
| Verify DNS | Live DNS lookup verification |
| Disconnect Proxy | Get DNS records to restore original config |
| Mark Disconnect Complete | Update status after disconnect completes |
| Update Lambda@Edge | Deploy new bot patterns to all distributions |
Setup CloudFront Proxy is now part of the Onboarding flow - it runs automatically during generate-all.
Why Let's Encrypt + ACM?
We use a hybrid approach for SSL certificates:
Initial: Let's Encrypt
- Works with ANY DNS provider
- Bypasses CAA restrictions (Vercel/Netlify block ACM)
- User adds one TXT record, done
Upgrade: ACM-native
- Once www points to CloudFront, ACM can validate
- Background upgrade happens automatically
- ACM-native certs auto-renew forever
- User never has to touch DNS again
Lambda@Edge Bot Detection
The Lambda@Edge function at origin-request detects AI bots by User-Agent:
| Traffic Type | Destination | Host Header |
|---|---|---|
| AI Bot | org-slug.searchcompany.dev | AI site host |
| Human | Customer's origin | Original domain |
IndexNow Key File Routing
The Lambda also routes /search-company.txt to the AI site regardless of User-Agent. This ensures IndexNow can verify domain ownership when we submit URLs.
Without this, IndexNow's verification request would go to the human website (which doesn't have the key file) and fail.
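The routing rules above (AI bots and the IndexNow key file go to the AI site, everything else goes to the customer's origin) could look roughly like this in a Python Lambda@Edge handler. The bot patterns, the fixed AI site host, and the origin settings are placeholder assumptions; the real patterns come from bot_identifiers.py. The sketch also assumes the distribution forwards the User-Agent header to origin-request, since CloudFront otherwise replaces it.

```python
# Assumed values for illustration only; the real host is per organization.
AI_BOT_PATTERNS = ("gptbot", "claudebot", "perplexitybot")
AI_SITE_HOST = "org-slug.searchcompany.dev"
INDEXNOW_KEY_PATH = "/search-company.txt"


def is_ai_bot(user_agent: str) -> bool:
    """Case-insensitive substring match against the known bot patterns."""
    ua = user_agent.lower()
    return any(pattern in ua for pattern in AI_BOT_PATTERNS)


def handler(event, context):
    """Origin-request handler: send AI bots and the key file to the AI site."""
    request = event["Records"][0]["cf"]["request"]
    ua_headers = request["headers"].get("user-agent", [])
    user_agent = ua_headers[0]["value"] if ua_headers else ""

    if request["uri"] == INDEXNOW_KEY_PATH or is_ai_bot(user_agent):
        # Swap the origin and Host header so the AI site serves the request.
        request["origin"] = {
            "custom": {
                "domainName": AI_SITE_HOST,
                "port": 443,
                "protocol": "https",
                "path": "",
                "sslProtocols": ["TLSv1.2"],
                "readTimeout": 30,
                "keepaliveTimeout": 5,
                "customHeaders": {},
            }
        }
        request["headers"]["host"] = [{"key": "Host", "value": AI_SITE_HOST}]

    # Human traffic falls through untouched and reaches the customer's origin.
    return request
```

Checking the key file path before the User-Agent means IndexNow's verification request is routed correctly even though it does not identify itself as an AI bot.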
Bot Patterns - Single Source of Truth
Bot patterns are defined in one place: src/app/shared/bot_identifiers.py
This file is used by:
- AI Recommendation Store - Identifying which AI platform visited
- Lambda@Edge - Generated via generate_lambda_code.py
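To illustrate the single-source-of-truth idea, here is a minimal sketch of how a generator might render shared patterns into source code for the edge function. The contents of `BOT_PATTERNS` and the `render_lambda_patterns` helper are hypothetical; the actual structure of bot_identifiers.py and generate_lambda_code.py is not shown in this document.

```python
# Hypothetical stand-in for the patterns exported by bot_identifiers.py:
# platform name -> User-Agent substrings.
BOT_PATTERNS = {
    "ChatGPT": ["GPTBot", "ChatGPT-User"],
    "Claude": ["ClaudeBot"],
}


def render_lambda_patterns(patterns: dict[str, list[str]]) -> str:
    """Render a Python assignment that can be embedded in the Lambda source."""
    # Flatten, lowercase, and de-duplicate so the edge function does a cheap match.
    flat = sorted({p.lower() for group in patterns.values() for p in group})
    return f"AI_BOT_PATTERNS = {tuple(flat)!r}\n"
```

Because both consumers read the same mapping, adding a pattern in one place updates the platform attribution and the generated edge code together.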
Updating Bot Patterns
When new AI bots emerge:
Database Schema
The ai_sites table stores all proxy configuration:
| Column | Purpose |
|---|---|
| entity_id | Links to business entity |
| cloudfront_distribution_id | CloudFront distribution ID |
| cloudfront_domain | e.g., d123abc.cloudfront.net |
| custom_domain | e.g., www.example.com |
| origin_cname | Where www originally pointed |
| original_www_cname | Preserved for disconnect |
| certificate_arn | ACM certificate ARN |
| certificate_type | LETS_ENCRYPT or ACM_NATIVE |
| proxy_status | Current status |
| le_* | Let's Encrypt temp data |
| google_verification_token | TXT record value from Google Site Verification API |
| google_verification_status | PENDING, VERIFIED, or FAILED |
| google_sitemap_submitted_at | Last sitemap submission timestamp |
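For readers who prefer types to tables, the columns above could be modelled roughly as below. The column names and enum values come from the table; the Python types, the optionality of each field, and the class names are assumptions. The le_* columns are omitted because their exact names are not listed.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class CertificateType(str, Enum):
    LETS_ENCRYPT = "LETS_ENCRYPT"
    ACM_NATIVE = "ACM_NATIVE"


class GoogleVerificationStatus(str, Enum):
    PENDING = "PENDING"
    VERIFIED = "VERIFIED"
    FAILED = "FAILED"


@dataclass
class AiSite:
    """One row of ai_sites (types are assumed, not taken from the schema)."""
    entity_id: str
    cloudfront_distribution_id: Optional[str] = None
    cloudfront_domain: Optional[str] = None   # e.g. d123abc.cloudfront.net
    custom_domain: Optional[str] = None       # e.g. www.example.com
    origin_cname: Optional[str] = None
    original_www_cname: Optional[str] = None  # preserved for disconnect
    certificate_arn: Optional[str] = None
    certificate_type: Optional[CertificateType] = None
    proxy_status: Optional[str] = None        # full set of statuses not listed here
    google_verification_token: Optional[str] = None
    google_verification_status: Optional[GoogleVerificationStatus] = None
    google_sitemap_submitted_at: Optional[datetime] = None
```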
Disconnect and Relink Flow
Users can disconnect their domain and reconnect later:
Disconnect Flow
- User clicks "Disconnect Domain"
- Frontend calls /disconnect-proxy → gets restore DNS records
- Entri restores www CNAME to original
- Frontend calls /mark-disconnect-complete → status becomes DISCONNECTED
- CloudFront + ACM stay intact (for fast relink)
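The two backend steps of the disconnect flow can be sketched as pure functions over a site record. The record shape handed to Entri and both function names are assumptions made for illustration; only the column names and the DISCONNECTED status come from this document.

```python
def disconnect_records(site: dict) -> list[dict]:
    """Records that restore www to where it pointed before the proxy."""
    return [{"type": "CNAME", "host": "www", "value": site["original_www_cname"]}]


def mark_disconnect_complete(site: dict) -> dict:
    """Flip the status only; CloudFront and ACM fields are left in place."""
    return {**site, "proxy_status": "DISCONNECTED"}
```

Leaving `cloudfront_distribution_id` and `certificate_arn` untouched is what makes the later relink fast.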
Relink Flow (Fast)
- User clicks "Reconnect"
- No CloudFront setup needed - distribution already exists
- Start certificate (if expired) or skip
- 2-step DNS flow: SSL CNAME → www CNAME
- Status becomes DEPLOYED
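One way to picture the relink decision is a small planner that skips the certificate steps while the stored certificate is still valid. This is a sketch under assumptions: the step names are hypothetical, and skipping the SSL DNS record whenever the certificate is unexpired is a guess about the flow, not something the list above states.

```python
from datetime import datetime, timezone
from typing import Optional


def relink_steps(cert_expires_at: Optional[datetime], now: Optional[datetime] = None) -> list[str]:
    """Return the ordered steps for a relink; the distribution already exists."""
    now = now or datetime.now(timezone.utc)
    steps: list[str] = []
    if cert_expires_at is None or cert_expires_at <= now:
        # Certificate missing or expired: request a new one and validate it via DNS.
        steps += ["start_certificate", "add_ssl_record"]
    steps += ["point_www_cname", "mark_deployed"]
    return steps
```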