Internal Service: This is not an HTTP endpoint. It's called directly by the generate-all orchestrator as a fire-and-forget background task.

Purpose

Uses Firecrawl's Agent API to discover up to 10 competitors for a business. Runs in GROUP 1c (parallel with scrape, non-blocking).

Function Signature

async def discover_competitors(
    business_url: str,
    business_name: str,
    entity_id: str,
    max_competitors: int = 10
) -> dict

Parameters

Parameter        Type  Default   Description
business_url     str   required  The business website URL
business_name    str   required  The business name
entity_id        str   required  The business entity ID
max_competitors  int   10        Maximum competitors to find

Returns

{
  "status": "success",
  "count": 8,
  "competitors": [
    {
      "name": "Competitor A",
      "url": "https://competitor-a.com",
      "description": "A leading provider of..."
    },
    {
      "name": "Competitor B", 
      "url": "https://competitor-b.com",
      "description": "Enterprise solution for..."
    }
  ]
}

Firecrawl Agent API

This service uses Firecrawl's Agent API (POST /v2/agent), an AI-powered endpoint that can browse the web and extract structured data based on a natural language prompt.
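
As a rough illustration of what "a natural language prompt" plus structured extraction could look like for this service, the sketch below builds a request body without sending it. The helper name build_agent_request and the body field names "prompt" and "schema" are assumptions made for this example, not confirmed details of firecrawl_agent.py or of the Firecrawl API; check the Firecrawl reference before relying on them.

```python
AGENT_PATH = "/v2/agent"  # endpoint path as named in this document


def build_agent_request(
    business_url: str,
    business_name: str,
    max_competitors: int = 10,
) -> dict:
    """Build a hypothetical JSON body for the agent call (assumed shape)."""
    prompt = (
        f"Find up to {max_competitors} direct competitors of {business_name} "
        f"({business_url}). For each one, return its name, website URL, "
        "and a one-sentence description."
    )
    # JSON Schema mirroring the "competitors" list shown under Returns.
    schema = {
        "type": "object",
        "properties": {
            "competitors": {
                "type": "array",
                "maxItems": max_competitors,
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "url": {"type": "string"},
                        "description": {"type": "string"},
                    },
                    "required": ["name", "url"],
                },
            }
        },
        "required": ["competitors"],
    }
    # "prompt" and "schema" are assumed field names, not verified ones.
    return {"prompt": prompt, "schema": schema}
```

Keeping the payload builder separate from the HTTP call makes the prompt and schema easy to unit test without network access.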

Database Schema

Competitors are stored in the competitors table:
CREATE TABLE competitors (
  id UUID PRIMARY KEY,
  entity_id UUID REFERENCES entities(id),
  name TEXT NOT NULL,
  url TEXT NOT NULL,
  description TEXT,
  favicon_url TEXT,
  source TEXT DEFAULT 'firecrawl_agent',
  created_at TIMESTAMPTZ DEFAULT NOW()
);
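
A minimal sketch of how discovered competitors might be mapped onto this table. The helper to_competitor_rows, the INSERT_SQL constant, and the %s placeholder style (as used by psycopg-style drivers) are assumptions for illustration; the actual persistence code in service.py may differ.

```python
import uuid

# Column order matches the competitors table above; created_at uses its default.
INSERT_SQL = (
    "INSERT INTO competitors "
    "(id, entity_id, name, url, description, favicon_url, source) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s)"
)


def to_competitor_rows(entity_id: str, competitors: list[dict]) -> list[tuple]:
    """Convert competitor dicts into parameter tuples for INSERT_SQL."""
    rows = []
    for competitor in competitors:
        name = competitor.get("name")
        url = competitor.get("url")
        if not name or not url:
            # name and url are NOT NULL in the schema, so skip incomplete items.
            continue
        rows.append(
            (
                str(uuid.uuid4()),
                entity_id,
                name,
                url,
                competitor.get("description"),  # nullable
                competitor.get("favicon_url"),  # nullable
                "firecrawl_agent",  # same value as the column default
            )
        )
    return rows
```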

Execution Timing

GROUP 1a (Business Info) ████████████████████████████████████
GROUP 1b (Scrape)        ████████████████████████████████████
GROUP 1c (Competitors)   ████████████████████████████████████  ← Parallel, non-blocking
GROUP 1d (Products)      ████████████████████████████████████  ← Parallel, non-blocking
GROUP 2                  ─────────────────────────████████████  ← Starts when 1a+1b+1d complete
Competitor discovery runs in parallel with scraping but doesn't block GROUP 2. This means if competitor discovery takes longer, the main onboarding flow continues without waiting.
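
The timing above can be sketched with asyncio: the groups that gate GROUP 2 are awaited together, while competitor discovery is started as a background task and never awaited on the main path. The function names, step labels, and sleep durations below are placeholders for illustration, not the orchestrator's real code.

```python
import asyncio


async def step(name: str, seconds: float, log: list[str]) -> str:
    """Stand-in for a real onboarding step; records when it finishes."""
    await asyncio.sleep(seconds)
    log.append(name)
    return name


async def run_onboarding(log: list[str]) -> asyncio.Task:
    # GROUP 1c: started in the background, not awaited before GROUP 2.
    competitors_task = asyncio.create_task(step("1c competitors", 0.05, log))

    # GROUP 1a, 1b, 1d: GROUP 2 waits for all three.
    await asyncio.gather(
        step("1a business info", 0.01, log),
        step("1b scrape", 0.01, log),
        step("1d products", 0.01, log),
    )

    # GROUP 2 starts even though 1c may still be running.
    await step("2", 0.0, log)

    # Return the task so the caller keeps a reference to it.
    return competitors_task


async def main() -> list[str]:
    log: list[str] = []
    task = await run_onboarding(log)
    await task  # only so this demo can observe the final ordering
    return log
```

Holding a reference to the background task matters in asyncio, because a task with no remaining references can be garbage-collected before it finishes.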

Code Location

src/app/apis/onboarding/services/discover_competitors/
├── __init__.py
├── service.py              # Main discover_competitors function
└── children/
    ├── favicon_handler.py  # Fetches competitor favicons
    └── firecrawl_agent.py  # Firecrawl Agent API wrapper

Error Handling

{
  "status": "error",
  "error": "Firecrawl Agent API timeout"
}
Since this is fire-and-forget, errors don't fail the entire onboarding. The orchestrator logs the error and continues.
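
One way to get this behaviour is to wrap the call so that any exception is logged and converted into the error shape shown above. The wrapper safe_discover, the logger name, and the failing_discover stand-in are hypothetical and exist only for this sketch; the document does not say whether the conversion happens in the service or in the orchestrator.

```python
import asyncio
import logging

logger = logging.getLogger("onboarding")  # assumed logger name


async def safe_discover(discover, **kwargs) -> dict:
    """Run a discovery coroutine function; never let its exception propagate."""
    try:
        return await discover(**kwargs)
    except Exception as exc:
        # Log and continue: a failure here must not fail onboarding.
        logger.error("Competitor discovery failed: %s", exc)
        return {"status": "error", "error": str(exc)}


async def failing_discover(**kwargs) -> dict:
    """Stand-in for discover_competitors that simulates a timeout."""
    raise TimeoutError("Firecrawl Agent API timeout")
```

Calling asyncio.run(safe_discover(failing_discover, entity_id="e1")) returns the error dictionary rather than raising, so a background task built on this wrapper always completes without an unhandled exception.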