Firecrawl

Open-source web crawler and scraper API for AI applications, knowledge bases, and data pipelines.

Platform: Zeabur
Deployed: 2
Publisher: Zeabur

Firecrawl

🚀 Firecrawl is an open-source web crawler and scraper API for AI applications, knowledge bases, and data pipelines.

This template deploys the full self-hosted Firecrawl stack:

  • api — main Firecrawl process. Runs harness.js --start-docker, which manages the API server plus worker, extract-worker, and several NUQ queue worker subprocesses inside the same container.
  • playwright — browser automation microservice for JS-rendered pages.
  • redis — cache and rate-limit store.
  • rabbitmq — broker for the NUQ task queue.
  • postgres — Postgres 17 + pg_cron, pre-loaded with the NUQ queue schema.

Resource recommendation

Firecrawl is resource-intensive. The upstream docker-compose.yaml recommends at least 4 vCPU / 8 GB RAM for the api service and 2 vCPU / 4 GB RAM for playwright. We recommend deploying this template on a Pro plan or a dedicated server.

Configuration

Only the public domain is required. The deploy form also offers an optional Zeabur AI Hub API Key field: click Generate to mint a key on the spot, and the api container configures itself to use AI Hub on first start (OPENAI_BASE_URL=https://hnd1.aihub.zeabur.ai/v1, MODEL_NAME=claude-sonnet-4-5). AI Hub bills against your Zeabur credits, so no separate OpenAI signup is needed.

Other LLM providers (OpenAI direct, OpenRouter, xAI, Ollama, …): leave the AI Hub field blank, deploy, then set OPENAI_API_KEY (and optionally OPENAI_BASE_URL, MODEL_NAME) on the api service env tab. The auto-config only fires when AI Hub is chosen at deploy time, so user-set values are never overwritten. The full list of supported variables lives in apps/api/src/config.ts.

Already deployed with AI Hub and want to switch later? Just set OPENAI_API_KEY (your own key) on the api env tab and restart the api service — the wrapper sees the user-set value and skips the AI Hub auto-config.
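
The precedence described above can be sketched as a small shell function. This is an illustration only, not the template's actual wrapper: the function name resolve_llm_config and the variable ZEABUR_AI_HUB_API_KEY are hypothetical, while the base URL and model name are the AI Hub defaults quoted in this section.

```shell
#!/bin/sh
# Hypothetical sketch of the precedence rule, not the real wrapper script.
# A user-set OPENAI_API_KEY always wins; otherwise an AI Hub key
# (assumed variable name: ZEABUR_AI_HUB_API_KEY) fills in the defaults.
resolve_llm_config() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    # User-provided key: leave every variable untouched.
    echo "provider=user"
  elif [ -n "${ZEABUR_AI_HUB_API_KEY:-}" ]; then
    OPENAI_API_KEY="$ZEABUR_AI_HUB_API_KEY"
    # Only fill values the user has not already set.
    OPENAI_BASE_URL="${OPENAI_BASE_URL:-https://hnd1.aihub.zeabur.ai/v1}"
    MODEL_NAME="${MODEL_NAME:-claude-sonnet-4-5}"
    echo "provider=aihub"
  else
    echo "provider=none"
  fi
  echo "base_url=${OPENAI_BASE_URL:-}"
  echo "model=${MODEL_NAME:-}"
}
```

With only the AI Hub key present, the function reports the AI Hub defaults. Once OPENAI_API_KEY is set, it reports provider=user and leaves OPENAI_BASE_URL and MODEL_NAME as they were.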

Caveats when running on AI Hub:

  • AI Hub does not currently provide embedding models, so crawl link relevance ranking falls back to no ranking. The other LLM features (extract, agent, summary) are unaffected.
  • A few Firecrawl features (interactive browser-agent, direct-quote handling) call Google Gemini directly via @ai-sdk/google rather than through OPENAI_BASE_URL. To enable them, also set GOOGLE_GENERATIVE_AI_API_KEY on the api service.

BULL_AUTH_KEY (which protects the internal Bull queue dashboard) is auto-generated; you can find it on the api service env tab.

Authentication is disabled by default (USE_DB_AUTHENTICATION=false), so the API endpoint is public. If the domain is reachable from the internet, protect it with Cloudflare, an auth proxy, or a network ACL.

Quick test

After deployment finishes, hit the scrape endpoint:

curl -X POST https://<your-domain>/v1/scrape \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://docs.firecrawl.dev"}'
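
As a follow-up to the request above, the sketch below wraps the call in two small shell functions and asks for Markdown output. It assumes the upstream v1 scrape API accepts a formats array in the request body; the function names and the FIRECRAWL_URL variable are hypothetical, and the payload builder does no JSON escaping.

```shell
#!/bin/sh
# Build the JSON body for POST /v1/scrape.
# Assumption: the URL contains no double quotes or backslashes,
# because no JSON escaping is performed here.
scrape_payload() {
  printf '{"url": "%s", "formats": ["markdown"]}' "$1"
}

# Send the request. FIRECRAWL_URL is a hypothetical variable that
# holds https://<your-domain> (no trailing slash).
firecrawl_scrape() {
  curl -sS -X POST "${FIRECRAWL_URL}/v1/scrape" \
    -H 'Content-Type: application/json' \
    -d "$(scrape_payload "$1")"
}
```

After exporting FIRECRAWL_URL, calling firecrawl_scrape https://docs.firecrawl.dev sends the same request as the curl command above, with the formats field added.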

Queue dashboard

Firecrawl has no end-user UI — it is an API-only service. For operators, a built-in Bull Dashboard is mounted at:

https://<your-domain>/admin/<BULL_AUTH_KEY>/queues

Use it to monitor queue throughput, inspect active / completed / failed jobs, and replay errors. Find your BULL_AUTH_KEY on the api service env tab.
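
For scripted checks, the dashboard URL can be composed from the domain and the key. The helper names below are hypothetical, and the status check is only a sketch: it prints whatever HTTP status code the dashboard returns, without assuming a particular value.

```shell
#!/bin/sh
# Compose the dashboard URL from the public domain and BULL_AUTH_KEY.
bull_dashboard_url() {
  domain=${1%/}   # tolerate a single trailing slash on the domain
  printf 'https://%s/admin/%s/queues\n' "$domain" "$2"
}

# Print only the HTTP status code returned by the dashboard.
bull_dashboard_status() {
  curl -sS -o /dev/null -w '%{http_code}\n' "$(bull_dashboard_url "$1" "$2")"
}
```

Because BULL_AUTH_KEY is part of the URL path, treat the composed URL as a secret and keep it out of shared logs.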

Versioning

Firecrawl upstream does not publish semver tags — only :latest, which is rebuilt every time main advances. To give you reproducible deploys, this template pins each image by digest (@sha256:...). The current pin corresponds to the build of 2026-05-04. Updates ship as new template revisions.
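
To illustrate what a digest pin looks like, the sketch below joins a repository name and a digest into an image reference and rejects malformed digests. The function name pin_image is hypothetical, and the repository in the usage note is a placeholder, not the image this template actually pins.

```shell
#!/bin/sh
# Join a repository and a digest into a pinned image reference.
# Accepts only "sha256:" followed by exactly 64 lowercase hex characters.
pin_image() {
  repo=$1
  digest=$2
  case $digest in
    sha256:*) ;;
    *) echo "digest must start with sha256:" >&2; return 1 ;;
  esac
  hex=${digest#sha256:}
  case $hex in
    *[!0-9a-f]*) echo "digest has non-hex characters" >&2; return 1 ;;
  esac
  [ "${#hex}" -eq 64 ] || { echo "digest must have 64 hex characters" >&2; return 1; }
  printf '%s@%s\n' "$repo" "$digest"
}
```

Calling pin_image example/app with a valid digest prints example/app@sha256:... in the same form the template uses. For an image you have already pulled, docker inspect --format '{{index .RepoDigests 0}}' <image> prints a reference of this kind.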

Reference