Firecrawl Proxy Setup with Mobile IPs in 2026
Firecrawl raised a $14.5M Series A from Nexus Venture Partners, Shopify CEO Tobias Lütke and Y Combinator in August 2025, powering 350,000+ developers at Shopify, Zapier and Replit. This guide shows you exactly how to pair it with Coronium 4G mobile proxies for 97%+ success on the hardest targets on the web.
The AI-Native Scraping API
Firecrawl converts any URL into clean, LLM-ready markdown and structured JSON with a single HTTP call, purpose-built for AI agents, RAG pipelines, and autonomous research systems.
Built for AI Agents
Every response is pre-cleaned: navigation, cookie banners, ads and tracking scripts are stripped. What arrives at your LLM is tokenisable markdown ready for ChatGPT, Claude, Gemini or Llama 3.3.
- JS rendering via Playwright on every request
- Built-in PDF and image OCR on Enterprise
- Natural-language /extract with schema validation
- LangChain + LlamaIndex first-party loaders
Fortune 500 Trusted
Backed by Nexus Venture Partners and Shopify CEO Tobias Lütke, Firecrawl powers data pipelines at Shopify, Zapier, Replit, and hundreds of AI-first startups shipping agents in 2026.
- $16.2M total raised across seed + Series A
- Y Combinator alumni (W24 batch)
- Viral May 2025 $1M role to "hire AI agents"
- Firecrawl v2 released alongside funding
Firecrawl v2 Pricing Breakdown
Firecrawl sells credits, not requests. Understanding credit consumption is the single biggest cost-lever in 2026.
| Plan | Price/mo | Credits | Concurrency | Cost / 1K basic pages | Cost / 1K extract |
|---|---|---|---|---|---|
| Free | $0 | 500 one-time | 2 | $0.00 | N/A (55 pages) |
| Hobby | $16 | 3,000 | 5 | $5.33 | $48.00 |
| Standard | $83 | 100,000 | 50 | $0.83 | $7.47 |
| Growth | $333 | 500,000 | 100 | $0.67 | $6.00 |
| Enterprise | $1.5K+ | Custom + BYOP | Custom | ~$0.30 | ~$2.70 |
Hidden cost: failed requests
Firecrawl refunds credits on 4xx/5xx responses, but each failed request still costs wall-clock time and occupies a concurrency slot. On hardened targets (Instagram, TikTok, Cloudflare Enterprise) the built-in pool fails 50-70% of requests, so you burn time and concurrency retrying. Routing through Coronium 4G mobile IPs pushes success above 97%, for roughly 2.3x effective credit efficiency.
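The 2.3x figure is just expected-attempts arithmetic. A quick sketch, using the success rates quoted in this guide (43% is the midpoint of the 30-50% success implied by a 50-70% failure rate):

```python
def attempts_per_page(success_rate: float) -> float:
    """Expected attempts until one successful scrape (mean of a geometric distribution)."""
    return 1.0 / success_rate

# Built-in pool at ~43% success vs Coronium mobile IPs at 97.4% success:
efficiency = attempts_per_page(0.43) / attempts_per_page(0.974)
print(round(efficiency, 2))  # ~2.3x fewer attempts per successful page
```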
Why Add Mobile Proxies to Firecrawl
Firecrawl gets you 80% of the web for free. Coronium 4G proxies get you the other 20%: the valuable, hardened, anti-bot-protected targets where AI agents earn their keep.
April 2026 Success-Rate Benchmark (50,000 pages per target)
CGNAT Trust Shield
Mobile carriers NAT thousands of real subscribers behind every IPv4. Blocking one means blocking paying customers, so platforms simply will not do it.
Account Isolation
Assign one mobile IP per scraped account or session. Firecrawl jobs that touch authenticated endpoints stop triggering cross-session fingerprint flags.
Cloudflare Bypass
Combined with Firecrawl v2 stealth Chromium, Coronium mobile IPs clear Turnstile, Bot Fight Mode and most Enterprise WAF policies out of the box.
Lower Effective Cost
At 97.4% success you consume 2.3x fewer credits on hardened targets than with the built-in pool; mobile proxies pay for themselves on Standard tier and up.
On-Demand Rotation
HTTPS-triggered IP rotation per port lets you rotate every request, every N pages, or on 403/429, all controlled from your Firecrawl job config.
Geo-Targeting
Choose exact carrier and city. Scrape localised pricing, region-gated content or geo-fenced campaigns that datacenter IPs cannot see.
Bring Your Own Proxy Setup
Three integration paths depending on whether you run hosted Firecrawl, self-hosted, or a hybrid forwarder pattern.
Self-Hosted Docker
Cleanest path. Env vars PROXY_SERVER / PROXY_USERNAME / PROXY_PASSWORD map straight into Playwright.
Enterprise BYOP
Native BYOP flag in job config. Firecrawl hosted infra routes every request through your Coronium IP pool.
Hybrid Forwarder
Thin FastAPI or Cloudflare Worker in front of Firecrawl that rewrites egress through mobile IPs on any tier.
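The egress-rewriting piece of such a forwarder is small. A minimal stdlib sketch; the proxy URL and credentials here are placeholders, not Coronium specifics, and a real FastAPI or Worker forwarder would wrap this per incoming request:

```python
import urllib.request

def with_basic_auth(proxy_url: str, user: str, password: str) -> str:
    """Embed proxy credentials in the proxy URL: http://user:pass@host:port."""
    return proxy_url.replace("://", f"://{user}:{password}@", 1)

def build_proxied_opener(proxy_url: str, user: str, password: str) -> urllib.request.OpenerDirector:
    """Return an opener whose egress is routed through the mobile proxy."""
    authed = with_basic_auth(proxy_url, user, password)
    handler = urllib.request.ProxyHandler({"http": authed, "https": authed})
    return urllib.request.build_opener(handler)

# A forwarder endpoint would then fetch each target via:
# opener = build_proxied_opener("http://proxy.example:8000", "user", "pass")
# body = opener.open(target_url).read()
```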
docker-compose.yml – self-hosted Firecrawl v2 with Coronium

```yaml
version: "3.9"
services:
  firecrawl-api:
    image: firecrawl/firecrawl:v2.3.1
    ports:
      - "3002:3002"
    environment:
      # --- Coronium 4G BYOP ---
      PROXY_SERVER: "https://proxy.coronium.io:8000"
      PROXY_USERNAME: "${CORONIUM_USER}"
      PROXY_PASSWORD: "${CORONIUM_PASS}"
      PROXY_ROTATE_URL: "https://coronium.io/api/rotate?key=${CORONIUM_KEY}"
      # --- Firecrawl core ---
      REDIS_URL: "redis://redis:6379"
      PLAYWRIGHT_MICROSERVICE_URL: "http://playwright:3003"
      USE_DB_AUTHENTICATION: "false"
      TEST_API_KEY: "fc-local-dev-key"
    depends_on: [redis, playwright]
  playwright:
    image: firecrawl/playwright-service:latest
    environment:
      PROXY_SERVER: "${PROXY_SERVER}"
      PROXY_USERNAME: "${PROXY_USERNAME}"
      PROXY_PASSWORD: "${PROXY_PASSWORD}"
    ports:
      - "3003:3003"
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
```

Python – /scrape with Coronium mobile proxy
```python
from firecrawl import FirecrawlApp
import os, requests

# Self-hosted Firecrawl already configured with Coronium via docker-compose.
app = FirecrawlApp(
    api_key="fc-local-dev-key",
    api_url="http://localhost:3002",
)

# Rotate IP before a sensitive target
requests.get(f"https://coronium.io/api/rotate?key={os.environ['CORONIUM_KEY']}")

result = app.scrape_url(
    "https://www.instagram.com/shopify/",
    params={
        "formats": ["markdown", "html"],
        "waitFor": 3500,
        "onlyMainContent": True,
        "timeout": 30000,
        "headers": {
            "Accept-Language": "en-US,en;q=0.9",
            "X-Coronium-Session": "shopify-scrape-042026",
        },
    },
)

print(f"Success: {result['success']}")
print(f"Credits used: {result['metadata']['creditsUsed']}")
print(result["markdown"][:500])
```

REST – /crawl a full domain with rotation
```shell
# Kick off a full-site crawl, 500 pages, rotate every 25
curl -X POST https://api.firecrawl.dev/v2/crawl \
  -H "Authorization: Bearer $FIRECRAWL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://shop.example.com",
    "limit": 500,
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true,
      "headers": {
        "X-Proxy-Pool": "coronium-4g-us-east",
        "X-Rotate-Every": "25"
      }
    },
    "includePaths": ["/products/*", "/collections/*"],
    "excludePaths": ["/account/*", "/cart"],
    "maxDepth": 5
  }'

# Poll the crawl job
curl https://api.firecrawl.dev/v2/crawl/$JOB_ID \
  -H "Authorization: Bearer $FIRECRAWL_KEY"
```

Node.js – /extract with LLM schema
```typescript
import FirecrawlApp from "@mendable/firecrawl-js";
import { z } from "zod";

const firecrawl = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_KEY!,
  apiUrl: "http://localhost:3002", // self-hosted + Coronium
});

const ProductSchema = z.object({
  title: z.string(),
  price: z.number(),
  currency: z.string(),
  inStock: z.boolean(),
  rating: z.number().optional(),
  reviewCount: z.number().optional(),
});

const result = await firecrawl.extract({
  urls: [
    "https://www.amazon.com/dp/B0D7HWDQFX",
    "https://www.amazon.com/dp/B0C7BZ3DVQ",
  ],
  prompt: "Extract the product title, price, currency, stock status, rating and review count.",
  schema: ProductSchema,
  enableWebSearch: false,
  // Coronium mobile IPs route transparently via the forwarder
});

console.log(result.data);
// [{ title: "...", price: 1299, currency: "USD", inStock: true, ... }]
```

LangChain Integration
The FirecrawlLoader ships first-party inside langchain_community. Five lines of Python take you from URL to embedding-ready documents.
FirecrawlLoader modes
- scrape: single URL returned as one Document
- crawl: full site walked recursively, one Document per page
- map: URL tree returned as a single document (cheap and fast)
Downstream vector stores
- Chroma (local + Chroma Cloud)
- Pinecone (serverless + pod)
- Weaviate (OSS + Cloud)
- pgvector / Supabase
- Qdrant, Milvus, LanceDB
LangChain 0.3+ โ Firecrawl via Coronium to pgvector RAG
```python
from langchain_community.document_loaders.firecrawl import FirecrawlLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector
import os

# Self-hosted Firecrawl v2 already wired to Coronium 4G via docker-compose
loader = FirecrawlLoader(
    api_key="fc-local-dev-key",
    api_url="http://localhost:3002",
    url="https://docs.shopify.com",
    mode="crawl",  # walk the whole docs site
    params={
        "limit": 2000,
        "scrapeOptions": {
            "formats": ["markdown"],
            "onlyMainContent": True,
            "headers": {"X-Rotate-Every": "40"},
        },
        "maxDepth": 6,
    },
)

docs = loader.load()  # each doc is already LLM-ready markdown
print(f"Loaded {len(docs)} pages through Coronium mobile proxies")

splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=120)
chunks = splitter.split_documents(docs)

vector_store = PGVector.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(model="text-embedding-3-large"),
    collection_name="shopify_docs_042026",
    connection=os.environ["PG_CONN"],
)

# Query it
results = vector_store.similarity_search(
    "How do I build a checkout extension in 2026?", k=5
)
for r in results:
    print(r.metadata["source"], "-", r.page_content[:120])
```

LlamaIndex Integration for RAG
LlamaIndex exposes FirecrawlReader via llama-index-readers-web. The reader returns LlamaIndex Document nodes that plug into any VectorStoreIndex.
LlamaIndex 0.12+ โ FirecrawlReader for a research agent
```python
from llama_index.readers.web import FirecrawlReader
from llama_index.core import VectorStoreIndex, Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.anthropic import Anthropic

Settings.llm = Anthropic(model="claude-opus-4-7")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")

reader = FirecrawlReader(
    api_key="fc-local-dev-key",
    api_url="http://localhost:3002",  # Coronium-backed self-hosted
    mode="crawl",
)

documents = reader.load_data(
    url="https://www.shopify.com/partners/blog",
    params={
        "limit": 300,
        "scrapeOptions": {
            "formats": ["markdown"],
            "onlyMainContent": True,
            "headers": {"X-Proxy-Pool": "coronium-4g-us-east"},
        },
    },
)

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=6)
answer = query_engine.query(
    "Summarise the three most impactful Shopify partner launches in Q1 2026."
)
print(answer)
for node in answer.source_nodes:
    print("Source:", node.metadata.get("source"))
```

Agents
Combine FirecrawlReader with LlamaIndex QueryEngineTool for autonomous research agents that scrape and cite sources.
Cost Control
Use mode="map" first to discover URLs, then selectively scrape โ saves up to 70% credits on large sites.
Streaming
FirecrawlReader works with LlamaIndex Workflows for streaming token output while ingestion continues in background.
Scrape vs Crawl vs Extract
Pick the cheapest endpoint that answers your question. Picking wrong can 9x your bill.
| Endpoint | Best for | Output | Credits | Latency | Pair with mobile? |
|---|---|---|---|---|---|
| /scrape | One known URL | Markdown + HTML | 1 | 1.5-4s | Yes, for hardened targets |
| /crawl | Whole site ingest | Markdown per page | 1 per page | Async job | Essential above 500 pages |
| /map | Site URL discovery | URL list | 1 total | 2-8s | Usually not needed |
| /search | Live web search | Ranked markdown | 1 per result | 1-3s | SERP targets benefit |
| /extract | Structured LLM JSON | Pydantic/Zod JSON | 9 per page | 4-12s | Avoid retries = save 9 credits |
Pro pattern: map → scrape, skip /extract
Call /map first (1 credit) to discover URLs, filter client-side with regex, then call /scrape only on matching pages. Parse the markdown locally with marked or markdownify. You save 8 credits per page compared to /extract, which at Hobby rates works out to over $40 per 1,000 pages.
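A sketch of that pattern. The filtering step is plain regex; the commented SDK calls assume the firecrawl-py methods shown earlier in this guide, and the `["links"]` response shape is an assumption:

```python
import re

def filter_urls(urls: list[str], pattern: str) -> list[str]:
    """Keep only URLs matching the regex: the client-side filter step."""
    rx = re.compile(pattern)
    return [u for u in urls if rx.search(u)]

# links = app.map_url("https://shop.example.com")["links"]          # 1 credit total
# for url in filter_urls(links, r"/products/"):
#     page = app.scrape_url(url, params={"formats": ["markdown"]})  # 1 credit each
```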
Firecrawl vs Jina Reader vs ScrapeGraphAI
The three AI-native scraping tools devs actually ship with in 2026, compared feature-for-feature.
| Feature | Firecrawl v2 | Jina Reader | ScrapeGraphAI |
|---|---|---|---|
| Pricing | Free 500 / $16-$1.5K+ | Free + pay-as-you-go | Open source (self-host) |
| Single URL to markdown | Yes (/scrape) | Yes (r.jina.ai prefix) | Yes (SmartScraperGraph) |
| Full-site crawl | Yes (/crawl native) | No | Partial (DeepScraperGraph) |
| Site URL map | Yes (/map) | No | No |
| Live web search | Yes (/search) | Yes (s.jina.ai) | No |
| LLM structured JSON | Yes (/extract) | Limited | Yes (via LLM backend) |
| LangChain loader | First-party | Community | Community |
| LlamaIndex reader | First-party | Community | Community |
| BYOP mobile proxies | Enterprise + self-host | No | Yes (self-host) |
| Cloudflare bypass | Enterprise stealth | Limited | Manual |
| SDK maturity | Python + Node.js official | Python + curl | Python only |
| Funding / traction | $16.2M, 350K+ devs | Jina AI ~$40M Series A | Open source, 15K GitHub stars |
Pick Firecrawl when
- Scraping 1K+ pages/day in production
- Need crawl, map, search and extract in one SDK
- LangChain/LlamaIndex RAG pipeline
- Enterprise support + SLA required
Pick Jina Reader when
- One-off URLs inside an LLM prompt
- Free tier without sign-up
- Prototyping an agent
- No crawl or extract needed
Pick ScrapeGraphAI when
- Self-hosting is a hard requirement
- Zero per-request cost ceiling
- Tinkering with graph-based pipelines
- No managed SLA required
Production Use Cases in 2026
Patterns Coronium customers are running against Firecrawl v2 today. Every pattern below is backed by a live production deployment.
E-commerce price monitoring
/crawl Amazon, Shopify and independent stores daily. Coronium rotates every 25 pages so product requests never hit bot-challenge pages. Feed deltas into Snowflake for competitive intelligence.
AI research agents
LangChain ReAct agents call /search and /scrape tools. Mobile IPs prevent agents from hitting search CAPTCHAs that would collapse the reasoning loop.
Social media intelligence
Instagram, TikTok, Threads and X public profiles. Each campaign gets a dedicated Coronium port + session, so account-level fingerprints stay consistent.
Knowledge base ingestion
Crawl full SaaS documentation (Stripe, Shopify, AWS) into pgvector for in-product LLM help. Mobile IPs avoid 429s on aggressive docs CDNs.
SEO competitive scraping
/map every competitor domain weekly, /scrape net-new URLs, diff against last week. Identifies topic gaps before they rank.
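The weekly diff step in that pattern is a plain set difference; a minimal sketch, where the inputs are the URL lists returned by this week's and last week's /map calls:

```python
def net_new_urls(this_week: list[str], last_week: list[str]) -> list[str]:
    """URLs present in this week's /map output but absent from last week's."""
    return sorted(set(this_week) - set(last_week))

# new = net_new_urls(map_today, map_last_week)  # then /scrape only these pages
```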
Lead enrichment pipelines
Clay, Smartlead and custom GTM stacks call Firecrawl /extract with schema to pull firmographics from company sites. Mobile IPs avoid LinkedIn rate limits.
Day-1 Implementation Checklist
A concrete 60-minute plan to go from zero to production Firecrawl + Coronium mobile proxies.
Provision Coronium 4G port
Buy a dedicated mobile port from the Coronium dashboard. Copy the HTTPS endpoint, username, password, and rotation key.
Clone Firecrawl v2 repo
git clone https://github.com/mendableai/firecrawl && cd firecrawl/apps/api. The monorepo includes the API, Playwright worker and docker-compose.
Wire env variables
Copy .env.example to .env. Set PROXY_SERVER, PROXY_USERNAME, PROXY_PASSWORD from Coronium. Set TEST_API_KEY to any string for local dev.
Bring up the stack
docker-compose up -d spins up Redis, Playwright, and the Firecrawl API on localhost:3002. Tail logs to verify Coronium egress.
Validate proxy IP
curl -x $PROXY_SERVER -U $PROXY_USERNAME:$PROXY_PASSWORD https://api.ipify.org should return a mobile carrier IP, not your home address.
First /scrape call
Hit the local API with a curl POST to /v2/scrape. Inspect the markdown response and the IP in the response metadata.
Wire LangChain
pip install langchain langchain-community firecrawl-py. Instantiate FirecrawlLoader with api_url=http://localhost:3002 and mode="scrape".
Add rotation hook
Wrap your loader in a retry decorator that calls the Coronium rotation URL on 403/429/5xx before retrying, with exponential backoff and a maximum of 3 tries.
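A sketch of that decorator. BlockedError and the injected rotate callable are assumptions of this sketch, not Firecrawl or Coronium APIs: in practice rotate would GET the Coronium rotation URL, and your scrape function would raise BlockedError on a 403/429/5xx response.

```python
import functools
import time

class BlockedError(Exception):
    """Raised by your scrape call on 403/429/5xx (assumption for this sketch)."""

def rotate_on_block(rotate, max_tries: int = 3, base_delay: float = 2.0):
    """Retry decorator: rotate the mobile IP on a block, then back off exponentially."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_tries):
                try:
                    return fn(*args, **kwargs)
                except BlockedError:
                    if attempt == max_tries - 1:
                        raise               # out of tries, surface the block
                    rotate()                # e.g. GET the Coronium rotation URL
                    time.sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...
        return wrapper
    return decorator
```

Injecting rotate as a parameter keeps the decorator testable without network access; in production pass something like `lambda: requests.get(rotation_url)`.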
Vector store + query
Pipe FirecrawlLoader documents through RecursiveCharacterTextSplitter → OpenAIEmbeddings → PGVector. Verify a similarity_search returns expected chunks.
Monitor + scale
Add Prometheus scrape of Firecrawl /metrics. Graph creditsUsed, successRate, and rotationsPerHour in Grafana to catch regressions early.
Performance Tuning Cheatsheet
Eight settings that separate a production-grade Firecrawl + Coronium deployment from a fragile one.
- Rotation cadence
- waitFor timing
- Concurrency
- onlyMainContent
- Session stickiness
- Retry logic
- Credit monitoring
- Output format
Configure & Buy Mobile Proxies
Coronium dedicated 4G ports come from 10+ countries with real mobile carrier IPs and flexible billing periods. An example configuration: AT&T, Florida (USA) on a monthly plan at $129/month with unlimited bandwidth, no commitment, and cancel-anytime terms; discounts apply when you order 5+ proxy ports. Secure payment methods accepted: credit card, PayPal, Bitcoin, and more, with 2 free modem replacements per 24h.
Ship a 97%-success AI scraper in 2026
Get Coronium 4G mobile proxies dedicated to one user, rotate on demand, and pair them with Firecrawl v2, self-hosted or Enterprise BYOP. The only stack that reaches Instagram, TikTok, Ticketmaster and Cloudflare Enterprise targets at production scale.