API Documentation
Everything you need to start scraping the web with CrawlerAPI. Our REST API gives you programmatic access to hosted scraping infrastructure: single-page scraping, full-site crawling, AI-powered extraction, screenshots, and URL mapping.
Quick Start
1. Create an account and get your free 1,000 credits.
2. Generate an API key from your dashboard.
3. Make your first API request using the example below.
Base URL
https://api.crawlerapi.ai/api/v1/
Your First Request
# Scrape a webpage and get the HTML content
curl -X POST https://api.crawlerapi.ai/api/v1/scrape \
  -H "Authorization: Api-Key YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["json", "html"]
  }'
Authentication
All API requests require authentication using an API key. Include your key in the Authorization header of every request.
Authorization: Api-Key YOUR_API_KEY
Keep your API key secure
Never expose your API key in client-side code, public repositories, or browser requests. Use server-side calls or environment variables.
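One way to follow this advice is to load the key from an environment variable at runtime. A minimal sketch (the `CRAWLERAPI_KEY` variable name is illustrative, not something the API mandates):

```python
import os

# Illustrative variable name; any name works, as long as the key
# itself stays out of source control.
api_key = os.environ.get("CRAWLERAPI_KEY", "YOUR_API_KEY")

headers = {
    "Authorization": f"Api-Key {api_key}",
    "Content-Type": "application/json",
}
```

Set the variable once in your shell or deployment config (`export CRAWLERAPI_KEY=...`) and the code never needs to change between environments.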
Example with Python
import requests
headers = {
    "Authorization": "Api-Key YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(
    "https://api.crawlerapi.ai/api/v1/scrape",
    headers=headers,
    json={"url": "https://example.com"}
)

data = response.json()
print(data)
Scrape
Extract data from a single URL. Returns the page content in your chosen format with optional element targeting via CSS selectors.
POST /api/v1/scrape
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| formats | array | No |
| selectors | object | No |
| render_js | boolean | No |
| wait_for | string | No |
| proxy_country | string | No |
Example
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/scrape",
    headers=headers,
    json={
        "url": "https://example.com/product/123",
        "formats": ["json"],
        "selectors": {
            "title": "h1.product-title",
            "price": ".price-current",
            "rating": ".star-rating",
            "description": "#product-description"
        },
        "render_js": True,
        "proxy_country": "US"
    }
)
Crawl
Crawl an entire website or a set of pages. The crawler follows links according to your rules and extracts data from each page visited.
POST /api/v1/crawl
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| max_pages | integer | No |
| max_depth | integer | No |
| include_patterns | array | No |
| exclude_patterns | array | No |
| selectors | object | No |
Example
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/crawl",
    headers=headers,
    json={
        "url": "https://example.com/blog/",
        "max_pages": 50,
        "max_depth": 2,
        "include_patterns": ["/blog/.*"],
        "selectors": {
            "title": "h1",
            "content": "article",
            "date": "time[datetime]"
        }
    }
)

# The crawl runs asynchronously. Poll for results:
crawl_id = response.json()["crawl_id"]
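With the `crawl_id` in hand, you can poll until the job finishes. The status route isn't named above, so this sketch takes any `fetch_status` callable; wiring it to a `GET /api/v1/crawl/{crawl_id}` endpoint is an assumption — confirm the actual polling route in your dashboard:

```python
import time

def wait_for_crawl(crawl_id, fetch_status, poll_interval=5, max_polls=120):
    """Poll an async crawl until it succeeds or fails.

    fetch_status is any callable mapping a crawl_id to the parsed JSON
    status payload. The terminal "status" values mirror the response
    envelope documented below ("success" / "error").
    """
    for _ in range(max_polls):
        result = fetch_status(crawl_id)
        if result.get("status") in ("success", "error"):
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"crawl {crawl_id} did not finish in time")
```

For example, `wait_for_crawl(crawl_id, lambda cid: requests.get(f"https://api.crawlerapi.ai/api/v1/crawl/{cid}", headers=headers).json())` — again assuming that status URL.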
Extract (AI)
Use AI to extract structured data from any page without writing selectors. Describe the data you need in natural language and optionally define an output schema.
Note: AI extraction is in beta.
POST /api/v1/extract
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| prompt | string | Yes |
| schema | object | No |
| render_js | boolean | No |
Example
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/extract",
    headers=headers,
    json={
        "url": "https://news.example.com",
        "prompt": "Extract all articles with their headlines, authors, dates, and summaries",
        "schema": {
            "articles": [{
                "headline": "string",
                "author": "string",
                "date": "date",
                "summary": "string"
            }]
        }
    }
)
Screenshot
Capture full-page or element-level screenshots of any URL. Supports PNG, JPEG, and PDF output with configurable viewport dimensions.
POST /api/v1/screenshot
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| format | string | No |
| full_page | boolean | No |
| viewport_width | integer | No |
| viewport_height | integer | No |
| selector | string | No |
Example
curl -X POST https://api.crawlerapi.ai/api/v1/screenshot \
  -H "Authorization: Api-Key YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "format": "png",
    "full_page": true,
    "viewport_width": 1440,
    "viewport_height": 900
  }'
Map
Discover and map all URLs on a website. Returns a structured sitemap with link hierarchy, giving you a complete picture of the site's architecture before crawling.
POST /api/v1/map
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| max_pages | integer | No |
| include_subdomains | boolean | No |
| respect_robots | boolean | No |
Example
const response = await fetch('https://api.crawlerapi.ai/api/v1/map', {
  method: 'POST',
  headers: {
    'Authorization': 'Api-Key YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "url": "https://example.com",
    "max_pages": 200,
    "include_subdomains": true
  })
});

const data = await response.json();
console.log(`Found ${data.urls.length} URLs`);
Response Formats
All API responses follow a consistent JSON structure. Successful responses include the requested data along with metadata about the request.
Success Response
{
  "status": "success",
  "credits_used": 1,
  "data": {
    "url": "https://example.com",
    "content": { ... },
    "metadata": {
      "status_code": 200,
      "response_time_ms": 342,
      "proxy_country": "US",
      "rendered_js": false
    }
  },
  "request_id": "req_abc123def456"
}
Error Response
{
  "status": "error",
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "You do not have enough credits to perform this request."
  },
  "request_id": "req_abc123def456"
}
Common Error Codes
| HTTP Status | Code |
|---|---|
| 401 | UNAUTHORIZED |
| 402 | INSUFFICIENT_CREDITS |
| 422 | VALIDATION_ERROR |
| 429 | RATE_LIMITED |
| 500 | INTERNAL_ERROR |
| 502 | TARGET_UNREACHABLE |
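Because every failure shares the error envelope shown above, a small helper (a sketch, not an official client) can turn it into a Python exception:

```python
def check_response(status_code, payload):
    """Raise on the documented error envelope, otherwise return the data.

    Expects the parsed JSON body: "status", "error.code", and
    "error.message" on failure, "data" on success (per the response
    formats documented above).
    """
    if payload.get("status") == "error":
        err = payload["error"]
        raise RuntimeError(f"HTTP {status_code} {err['code']}: {err['message']}")
    return payload["data"]
```

Usage: `data = check_response(response.status_code, response.json())`.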
Rate Limits
Rate limits vary by plan and are enforced per API key. Rate limit headers are included in every response.
| Plan | Requests / Minute | Concurrent Requests |
|---|---|---|
| Free | 20 | 5 |
| Starter | 60 | 25 |
| Professional | 200 | 100 |
| Enterprise | 500 | Unlimited |
Rate Limit Headers
X-RateLimit-Limit: 200
X-RateLimit-Remaining: 198
X-RateLimit-Reset: 1709164800
X-Credits-Remaining: 4750
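When a request comes back `429 RATE_LIMITED`, `X-RateLimit-Reset` (a Unix timestamp) tells you when capacity returns. A minimal helper to compute the wait before retrying:

```python
import time

def seconds_until_reset(headers, now=None):
    """Seconds to sleep before retrying after a 429.

    Reads X-RateLimit-Reset (a Unix timestamp, per the headers above)
    and clamps at zero in case the reset time has already passed.
    """
    if now is None:
        now = time.time()
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(reset_at - now, 0)
```

Sleep for that long, then retry the request — for example in a loop around `requests.post`, checking `response.status_code == 429` each iteration.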
Code Examples
Get started quickly with these code examples in your preferred language.
Python
import requests
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/scrape",
    headers={"Authorization": "Api-Key YOUR_API_KEY"},
    json={"url": "https://example.com"}
)
print(response.json())
Node.js
const response = await fetch('https://api.crawlerapi.ai/api/v1/scrape', {
  method: 'POST',
  headers: {
    'Authorization': 'Api-Key YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({"url": "https://example.com"})
});

const data = await response.json();
console.log(data);
Ruby
require 'net/http'
require 'json'

uri = URI("https://api.crawlerapi.ai/api/v1/scrape")
req = Net::HTTP::Post.new(uri, {
  "Authorization" => "Api-Key YOUR_API_KEY",
  "Content-Type" => "application/json"
})
req.body = {"url": "https://example.com"}.to_json

res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(req) }
puts JSON.parse(res.body)
Go
package main

import "fmt"
import "net/http"
import "strings"

func main() {
	body := strings.NewReader(`{"url":"https://example.com"}`)
	req, _ := http.NewRequest("POST", "https://api.crawlerapi.ai/api/v1/scrape", body)
	req.Header.Set("Authorization", "Api-Key YOUR_API_KEY")
	req.Header.Set("Content-Type", "application/json")
	resp, _ := http.DefaultClient.Do(req)
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
Need help?
Our team is here to help you get the most out of CrawlerAPI.