API Documentation
Everything you need to start scraping the web with CrawlerAPI. Our REST API gives you programmatic access to hosted scraping infrastructure: single-page scraping, full-site crawling, AI-powered extraction, screenshots, and URL mapping.
Quick Start
1. Create an account and get your free 1,000 credits.
2. Generate an API key from your dashboard.
3. Make your first API request using the example below.
Base URL
https://api.crawlerapi.ai/api/v1/
Your First Request
# Scrape a webpage and get the HTML content
curl -X POST https://api.crawlerapi.ai/api/v1/scrape \
  -H "Authorization: Api-Key YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["json", "html"]
  }'
Authentication
All API requests require authentication using an API key. Include your key in the Authorization header of every request.
Authorization: Api-Key YOUR_API_KEY
Keep your API key secure
Never expose your API key in client-side code, public repositories, or browser requests. Use server-side calls or environment variables.
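One way to follow this advice is to load the key from an environment variable at runtime. A minimal sketch (the `CRAWLERAPI_KEY` variable name is illustrative, not something the API mandates):

```python
import os

# Illustrative variable name; any name works, as long as the key
# itself stays out of source control.
api_key = os.environ.get("CRAWLERAPI_KEY", "YOUR_API_KEY")

headers = {
    "Authorization": f"Api-Key {api_key}",
    "Content-Type": "application/json",
}
```

Set the variable once in your shell or deployment config (`export CRAWLERAPI_KEY=...`) and the code never needs to change between environments.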
Example with Python
import requests
headers = {
    "Authorization": "Api-Key YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(
    "https://api.crawlerapi.ai/api/v1/scrape",
    headers=headers,
    json={"url": "https://example.com"}
)

data = response.json()
print(data)
Scrape
Extract data from a single URL. Returns the page content in your chosen format with optional element targeting via CSS selectors.
POST /api/v1/scrape
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| formats | array | No |
| selectors | object | No |
| render_js | boolean | No |
| wait_for | string | No |
| proxy_country | string | No |
Example
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/scrape",
    headers=headers,
    json={
        "url": "https://example.com/product/123",
        "formats": ["json"],
        "selectors": {
            "title": "h1.product-title",
            "price": ".price-current",
            "rating": ".star-rating",
            "description": "#product-description"
        },
        "render_js": True,
        "proxy_country": "US"
    }
)
Crawl
Crawl an entire website or a set of pages. The crawler follows links according to your rules and extracts data from each page visited.
POST /api/v1/crawl
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| max_pages | integer | No |
| max_depth | integer | No |
| include_patterns | array | No |
| exclude_patterns | array | No |
| selectors | object | No |
Example
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/crawl",
    headers=headers,
    json={
        "url": "https://example.com/blog/",
        "max_pages": 50,
        "max_depth": 2,
        "include_patterns": ["/blog/.*"],
        "selectors": {
            "title": "h1",
            "content": "article",
            "date": "time[datetime]"
        }
    }
)

# The crawl runs asynchronously. Poll for results:
crawl_id = response.json()["crawl_id"]
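With the `crawl_id` in hand, you can poll until the job finishes. The status route isn't named above, so this sketch takes any `fetch_status` callable; wiring it to a `GET /api/v1/crawl/{crawl_id}` endpoint is an assumption — confirm the actual polling route in your dashboard:

```python
import time

def wait_for_crawl(crawl_id, fetch_status, poll_interval=5, max_polls=120):
    """Poll an async crawl until it succeeds or fails.

    fetch_status is any callable mapping a crawl_id to the parsed JSON
    status payload. The terminal "status" values mirror the response
    envelope documented below ("success" / "error").
    """
    for _ in range(max_polls):
        result = fetch_status(crawl_id)
        if result.get("status") in ("success", "error"):
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"crawl {crawl_id} did not finish in time")
```

For example, `wait_for_crawl(crawl_id, lambda cid: requests.get(f"https://api.crawlerapi.ai/api/v1/crawl/{cid}", headers=headers).json())` — again assuming that status URL.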
Extract (AI)
Use AI to extract structured data from any page without writing selectors. Describe the data you need in natural language and optionally define an output schema.
Note: AI extraction is in beta.
POST /api/v1/extract
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| prompt | string | Yes |
| schema | object | No |
| render_js | boolean | No |
Example
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/extract",
    headers=headers,
    json={
        "url": "https://news.example.com",
        "prompt": "Extract all articles with their headlines, authors, dates, and summaries",
        "schema": {
            "articles": [{
                "headline": "string",
                "author": "string",
                "date": "date",
                "summary": "string"
            }]
        }
    }
)
Screenshot
Capture full-page or element-level screenshots of any URL. Supports PNG, JPEG, and PDF output with configurable viewport dimensions.
POST /api/v1/screenshot
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| format | string | No |
| full_page | boolean | No |
| viewport_width | integer | No |
| viewport_height | integer | No |
| selector | string | No |
Example
curl -X POST https://api.crawlerapi.ai/api/v1/screenshot \
  -H "Authorization: Api-Key YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "format": "png",
    "full_page": true,
    "viewport_width": 1440,
    "viewport_height": 900
  }'
Map
Discover and map all URLs on a website. Returns a structured sitemap with link hierarchy, giving you a complete picture of the site's architecture before crawling.
POST /api/v1/map
Parameters
| Parameter | Type | Required |
|---|---|---|
| url | string | Yes |
| max_pages | integer | No |
| include_subdomains | boolean | No |
| respect_robots | boolean | No |
Example
const response = await fetch('https://api.crawlerapi.ai/api/v1/map', {
  method: 'POST',
  headers: {
    'Authorization': 'Api-Key YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "url": "https://example.com",
    "max_pages": 200,
    "include_subdomains": true
  })
});

const data = await response.json();
console.log(`Found ${data.urls.length} URLs`);
Response Formats
All API responses follow a consistent JSON structure. Successful responses include the requested data along with metadata about the request.
Success Response
{
  "status": "success",
  "credits_used": 1,
  "data": {
    "url": "https://example.com",
    "content": { ... },
    "metadata": {
      "status_code": 200,
      "response_time_ms": 342,
      "proxy_country": "US",
      "rendered_js": false
    }
  },
  "request_id": "req_abc123def456"
}
Error Response
{
  "status": "error",
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "You do not have enough credits to perform this request."
  },
  "request_id": "req_abc123def456"
}
Common Error Codes
| HTTP Status | Code |
|---|---|
| 401 | UNAUTHORIZED |
| 402 | INSUFFICIENT_CREDITS |
| 422 | VALIDATION_ERROR |
| 429 | RATE_LIMITED |
| 500 | INTERNAL_ERROR |
| 502 | TARGET_UNREACHABLE |
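Because every failure shares the error envelope shown above, a small helper (a sketch, not an official client) can turn it into a Python exception:

```python
def check_response(status_code, payload):
    """Raise on the documented error envelope, otherwise return the data.

    Expects the parsed JSON body: "status", "error.code", and
    "error.message" on failure, "data" on success (per the response
    formats documented above).
    """
    if payload.get("status") == "error":
        err = payload["error"]
        raise RuntimeError(f"HTTP {status_code} {err['code']}: {err['message']}")
    return payload["data"]
```

Usage: `data = check_response(response.status_code, response.json())`.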
Rate Limits
Rate limits vary by plan and are enforced per API key. Rate limit headers are included in every response.
| Plan | Requests / Minute | Concurrent Requests |
|---|---|---|
| Free | 20 | 5 |
| Starter | 60 | 25 |
| Professional | 200 | 100 |
| Enterprise | 500 | Unlimited |
Rate Limit Headers
X-RateLimit-Limit: 200
X-RateLimit-Remaining: 198
X-RateLimit-Reset: 1709164800
X-Credits-Remaining: 4750
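When a request comes back `429 RATE_LIMITED`, `X-RateLimit-Reset` (a Unix timestamp) tells you when capacity returns. A minimal helper to compute the wait before retrying:

```python
import time

def seconds_until_reset(headers, now=None):
    """Seconds to sleep before retrying after a 429.

    Reads X-RateLimit-Reset (a Unix timestamp, per the headers above)
    and clamps at zero in case the reset time has already passed.
    """
    if now is None:
        now = time.time()
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(reset_at - now, 0)
```

Sleep for that long, then retry the request — for example in a loop around `requests.post`, checking `response.status_code == 429` each iteration.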
Code Examples
Get started quickly with these code examples in your preferred language.
Python
import requests
response = requests.post(
    "https://api.crawlerapi.ai/api/v1/scrape",
    headers={"Authorization": "Api-Key YOUR_API_KEY"},
    json={"url": "https://example.com"}
)
print(response.json())
Node.js
const response = await fetch('https://api.crawlerapi.ai/api/v1/scrape', {
  method: 'POST',
  headers: {
    'Authorization': 'Api-Key YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({"url": "https://example.com"})
});

const data = await response.json();
console.log(data);
Ruby
require 'net/http'
require 'json'

uri = URI("https://api.crawlerapi.ai/api/v1/scrape")
req = Net::HTTP::Post.new(uri, {
  "Authorization" => "Api-Key YOUR_API_KEY",
  "Content-Type" => "application/json"
})
req.body = {"url": "https://example.com"}.to_json

res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(req) }
puts JSON.parse(res.body)
Go
package main

import "fmt"
import "net/http"
import "strings"

func main() {
	body := strings.NewReader(`{"url":"https://example.com"}`)
	req, _ := http.NewRequest("POST", "https://api.crawlerapi.ai/api/v1/scrape", body)
	req.Header.Set("Authorization", "Api-Key YOUR_API_KEY")
	req.Header.Set("Content-Type", "application/json")
	resp, _ := http.DefaultClient.Do(req)
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
Need help?
Our team is here to help you get the most out of CrawlerAPI.