# Installing and using Firecrawl with Docker

# Installation

Clone the repository locally: https://github.com/mendableai/firecrawl
In the /firecrawl/apps/api directory, copy the example configuration: cp .env.example .env
Return to the repository root and run docker-compose up -d
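
The three steps above can be collected into one small script. This is only a sketch: the function name `install_firecrawl` and the default target directory are assumptions made for illustration, and nothing runs until the function is called.

```shell
# Sketch of the installation steps as a single function.
# $1: directory to clone into (hypothetical default: ./firecrawl)
install_firecrawl() {
  target_dir="${1:-firecrawl}"

  # Step 1: clone the repository.
  git clone https://github.com/mendableai/firecrawl "$target_dir" || return 1

  # Step 2: copy the example environment file for the API service.
  cp "$target_dir/apps/api/.env.example" "$target_dir/apps/api/.env" || return 1

  # Step 3: start the services in the background from the repository root.
  (cd "$target_dir" && docker-compose up -d)
}
```

Calling `install_firecrawl` with no argument clones into `./firecrawl`. Review the copied `.env` before starting the containers if the defaults do not fit your setup.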

# API usage

API documentation: https://docs.firecrawl.dev/introduction

Scrape: Turn any url into clean data. This is the endpoint used most often.
Batch Scrape: Batch scrape multiple URLs. Several URLs can be passed in one request.
Crawl: Used to crawl a URL and all accessible subpages. This submits a crawl job and returns a job ID; the job is then queried with a GET request to check the status of the crawl.
Map: Used to map a URL and get urls of the website. This returns most links present on the website, which gives an outline of the site's structure.

# Scrape

curl --request POST \
  --url http://127.0.0.1:3002/v1/scrape \
  --header 'content-type: application/json' \
  --data '{
  "url": "https://juejin.cn/post/7413964058788216869",
  "headers": {
    "Authorization": "Bearer your_access_token_here",
    "X-Custom-Auth": "custom_auth_value"
  }
}'

If no request headers are needed, Authorization does not have to be set. Supported output formats (available options): markdown, html, rawHtml, links, screenshot, screenshot@fullPage, json, changeTracking
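
The request above does not ask for a specific output format. A minimal sketch of a request that does, assuming the formats are passed as a `formats` array in the request body: the helper names `scrape_body` and `scrape` and the `FIRECRAWL_URL` variable are made up for this example.

```shell
# Build the JSON body for POST /v1/scrape.
# $1: page URL; remaining arguments: entries for the "formats" array.
# Values are inserted verbatim, so they must not contain double quotes or backslashes.
scrape_body() {
  url="$1"
  shift
  formats=""
  for f in "$@"; do
    # Append each format as a quoted JSON string, comma-separated.
    formats="${formats:+$formats, }\"$f\""
  done
  printf '{"url": "%s", "formats": [%s]}\n' "$url" "$formats"
}

# Hypothetical helper: send the body to a local instance.
# FIRECRAWL_URL is an assumed variable; the default matches the examples in this post.
scrape() {
  curl --request POST \
    --url "${FIRECRAWL_URL:-http://127.0.0.1:3002}/v1/scrape" \
    --header 'content-type: application/json' \
    --data "$(scrape_body "$@")"
}
```

For example, `scrape https://juejin.cn/post/7413964058788216869 markdown links` would request both the markdown and the list of links for that page.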

# Batch Scrape

curl --request POST \
  --url http://127.0.0.1:3002/v1/batch/scrape \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "urls": [
    "<string>"
  ]
}'

# Crawl

curl --request POST \
  --url http://127.0.0.1:3002/v1/crawl \
  --header 'content-type: application/json' \
  --data '{
  "url": "https://juejin.cn",
  "limit": 10,
  "scrapeOptions": {
    "formats": ["markdown"]
  }
}'
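
The crawl request only returns a job ID, so a second request is needed to fetch the result. A rough sketch of that follow-up, assuming the response JSON carries the job ID in an `id` field and that the job is queried at `/v1/crawl/<id>`: the helper names and the `FIRECRAWL_URL` variable are invented for illustration.

```shell
# Extract the "id" field from the JSON returned by POST /v1/crawl (read from stdin).
# A rough sed-based sketch; a JSON tool such as jq is more robust.
crawl_job_id() {
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Query the job with a GET request.
# $1: job ID; FIRECRAWL_URL is an assumed variable for the base URL.
crawl_status() {
  curl --request GET \
    --url "${FIRECRAWL_URL:-http://127.0.0.1:3002}/v1/crawl/$1"
}
```

Pipe the output of the crawl request through `crawl_job_id`, then pass the resulting ID to `crawl_status` and repeat the call until the job reports that it has finished.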

# Map

curl --request POST \
  --url http://127.0.0.1:3002/v1/map \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "url": "<string>",
  "search": "<string>",
  "ignoreSitemap": true,
  "sitemapOnly": false,
  "includeSubdomains": false,
  "limit": 5000,
  "timeout": 123
}'

# Using with MCP

Install the Firecrawl MCP server: https://github.com/mendableai/firecrawl-mcp-server/tree/main
Manual installation: npm install -g firecrawl-mcp
Configure the MCP server in Cline:

{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

Note: when Firecrawl is deployed locally, FIRECRAWL_API_KEY does not need to be configured; FIRECRAWL_API_URL is required instead.
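
For a local deployment the config might look like the output of the sketch below. The function name `mcp_config` is made up, and the default URL is an assumption taken from the port used in the curl examples in this post; adjust it to wherever your instance listens.

```shell
# Print a Cline MCP config for a self-hosted Firecrawl instance.
# $1: base URL of the local Firecrawl API (assumed default: http://127.0.0.1:3002)
mcp_config() {
  api_url="${1:-http://127.0.0.1:3002}"
  # The env block points at the local API and carries no API key.
  cat <<EOF
{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_URL": "$api_url"
      }
    }
  }
}
EOF
}
```

Run `mcp_config` and paste the printed JSON into the Cline MCP settings.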

Last updated: 2025-05-22 06:40:28
