Crawl4AI
extraction | Assess
Crawl4AI is an open-source web crawler and scraper, released under the Apache 2.0 license and optimized for LLM pipelines, AI agents, and data extraction workflows. It extracts content as Markdown or structured JSON, with chunking options and rich metadata (screenshots, network logs, citations).
Core Features
- Playwright-powered crawling with session handling and page interaction
- Markdown or JSON output with CSS/XPath or LLM-based extraction
- Adaptive link scoring and crawl-depth control
- Python API and CLI tools for integration and automation
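
To make the idea of link scoring combined with crawl-depth control more concrete, here is a minimal, library-free sketch. It is not Crawl4AI's implementation or API: the in-memory `LINKS` graph, the `keyword_score` heuristic, and the `best_first_crawl` function are all hypothetical names introduced for illustration. The sketch visits the highest-scoring link first and stops expanding once a depth limit is reached.

```python
import heapq
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory link graph standing in for real pages.
LINKS: Dict[str, List[str]] = {
    "/": ["/docs", "/blog", "/login"],
    "/docs": ["/docs/api", "/docs/install"],
    "/blog": ["/blog/post-1"],
    "/docs/api": ["/docs/api/deep"],
}


def keyword_score(url: str, keywords: List[str]) -> float:
    """Toy relevance score: fraction of keywords that appear in the URL."""
    if not keywords:
        return 0.0
    return sum(k in url for k in keywords) / len(keywords)


def best_first_crawl(
    start: str,
    links: Dict[str, List[str]],
    score: Callable[[str], float],
    max_depth: int,
    max_pages: int,
) -> List[Tuple[str, int]]:
    """Visit pages in descending score order, never going past max_depth."""
    visited = set()
    order: List[Tuple[str, int]] = []
    counter = 0  # tie-breaker so equal scores keep discovery order
    # Heap entries are (-score, counter, url, depth); heapq is a min-heap.
    frontier = [(0.0, counter, start, 0)]
    while frontier and len(order) < max_pages:
        _, _, url, depth = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        order.append((url, depth))
        if depth >= max_depth:
            continue  # depth limit reached: record the page, do not expand it
        for child in links.get(url, []):
            if child not in visited:
                counter += 1
                heapq.heappush(frontier, (-score(child), counter, child, depth + 1))
    return order


if __name__ == "__main__":
    pages = best_first_crawl(
        "/", LINKS, lambda u: keyword_score(u, ["docs"]), max_depth=2, max_pages=10
    )
    for url, depth in pages:
        print(depth, url)
```

With the keyword `docs`, the documentation pages are visited before the blog and login pages, and `/docs/api/deep` is never reached because it sits at depth 3. A real crawler would replace the static graph with fetched pages and would likely use a richer scoring signal than URL keywords.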