Web Scraping

Self-Hosted Scrapers

Web scrapers with AI-specific output formats for feeding data into LLMs.

Firecrawl

Official self-hosted

Jina Reader

Hacked self-hosted

Be careful: do NOT use the official jina.ai version - it is a cloud service.

ScrapeGraphAI

Official self-hosted

AnythingLLM

Write your own web scraping code, e.g.
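
A minimal sketch of a hand-written scraper, using only the Python standard library. It fetches a page and reduces it to plain visible text, which is roughly the kind of output an LLM can consume. The names TextExtractor, html_to_text and scrape, and the User-Agent string, are assumptions made up for this illustration; they do not come from any of the tools listed above.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script/style tags."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def scrape(url, timeout=10):
    """Fetch a URL and return its visible text (hypothetical helper)."""
    # The User-Agent value is a placeholder for this sketch.
    req = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    return html_to_text(html)
```

Calling scrape("https://example.com") should return the page text as a single string. This sketch does not execute JavaScript, follow links, or respect robots.txt, so pages that render content client-side would likely need a headless browser or one of the tools above instead.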