web-scraper
There are 1170 repositories under the web-scraper topic.
firecrawl/firecrawl
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
getmaxun/maxun
⚡ Easiest no-code web data extraction platform • Instantly turn any website into an API or spreadsheet ⚡
D4Vinci/Scrapling
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
BruceDone/awesome-crawler
A collection of awesome web crawlers and spiders in different languages
jaypyles/Scraperr
Self-hosted webscraper.
arpit-omprakash/100ProjectsOfCode
A list of practical knowledge-building projects.
php-curl-class/php-curl-class
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
gosom/google-maps-scraper
Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email, and more for each place
anaskhan96/soup
Web Scraper in Go, similar to BeautifulSoup
dipu-bd/lightnovel-crawler
Generate and download e-books from online sources.
itsOwen/CyberScraper-2077
A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama
oxylabs/google-ai-mode-scraper
Scrape Google AI Mode responses without blocks on a large scale.
oxylabs/how-to-scrape-amazon-product-data
The process of extracting product data from Amazon using Python, including titles, ratings, prices, images, and descriptions.
juancarlospaco/faster-than-requests
Faster requests on Python 3
tholian-network/stealth
🚀 Stealth - Secure, Peer-to-Peer, Private and Automatable Web Browser/Scraper/Proxy
gildas-lormeau/single-file-cli
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
Oshan96/monkey-dl
Bulk download your favourite anime episodes from your favourite anime websites
je-suis-tm/web-scraping
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
postmodern/spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
k0rnh0li0/onlyfans-dl
OnlyFans content downloader
cassidoo/scrapers
A list of scrapers from around the web.
oxylabs/how-to-scrape-google-scholar
A guide for extracting titles, authors, and citations from Google Scholar using Python and Oxylabs SERP Scraper API.
spekulatius/PHPScraper
A universal web-util for PHP.
oxylabs/how-to-scrape-amazon-prices
Code for extracting best-selling items, search results, and currently available deals from Amazon using Python and Oxylabs E-Commerce Scraper API.
jaebradley/basketball_reference_web_scraper
NBA Stats API via Basketball Reference
oxylabs/quick-start-guide
Python quick start guides to get the most out of Oxylabs' Web Scraper API free trial.
0x676e67/wreq
An ergonomic Rust HTTP Client with TLS fingerprint
austinoboyle/scrape-linkedin-selenium
`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
AlexMathew/scrapple
A framework for creating semi-automatic web content extractors
shaikhsajid1111/social-media-profile-scrapers
Fetch user's data across social media
paulpierre/markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
crwlrsoft/crawler
Library for Rapid (Web) Crawler and Scraper Development
passivebot/facebook-marketplace-scraper
This repository contains a script to scrape Facebook Marketplace data using Playwright, BeautifulSoup and Streamlit.
lewisdonovan/google-news-scraper
Lightweight scraper for Google News
oxylabs/web-unblocker
Free trial Web Unblocker - an AI-powered proxy solution that can bypass even the most sophisticated anti-bot systems.