scraping-websites

There are 1,991 repositories under the scraping-websites topic.

MontFerret/ferret
Declarative web scraping
Language: Go 5.9k 92 300311
Anorov/cloudflare-scrape
A Python module to bypass Cloudflare's anti-bot page.
Language: Python 3.5k 124 397452
elixir-crawly/crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Language: Elixir 1.1k 17 109120
gildas-lormeau/single-file-cli
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
Language: JavaScript 1k 10 142100
AmmeySaini/Edu-Mail-Generator
Generate Free Edu Mail(s) within minutes
Language: Python 854 40 154406
josephlimtech/linkedin-profile-scraper-api
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.
Language: TypeScript 713 14 38174
Python-World/Python_and_the_Web
Build Bots, Scrape a website or use an API to solve a problem.
Language: Python 694 22 181262
slotix/dataflowkit
Extract structured data from web sites. Web sites scraping.
Language: Go 692 20 1381
csbun/thal
Translation: Puppeteer and Chrome Headless, from getting started to web crawling
Language: JavaScript 656 17 849
KTZgraph/sarenka
OSINT tool - gets data from services like shodan, censys etc. in one app
Language: Python 653 23 1486
spekulatius/PHPScraper
A universal web-util for PHP.
Language: PHP 573 16 7175
avidLearnerInProgress/python-automation-scripts
Simple yet powerful automation stuffs.
Language: Python 557 23 2165
oxylabs/quick-start-guide
Python quick start guides to get the most out of Oxylabs' Web Scraper API free trial.
516 1 13
baptisteArno/tinking
🧶 Extract data from any website without code, just clicks.
Language: TypeScript 425 9 2428
unixfox/pupflare
A webpage proxy that sends requests through Chromium (puppeteer). Can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (like curl).
Language: JavaScript 421 14 3282
lkuffo/web-scraping
More than 50 web scraping examples using: Requests | Scrapy | Selenium | LXML | BeautifulSoup
Language: Python 378 20 0216
crwlrsoft/crawler
Library for Rapid (Web) Crawler and Scraper Development
Language: PHP 366 4 2213
kennethreitz/requests-html
Pythonic HTML Parsing for Humans™
Language: Python 321 4 042
Go-phie/gophie
An Aggregator Engine for searching and downloading movies free - NO ADs!
Language: Go 320 8 3430
driscoll42/ebayMarketAnalyzer
Scrape all eBay sold listings to determine average/median pricing, plot listings over time with trend lines, and export to Excel
Language: Python 236 10 5930
m92vyas/llm-reader
Turn webpages into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, and image & webpage link extraction easy.
Language: Python 228 2 116
e43b/Kemono-and-Coomer-Downloader
The Kemono and Coomer Downloader simplifies downloading posts from Kemono and Coomer websites, allowing users to download individual or multiple posts, including entire profiles. It offers advanced features like downloading attachments, videos, and automatically organizing files.
Language: Python 223 3 2737
RuthGnz/SpyScrap
CLI and GUI for OSINT. Are you overexposed on the Internet? Check it! Twitter, Tinder, Facebook, Google, Yandex, BOE. It uses facial recognition to provide more accurate results.
Language: Python 201 9 029
DiegoCaraballo/Email-extractor
The main functionality is to extract all the emails from one or several URLs.
Language: Python 193 14 1977
RyuzakiH/CloudflareSolverRe
Cloudflare Javascript & reCaptcha challenge (I'm Under Attack Mode or IUAM) solving / bypass .NET Standard library.
Language: C# 193 15 3649
hridaydutta123/the-youtube-scraper
Download YouTube video description and video comments without using the YouTube API.
Language: Python 173 5 325
Bishalsarang/Leetcode-Questions-Scraper
Scrape algorithm questions from LeetCode and generate HTML and EPUB files
Language: Python 161 2 647
johnbumgarner/newspaper3_usage_overview
This repository provides usage examples for the Python module Newspaper3k.
Language: Python 148 4 116
yousefkotp/Movies-and-Series-Scraper
A console application that scrapes valid streaming links for any movie or series by exact season and episode number. You can also download a whole season with one click.
Language: Python 138 3 330
html2rss/html2rss
📰 Build RSS 2.0 feeds from websites (and JSON APIs) automatically or with a few CSS selectors.
Language: Ruby 134 3 5410
alash3al/scraply
Scraply: a simple DOM scraper to fetch information from any HTML-based website
Language: Go 130 7 513
autogram-is/spidergram
Structural analysis tools for complex web sites
Language: TypeScript 130 7 317
pavlovtech/WebReaper
Web scraper, crawler and parser in C#. Designed as a simple, declarative and scalable web scraping solution.
Language: C# 130 5 1132
fedecalendino/nintendeals
Library with a set of tools for scraping information about Nintendo games and their prices across all regions (NA, EU and JP).
Language: Python 129 5 1418
voliveirajr/seleniumcrawler
An example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site
Language: Python 128 9 045
fernandod1/Instagram-to-discord
Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Language: Python 127 9 3059