scraper
There are 8407 repositories under the scraper topic.
huginn/huginn
Create agents that monitor and act on your behalf. Your agents are standing by!
cheeriojs/cheerio
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
iawia002/lux
👾 Fast and simple video download library and CLI tool written in Go
gocolly/colly
Elegant Scraper and Crawler Framework for Golang
NaiboWang/EasySpider
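A visual no-code/code-free web crawler/spider. EasySpider: a visual tool for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.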
codelucas/newspaper
newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
pwxcoo/chinese-xinhua
📙 Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.
guyueyingmu/avbook
AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Evil0ctal/Douyin_TikTok_Download_API
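🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls and online batch parsing and downloading.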
BruceDone/awesome-crawler
A collection of awesome web crawlers and spiders in different languages
alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
MontFerret/ferret
Declarative web scraping
yujiosaka/headless-chrome-crawler
Distributed crawler powered by Headless Chrome
go-rod/rod
A DevTools driver for web automation and scraping
madawei2699/myGPTReader
A community-driven way to read and chat with AI bots - powered by ChatGPT.
fent/node-ytdl-core
YouTube video downloader in JavaScript.
JustAnotherArchivist/snscrape
A social networking service scraper in Python
IonicaBizau/scrape-it
🔮 A Node.js scraper for humans.
niespodd/browser-fingerprinting
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
UltimaHoarder/UltimaScraper
Scrape all the media from an OnlyFans account - Updated regularly
JavScraper/Emby.Plugins.JavScraper
A Japanese movie scraper plugin for Emby/Jellyfin that can fetch movie information from certain websites.
aapatre/Automatic-Udemy-Course-Enroller-GET-PAID-UDEMY-COURSES-for-FREE
Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons & enroll you for PAID UDEMY COURSES, ABSOLUTELY FREE!
jae-jae/QueryList
🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
geziyor/geziyor
Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.
meetDeveloper/freeDictionaryAPI
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
lucasjinreal/weibo_terminater
The final Weibo crawler: scrape anything from Weibo, including comments, post contents, and followers. The Terminator
facundoolano/google-play-scraper
Node.js scraper to get data from Google Play
aliparlakci/bulk-downloader-for-reddit
Downloads and archives content from Reddit
PaulMcInnis/JobFunnel
Scrape job websites into a single spreadsheet with no duplicates.
joeyism/linkedin_scraper
A library that scrapes LinkedIn for user data
website-scraper/node-website-scraper
Download website to local directory (including all css, images, js, etc.)
extractus/article-extractor
Extract the main article from a given URL with Node.js
AhmadIbrahiim/Website-downloader
💡 Download the complete source code of any website, including all assets (JavaScript, stylesheets, images), using Node.js
paulpierre/informer
A Telegram Mass Surveillance Bot in Python
edoardottt/cariddi
Take a list of domains, crawl URLs, and scan for endpoints, secrets, API keys, file extensions, tokens, and more