scrapy

There are 3,883 repositories under the scrapy topic.

crawlab-team/crawlab
Distributed web crawler admin platform for managing spiders regardless of language or framework.
Language: Go 11.9k 213 9921.9k
lining0806/PythonSpiderNotes
The essentials of getting started with web crawlers in Python
Language: Python 7.3k 385 112.2k
chyroc/WechatSogou
WeChat official account crawler interface based on Sogou WeChat search
Language: Python 6.1k 277 1911.7k
rmax/scrapy-redis
Redis-based components for Scrapy.
Language: Python 5.6k 271 1951.6k
SpiderClub/haipproxy
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Language: Python 5.5k 204 97909
DropsDevopsOrg/ECommerceCrawlers
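Hands-on 🐍 crawlers for a variety of websites and e-commerce data 🕷. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu, Ali tasks, Cnblogs, Weibo, Baidu Tieba, Douban Movies, ibaotu, Quanjing, Douban Music, a provincial drug administration, Sohu News, machine-learning text collection, fofa asset collection, Autohome, National Bureau of Statistics, Baidu keyword index counts, spider pan-directory, Toutiao, Douban movie reviews, Ctrip, Xiaomi App Store, Anjuke, Tujia homestays ❤️❤️❤️. WeChat crawler showcase project:
Language: Python 5.2k 144 341.4k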
nghuyong/WeiboSpider
Actively maintained Sina Weibo collection tool 🚀🚀🚀
Language: Python 3.9k 68 318841
Gerapy/Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
Language: Python 3.5k 124 215645
Boris-code/feapder
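🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. It has four built-in spiders (AirSpider, Spider, TaskSpider, and BatchSpider) for different scenarios, and it supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The feaplat crawler management system provides convenient deployment and scheduling for it.
Language: Python 3.4k 35 186516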
my8100/scrapydweb
Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI.
Language: Python 3.3k 72 196582
wkunzhi/Python3-Spider
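Hands-on Python crawlers: simulated login to major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️
Language: Python 3.3k 95 241k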
scrapy-plugins/scrapy-splash
Scrapy+Splash for JavaScript integration
Language: Python 3.2k 124 256456
LuckyZXL2016/Movie_Recommend
Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation system
Language: Java 2.9k 106 181k
DormyMo/SpiderKeeper
Admin UI for Scrapy / open-source Scrapinghub
Language: Python 2.8k 106 90501
QianyanTech/Image-Downloader
Download images from Google, Bing, and Baidu.
Language: Python 2.3k 45 56576
librauee/Reptile
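🏀 Hands-on Python 3 web crawlers (some with detailed tutorials): Maoyan, Tencent Video, Douban, the Yanzhao graduate admissions site, Weibo, Biquge novels, Baidu Hot Topics, Bilibili, CSDN, NetEase Cloud Reading, Ali Literature, Baidu Stocks, Toutiao, WeChat official accounts, NetEase Cloud Music, Lagou, Youdao, unsplash, Shixiseng, Autohome, League of Legends Box, Dianping, Lianjia, LPL schedule, typhoons, Fantasy Westward Journey and Onmyoji Treasure Pavilion, weather, Nowcoder, Baidu Wenku, bedtime stories, Zhihu, Wish
Language: Python 1.7k 53 4516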
TheWebScrapingClub/webscraping-from-0-to-hero
The web scraping open project repository aims to share knowledge and experiences about web scraping with Python
1.7k 31 099
kkoooqq/fakebrowser
🤖 Fake fingerprints to bypass anti-bot systems. Simulates mouse and keyboard operations to behave like a real person.
Language: JavaScript 1.3k 41 0218
eliasdabbas/advertools
advertools - online marketing productivity and analysis tools
Language: Python 1.3k 39 45234
scrapy-plugins/scrapy-playwright
🎭 Playwright integration for Scrapy
Language: Python 1.3k 20 255143
istresearch/scrapy-cluster
This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster.
Language: Python 1.2k 107 161325
ityard/python-fxxk-spider
A collection of various free Python crawler projects
1.2k 16 2196
holgerd77/django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
Language: Python 1.2k 75 98307
juancarlospaco/faster-than-requests
Faster requests on Python 3
Language: Nim 1.1k 18 14792
bytebuff/JSpider
JSpider is updated every week with the JS decryption method for at least one website. Stars are welcome. WeChat for discussion: 13298307816
Language: JavaScript 1.1k 58 15240
xingag/spider_python
Python crawlers
Language: Python 1.1k 33 8457
moyada/stealer
Video watermark removal program for Douyin, Kuaishou, Huoshan, and Pipixia
Language: Python 1.1k 14 86294
jonbakerfish/TweetScraper
TweetScraper is a simple crawler/spider for Twitter Search without using API
Language: Python 1k 36 108314
vifreefly/kimuraframework
Kimurai is a modern web scraping framework written in Ruby that works out of the box with headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites
Language: Ruby 1k 29 60158
eracle/linkedin
LinkedIn scraper using Selenium WebDriver, headless Chromium, Docker and Scrapy
Language: Python 956 36 27141
clemfromspace/scrapy-selenium
Scrapy middleware to handle JavaScript pages using Selenium
Language: Python 951 20 92361
alanchn31/Data-Engineering-Projects
Personal Data Engineering Projects
Language: Jupyter Notebook 946 8 0207
mtianyan/FunpySpiderSearchEngine
Word2vec per-user personalized search + Scrapy 2.3.0 (data crawling) + ElasticSearch 7.9.1 (data storage and external RESTful API) + Django 3.1.1 search
Language: Python 938 44 18315
hellock/icrawler
A multi-threaded crawler framework with many built-in image crawlers provided.
Language: Python 877 23 96178
scrapinghub/scrapyrt
HTTP API for Scrapy spiders
Language: Python 869 44 95160
MorvanZhou/easy-scraping-tutorial
Simple but useful Python web scraping tutorial code.
Language: Jupyter Notebook 806 41 5545
