spider

There are 2,757 repositories under the spider topic.

  • NaiboWang/EasySpider

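    A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual browser automation testing, data collection, and crawler application that lets you design and run crawler tasks graphically without writing code. Alias: ServiceWrapper, an intelligent service wrapping system for web applications.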

    Language:JavaScript27.5k1903733.2k
  • gocolly/colly

    Elegant Scraper and Crawler Framework for Golang

    Language:Go22.5k3305441.7k
  • jhao104/proxy_pool

    Python ProxyPool for web spider

    Language:Python20.5k4455995k
  • shengqiangzhang/examples-of-web-crawlers

    Some interesting examples of Python crawlers that are friendly to beginners, mainly crawling sites such as Taobao, Tmall, WeChat, WeChat Read, Douban, and QQ.

    Language:Python13.6k3471133.8k
  • crawlab-team/crawlab

    Distributed web crawler admin platform for managing spiders, regardless of language or framework.

    Language:Go10.9k2118911.8k
  • s0md3v/Photon

    Incredibly fast crawler designed for OSINT.

    Language:Python10.6k3221041.5k
  • guyueyingmu/avbook

    Adult video (AV) management system; crawlers for avmoo, javbus, and javlibrary; online AV library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

    Language:PHP9.3k3421362k
  • ssssssss-team/spider-flow

    A next-generation crawler platform that defines crawler flows graphically, so crawlers can be built without writing code.

    Language:Java9.2k95421.8k
  • kangvcar/InfoSpider

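    INFO-SPIDER is a crawler toolbox 🧰 that brings together many data sources, designed to help users retrieve their own data safely and quickly. The tool's code is open source and its process is transparent. Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, ** Mobile, ** Unicom, ** Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blog, Open Source ** blog, and Jianshu.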

    Language:Python7.6k180391.5k
  • andeya/pholcus

    Pholcus is distributed, high-concurrency crawler software written in pure Go.

    Language:Go7.5k455891.7k
  • Evil0ctal/Douyin_TikTok_Download_API

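    🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.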

    Language:Python7.4k583701.2k
  • luyishisi/Anti-Anti-Spider

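    More and more websites have anti-crawler features: some hide key data in images, others use user-hostile CAPTCHAs. This is a code repository for anti-anti-crawler techniques, built to improve skills by contending (without malice) with websites that have different characteristics. (Submissions of hard-to-scrape websites are welcome.) (Project paused for work reasons.)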

    Language:Python7.3k448352.2k
  • bda-research/node-crawler

    Web Crawler/Spider for NodeJS + server-side jQuery ;-)

    Language:JavaScript6.6k256303879
  • lorien/awesome-web-scraping

    List of libraries, tools and APIs for web scraping and data processing.

    Language:Makefile6.4k2297775
  • BruceDone/awesome-crawler

    A collection of awesome web crawlers and spiders in different languages

  • SpiderClub/haipproxy

    :sparkling_heart: Highly available distributed IP proxy pool, powered by Scrapy and Redis

    Language:Python5.4k20697916
  • tophubs/TopList

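    Today's Hot List (今日热榜): an aggregator site that collects trending headlines from major popular websites, written in Go and using multiple goroutines to fetch information quickly and asynchronously. Preview: https://mo.fish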

    Language:Go4.7k10876952
  • niespodd/browser-fingerprinting

    Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

    Language:JavaScript3.9k688221
  • 201206030/novel-plus

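    novel-plus is a full-featured, multi-platform (PC, WAP) novel CMS for reading. Features include novel recommendations, search, rankings, reading, bookshelf, comments, a novel crawler, member center, author area, top-up and subscription, and news publishing.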

    Language:Java3.6k5401.3k
  • elliotgao2/toapi

    Every web site provides APIs.

    Language:Python3.5k7754236
  • ihmily/DouyinLiveRecorder

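    Live-stream recording software that can run unattended in a loop and record multiple streamers. It supports recording from Douyin, TikTok, Kuaishou, Huya, Douyu, Bilibili, Xiaohongshu, pandatv, afreecatv, flextv, popkontv, twitcasting, winktv, Baidu, Weibo, Kugou, Huajiao, Liuxing, and other platforms, and captures live-stream source URLs from multiple platforms.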

    Language:Python3.5k22392405
  • Gerapy/Gerapy

    Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

    Language:Python3.2k124211627
  • wechatsync/Wechatsync

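    Sync articles to multiple content platforms with one click. Supports Toutiao, WordPress, Zhihu, Jianshu, Juejin, CSDN, typecho, and other major platforms: publish once, sync to every platform. Frees up personal productivity.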

    Language:JavaScript3.2k2495489
  • my8100/scrapydweb

    Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI. DEMO :point_right:

    Language:Python3k75181548
  • wkunzhi/Python3-Spider

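    Python crawlers in practice: simulated login to major websites, including but not limited to slider CAPTCHAs, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️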

    Language:Python2.9k95241k
  • JAVClub/core

    🔞 JAVClub - keep your "big sisters" from getting lost

    Language:JavaScript2.8k10632342
  • CharlesPikachu/DecryptLogin

    DecryptLogin: APIs for logging in to some websites using requests.

    Language:Python2.8k6279748
  • jumper2014/lianjia-beike-spider

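    House price crawler for Lianjia and Beike, collecting house price data (residential communities, second-hand homes, rentals, new homes) for 21 major ** cities including Beijing, Shanghai, Guangzhou, and Shenzhen. Stable, reliable, and fast! Supports storage as CSV, MySQL, MongoDB, Excel, and JSON; supports Python 2 and 3; displays data in charts; richly commented. Stars appreciated. For learning and reference only; do not use for commercial purposes, at your own risk.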

    Language:Python2.7k9542685
  • DormyMo/SpiderKeeper

    Admin UI for Scrapy / open-source Scrapinghub

    Language:Python2.7k10790502
  • Boris-code/feapder

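    🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. It has four built-in spiders (AirSpider, Spider, TaskSpider, and BatchSpider) to meet the needs of different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The crawler management system feaplat provides convenient deployment and scheduling for it.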

    Language:Python2.7k38160458
  • shiyanhui/dht

    BitTorrent DHT Protocol && DHT Spider.

    Language:Go2.7k12259491
  • DedSecInside/TorBot

    Dark Web OSINT Tool

    Language:Python2.7k101102508
  • wnma3mz/wechat_articles_spider

    Crawler for WeChat official account articles

    Language:Python2.7k7352692
  • jae-jae/QueryList

    :spider: The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

    Language:PHP2.6k76160443
  • howie6879/owllook

    owllook: a novel search engine

    Language:Python2.6k10888747