crawl

There are 293 repositories under the crawl topic.

  • kangvcar/InfoSpider

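    INFO-SPIDER is a crawler toolbox 🧰 that brings many data sources together, designed to help users retrieve their own data safely and quickly. The code is open source and the process is transparent. Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, ** Mobile, ** Unicom, ** Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blogs, Open Source ** blogs, and Jianshu.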

    Language:Python8.1k182411.5k
  • 201206030/novel-plus

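    novel-plus is a full-featured novel CMS for multi-platform (PC, WAP) reading. Features include novel recommendations, search, rankings, reading, bookshelf, comments, a novel crawler, a member center, an author area, top-up subscriptions, and news publishing.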

    Language:Java4.3k5501.4k
  • wkunzhi/Python3-Spider

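    Python crawlers in practice: simulated login for major websites, including but not limited to slider CAPTCHA, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️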

    Language:Python3.3k95241k
  • ReaJason/xhs

    A request wrapper built on the Xiaohongshu web client. https://reajason.github.io/xhs/

    Language:Python1.8k17137409
  • coder-hxl/x-crawl

    Flexible Node.js AI-assisted crawler library

    Language:TypeScript1.8k1326109
  • ArchiveTeam/grab-site

    The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

    Language:Python1.5k41208145
  • zhuweiyou/weixin-game-helper

    A collection of helpers for WeChat mini-games (加减大师, 包你懂我, 大家来找茬腾讯版, 头脑王者, 好友画我, 悦动音符, 我最在行, 星途WeGoing, 猜画小歌, 知乎答题王, 腾讯**象棋, 跳一跳, 题多多黄金版)

    Language:JavaScript1.4k7646383
  • darbra/sperm

    A roundup of excellent reverse-engineering articles I have read; worth a look

  • LoseNine/Crack-JS-Spider

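    JS reverse engineering: cracking the encrypted parameters used by JS anti-crawler protections. Cracked so far: Geetest slider w (2022.2.19), QQ Music sign (2022.2.13), Pinduoduo anti_content, Boss Zhipin zp_token, Zhihu x-zse-96, Kugou kg_mid/dfid, Vipshop mars_cid, ** Court Judgments website (updated 2020-06-30), Taobao password, Tianan Insurance login, Bilibili login, Fang.com login, WPS login, Weibo login, Youdao Translate, NetEase login, WeChat Official Account login, Kongzhong login, Jingoal login, student information management system login, Gongying Finance login, Chongqing Science and Technology Resource Sharing Platform login, NetEase Cloud Music download, one-click video link parsing, and Cailian Press login.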

    Language:JavaScript954266256
  • rugantio/fbcrawl

    A Facebook crawler

    Language:Python6844665226
  • liip/TheA11yMachine

    The A11y Machine is an automated accessibility testing tool which crawls and tests pages of any web application to produce detailed reports.

    Language:JavaScript627747167
  • markowanga/stweet

    Advanced Python library to scrape Twitter (tweets, users) via an unofficial API

    Language:Python610135668
  • philschmid/clipper.js

    HTML to Markdown converter and crawler.

    Language:TypeScript5914738
  • zkqiang/zhihu-login

    Simulated Zhihu login, with support for extracting CAPTCHAs and saving cookies

    Language:Python3612620141
  • yaroslaff/nudecrawler

    Crawl telegra.ph searching for nudes!

    Language:Python3348727
  • darbra/geetest

    geetest, slider CAPTCHA

    Language:Python312280168
  • zhangslob/awesome_crawl

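    Tencent News, Zhihu topics, Weibo followers, a Tumblr crawler, Douyu bullet comments (danmaku), a Meizitu crawler, distributed design, and more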

    Language:Python296100109
  • spatie/laravel-site-search

    Create a full-text search index by crawling your site

    Language:PHP29251224
  • oppsec/Pinkerton

    🕵️ JavaScript file crawler and secret finder tool developed with Python

    Language:Python2875643
  • RevoltSecurities/SpideyX

    SpideyX is a multipurpose web penetration testing tool with asynchronous, concurrent performance and multiple modes and configurations.

    Language:Python1792332
  • scrapyman/data-api

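    Scrapyman data API service. Covers Taobao, Xiaohongshu, Tongcheng Travel, JD.com, Douyin (e-commerce), Meituan, Douyin (video), Kuaishou, Pugongying, Xingtu, Pinduoduo, WeChat Official Accounts, Dianping, Bilibili, Zhihu, Weibo, Beike, Bigo, Temu, Lazada, Shopee, SHEIN, Baidu Index, Ctrip, Boss Zhipin, Zhaopin, Lagou, Toutiao, Facebook, YouTube, Instagram, and Twitter. Crawler, data collection, scrapy, interface, API.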

  • adamdehaven/fetchurls

    A bash script to spider a site, follow links, and fetch urls (with built-in filtering) into a generated text file.

    Language:Shell13261344
  • dli98/geetest

    Slider CAPTCHA; hope it helps you ❤️

    Language:Python13161237
  • ArchiveTeam/wget-lua

    Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

    Language:C129201816
  • glouw/andvaranaut

    A dungeon crawler

    Language:C12113912
  • WwwwwyDev/crawlist

    A universal solution for crawling lists on web pages.

    Language:Python117101
  • monkey-soft/Scrapy_IPProxyPool

    Free IP proxy pool. A plugin for the Scrapy crawler framework

    Language:Python1034639
  • zhao94254/pspider

    A simple distributed crawler framework

    Language:Python101318
  • jgravelle/groqcrawl

    GroqCrawl is a powerful and user-friendly web crawling and scraping application built with Streamlit and powered by PocketGroq. It provides an intuitive interface for extracting LLM-friendly content from websites, with support for single-page scraping, multi-page crawling, and site mapping.

    Language:Python941124
  • zongdeiqianxing/WebSecurityArticles

    Crawls and organizes high-quality "web security" articles from sites such as Freebuf, Anquanke, Xianzhi, and Knownsec

    Language:Python845020
  • zkqiang/crawler-chrome-extensions

    Chrome extensions commonly used by crawler developers

  • mangenotwork/gathertool

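    gathertool is a script-oriented Golang development library intended to speed up development for the scenarios it covers; it includes a lightweight crawler library, an API testing and stress testing library, a DB operations library, and more.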

    Language:Go532213
  • Swader/diffbot-php-client

    [Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

    Language:PHP5395320
  • chieund/crawl

    Teach daily is a web crawler written in Go that crawls dev.to, freecodecamp.com, medium.com, hashnode.com, logrocket.com, and infoq.com

    Language:Go441118
  • Bin-Huang/NodeSpider

    [DEPRECATED] Simple, flexible, delightful web crawler/spider package

    Language:TypeScript37304
  • handong0123/cmd-toutiao

    A slacking-off tool: read Toutiao from the command line

    Language:Python35103