crawler

There are 7,001 repositories under the crawler topic.

  • arachni

    Web Application Security Scanner Framework

    Language: Ruby · 3.8k
  • trafilatura

    Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

    Language: Python · 3.8k
  • weibo-crawler

    Sina Weibo crawler: uses Python to scrape Sina Weibo data and download Weibo images and videos

    Language: Python · 3.6k
  • toapi

    Every web site provides APIs.

    Language: Python · 3.5k
  • puppeteer-sharp

    Headless Chrome .NET API

    Language: C# · 3.5k
  • work_crawler

    Download tool for comics and novels. Comic sites: 腾讯漫画, 大角虫漫画, 有妖气, 咪咕, SF漫画, 哦漫画, 看漫画, 漫画柜, 汗汗酷漫, 動漫伊甸園, 快看漫画, 微博动漫, 733动漫网, 大古漫画网, 漫画DB, 無限動漫, 動漫狂, 卡推漫画, 动漫之家, 动漫屋, 古风漫画网, 36漫画网, 亲亲漫画网, 乙女漫画, webtoons, 咚漫, ニコニコ静画, ComicWalker, ヤングエースUP, モアイ, pixivコミック, サイコミ. Novel sites (output as EPUB): アルファポリス, カクヨム, ハーメルン, 小説家になろう, 起点中文网, 八一中文网, 顶点小说, 落霞小说网, 努努书坊, 笔趣阁.

    Language: JavaScript · 3.2k
  • RED_HAWK

    All-in-one tool for information gathering, vulnerability scanning, and crawling. A must-have tool for all penetration testers

    Language: PHP · 3.1k
  • Python3-Spider

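    Python crawlers in practice: simulated login for major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️

    Language: Python · 3.1k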
  • feapder

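    🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. It has four built-in spiders (AirSpider, Spider, TaskSpider, BatchSpider) for different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The crawler management system feaplat provides convenient deployment and scheduling.

    Language: Python · 3.1k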
  • TorBot

    Dark Web OSINT Tool

    Language: Python · 3k
  • crawlergo

    A powerful browser crawler for web vulnerability scanners

    Language: Go · 2.9k
  • lianjia-beike-spider

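    Housing price crawler for Lianjia and Beike: collects housing price data (residential communities, second-hand homes, rentals, new homes) for 21 major cities including Beijing, Shanghai, Guangzhou, and Shenzhen. Stable, reliable, and fast. Supports storage as CSV, MySQL, MongoDB, Excel, and JSON; supports Python 2 and 3; charts the data; richly commented. Stars are appreciated. For learning and reference only; do not use for commercial purposes. Use at your own risk.

    Language: Python · 2.9k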
  • DecryptLogin

    DecryptLogin: APIs for logging in to some websites using requests.

    Language: Python · 2.8k
  • owllook

    owllook: a novel search engine

    Language: Python · 2.7k
  • QueryList

    🕷 The elegant, progressive PHP crawler framework!

    Language: PHP · 2.7k
  • GoogleScraper

    A Python module to scrape several search engines (like Google, Yandex, Bing, DuckDuckGo, ...), with asynchronous networking support.

    Language: HTML · 2.7k
  • geziyor

    Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

    Language: Go · 2.6k
  • gospider

    Gospider - Fast web spider written in Go

    Language: Go · 2.6k
  • crawler

    An easy-to-use, powerful crawler implemented in PHP. Can execute JavaScript.

    Language: PHP · 2.6k
  • NGCBot

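    A WeChat bot based on a ✨HOOK mechanism. Supports 🌱scheduled security news pushes [FreeBuf, 先知, 安全客, 奇安信攻防社区], 👯KFC copywriting, ⚡vulnerability lookup, ⚡phone number location lookup, ⚡knowledge base queries, 🎉horoscope lookup, ⚡weather lookup, 🌱slacking-off calendar, ⚡微步 threat intelligence lookup, 🐛videos, ⚡images, and a 👯help menu. 📫 Also supports a points system, ⚡automatic group invites, 🌱automatic mass messaging, 👯AI replies, and ⚡WeChat Channels video parsing. 😄 Highly customizable and easy for beginners to pick up!

    Language: Python · 2.6k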
  • gecco

    Easy-to-use, lightweight web crawler

    Language: Java · 2.5k
  • grab

    Web Scraping Framework

    Language: Python · 2.4k
  • google-play-scraper

    Node.js scraper to get data from Google Play

    Language: JavaScript · 2.4k
  • FinalRecon

    All In One Web Recon

    Language: Python · 2.3k
  • abot

    Cross-platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

    Language: C# · 2.3k
  • news-please

    news-please - an integrated web crawler and information extractor for news that just works

    Language: Python · 2.1k
  • Leaked-GPTs

    Leaked GPTs prompts: bypass the 25-message limit or try out GPTs without a Plus subscription.

    Language: Python · 2.1k
  • gain

    Web crawling framework based on asyncio.

    Language: Python · 2k
  • gocrawl

    Polite, slim and concurrent web crawler.

    Language: Go · 2k
  • Crawler-Detect

    🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

    Language: PHP · 2k
  • rendora

    Dynamic server-side rendering using headless Chrome to effortlessly solve the SEO problem for modern JavaScript websites

    Language: Go · 2k
  • DXY-COVID-19-Crawler

    COVID-19/2019-nCoV Realtime Infection Crawler and API

    Language: Python · 2k
  • skycaiji

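    skycaiji (蓝天采集器) is a free, open-source crawler system. Data collection only requires pointing and clicking to edit rules. It runs locally, on virtual hosts, or on cloud servers; can collect almost any type of web page; integrates seamlessly with various CMS site-building programs; publishes data in real time without logging in; and runs fully automatically with no manual intervention. Among web big-data collection software, it is a fully cross-platform, cloud-based crawler system.

    Language: PHP · 2k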
  • mdcx

    Movie metadata scraper

    Language: Python · 2k
  • vulnx

    vulnx 🕷️ is an intelligent bot and shell that performs automatic injection and helps researchers detect security vulnerabilities in CMS systems. It can perform quick CMS security detection, information collection (including subdomain names, IP address, country information, organizational information, time zone, etc.), and vulnerability scanning.

    Language: Python · 1.9k
  • PSpider

    A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

    Language: Python · 1.8k