crawler-engine

There are 51 repositories under the crawler-engine topic.

6677-ai/tap4-ai-crawler
The crawler open-sourced by tap4.ai
Language:Python202 2 1150
nuhmanpk/WebScrapper
Simple and powerful all-in-one Telegram bot to scrape / crawl webpages using Requests, html5lib and BeautifulSoup
Language:Python139 4 681
RevoltSecurities/SpideyX
SpideyX is a multipurpose web penetration testing tool with asynchronous, concurrent performance and multiple modes and configurations.
Language:Python118 2 323
namhong1412/browser-clone-web
Use browser to re-copy a web page
Language:Python22 2 16
bkeepers/spiderman
your friendly neighborhood web crawler
Language:Ruby18 5 04
fooock/robots.txt
:robot: robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API
Language:Java16 3 312
web-extractors/arachnid-seo-js
Web crawler for extracting internal site links info for SEO auditing & optimization purposes
Language:TypeScript15 3 02
Sobak/scrawler
Declarative, scriptable web robot (crawler) and scraper
Language:PHP10 3 01
wefindx/metadrive
Generic Interfaces to Addressable Objects
Language:Python9 5 81
wetrycode/tegenaria
Tegenaria is a crawler framework based on golang
Language:Go9 2 11
crawlbase/crawlbase-ruby
Fast Crawlbase API crawling library
Language:Ruby8 1 00
BaseMax/NetPHP
Useful functions for connecting to the network in PHP-based applications.
Language:PHP7 3 01
lichang98/visualize_spider
Visual crawler configuration and management based on Spring Boot and Scrapy
Language:HTML7 1 12
ShiqinHuo/wuhan_house_price_crawler
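Crawler for second-hand housing prices in the Optics Valley and Software Park areas of Wuhan's East Lake High-Tech Zone. Data source: 房天下 (Fang.com)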
Language:Jupyter Notebook7 1 02
spekulatius/spatie-crawler-cached-queue-example
Example to demonstrate the usage of cached queues across multiple requests.
Language:PHP7 4 00
supernebula/shark
Shark (Plunder): a configurable, plugin-based crawler engine and framework for secondary development.
Language:C#5 2 03
hseghetti/simple-crawler
Simple crawler using Apache Nutch and Elasticsearch
Language:Shell4 2 01
MCStreetguy/Crawler
An advanced web-crawler written in PHP.
Language:PHP4 3 00
andrrff/BugSearch
BugSearch is a search engine for pages indexed by the BugSearch.Crawler crawler. The project is divided into two parts: the Bot side and the Client side.
Language:C#3 1 50
Colaplusice/zhihu
Data mining experiment: scrapes user information and applies clustering and other processing
Language:Jupyter Notebook3 0 00
its-my-data/android-crawler-engine
An Android app crawling framework that makes automatic crawling of mobile apps super easy! (If possible, iOS will be supported after the Android version.)
3 2 00
KonghaYao/jspider
This is a JavaScript toolkit for browser crawler testing.
Language:JavaScript3 1 20
plugnsearch/plugnsearch
The only real pluggable crawler / spider / webcrawler to search the web for stuff you need to know.
Language:JavaScript3 2 02
takadev15/onecrawl-rs
Blazingly Fast, High-Performance, Scalable Web Crawler Engine 💨
Language:Rust3 1 00
johnvanderton/flysh
HTML type document parser based on jQuery and JSDOM
Language:TypeScript2 1 770
Keerthivasan13/Targeted_Advertising_Google_AdSense
Hybrid E-Marketing using Web Page Mining for Website Monetization
Language:TSQL2 1 04
kingzbauer/scraperlang
A DSL aimed at making writing web scrapers/crawlers a breeze
Language:Go2 3 00
rihenperry/whirlpool-urlfrontier
mercator scheme/rate-limiting/scheduling part of whirlpool project; handles crawler priority and politeness
Language:Java2 0 00
robincloud/robinbot
robin micro web crawling engine with nodejs
Language:JavaScript2 3 00
rrmerugu/trawler
A data gathering/trawling framework to search and get information from web sources like Bing
Language:Python2 3 52
runjia1987/crawler-engine
Crawler engine with HTTP, proxy, JS-Java interoperability, MQ task consumption, and dynamic crawler script execution. Supports distributed deployment.
Language:Java2 2 01
eyazdpour/DirectoryCrawler
Simple crawler for a directory (on Windows) which returns all possible information about whatever is in that given directory
Language:C++1 1 40
MaximeGuinard/Gtool-projects-crawler-seo
🤖 A Google extension that facilitates project management with various tools
Language:HTML1 1 0
paganini2008/greenfinger
A high-performance distributed web crawling framework based on the Spring Boot framework. It provides rich APIs for customizing business logic and embeds easily into your system.
Language:Java1 1 01
setulparmar/Landslide-Detection-and-Prediction
This project named "Landslide Detection and Prediction" was done during my summer internship under Visiting Associate Prof. Gagan Raj Gupta at IIT - Bhilai.
Language:Jupyter Notebook1 1 01
ShubhamThakurela/global-social-media-ms
Functionality to Extract Social data.
Language:Python1 1 00
