website-scraper

There are 112 repositories under the website-scraper topic.

website-scraper/node-website-scraper
Download a website to a local directory (including all CSS, images, JS, etc.)
Language: JavaScript 1.6k 45 201292
goclone-dev/goclone
Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.
Language: Go 1.6k 30 40330
z0m31en7/Uscrapper
Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.
Language: Python 705 5 877
josephlimtech/linkedin-profile-scraper-api
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.
Language: TypeScript 688 14 38171
Kooboo/Kooboo
CMS, WebSite, Application and Ecommerce Development Tool Using JavaScript
Language: C# 338 27 61103
website-scraper/website-scraper-puppeteer
Plugin for website-scraper that returns HTML for dynamic websites using Puppeteer
Language: JavaScript 330 13 1980
OSINT-TECHNOLOGIES/dpulse
DPULSE - Tool for complex approach to domain OSINT
Language: Python 150 2 816
html2rss/html2rss-web
🕸 Generates RSS feeds of any website & serves to the web! Automatic scraping. Ready to use configs. Write your own. Rolling Docker releases for speedy updates.
Language: Ruby 108 3 2913
erlange/wbm-dl
Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine.
Language: C# 99 5 716
xarantolus/Collect
A server to collect & archive websites that also supports video downloads
Language: TypeScript 86 5 2312
CRAKZOR/linkedin-post-automator
Automatically curates and posts content to LinkedIn. It can optionally use web scraping to gather data, which is then fed to ChatGPT to craft engaging LinkedIn posts.
Language: Python 85 2 721
MLArtist/WebScraper
Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.
Language: Python 83 1 019
shurco/goClone
🌱 goClone - clone websites in seconds
Language: Go 83 2 24
LexiestLeszek/scrapeGPT
ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.
Language: Python 82 4 012
website-scraper/node-website-scraper-phantom
Plugin for website-scraper that returns HTML for dynamic websites using PhantomJS.
Language: JavaScript 59 4 314
vlmaier/marvel-snap-scrapr
Scraper for https://marvelsnapzone.com to retrieve metadata of Marvel SNAP cards.
Language: Python 26 1 78
yuis-ice/jseval
Evaluate JavaScript on a URL through a headless Chrome browser.
Language: JavaScript 25 2 01
faheel/file-extensions
JSON collection of scraped file extensions, along with their description and type, from FileInfo.com
Language: Python 19 4 16
jeanrauwers/followers-scraper-serverless
Now you can keep track of your followers from YouTube, Instagram and Twitter accounts - Followers scraper API on AWS serverless
Language: TypeScript 19 2 52
Ashwin-op/Email-Extractor
A spider to crawl webpages
Language: Python 16 2 04
cometolearnofficial/WebHawk
Website Penetration Testing Tool With DoS Attack Feature
Language: Python 15 1 01
dtflare/GPTparser
Use GPTparser with your OpenAI API to scrape & parse files into structured JSON files.
Language: Python 14 1 00
website-scraper/website-scraper-existing-directory
Plugin for website-scraper that allows saving resources to an existing directory
Language: JavaScript 13 3 25
orangmuda/SECTOOL
sᴇᴀʀᴄʜ ᴇɴɢɪɴᴇ sᴄʀᴀᴘᴇʀ ᴛᴏᴏʟ (ʙᴀsʜ)
Language: Shell 12 1 04
dann1/ndown
Bandwidth efficient scheduled downloads
Language: Shell 10 2 10
dsc8x/node-scraper
Scraping websites made easy! A minimalistic yet powerful tool for collecting data from websites.
Language: JavaScript 10 3 00
nigeld3v/Tumblr_Image_scrape
Download ALL the images (JPEG/GIF/PNG) from any Tumblr website! This project uses Python3 and BeautifulSoup4 to scrape a Tumblr site (with the URL provided by the user) and download, page by page, all the images from the site's posts. Ideal for archiving other people's Tumblrs <3
Language: Python 9 0 04
codassassin/website-url-scraper
This is a website URL scraper built using Python.
Language: Python 8 1 01
methylDragon/news-anaCrawler
Article Dataset Generator for Internet News Sites. Crawls news sites, analyses them with NLP (sentiment analysis), and pushes to a database.
Language: Jupyter Notebook 8 4 01
SamuraiPolix/openbible-verse-scraper
This script scrapes the verses and references from an openbible.info page into a JSON file - if needed, we use bible-api.com to translate to another bible version.
Language: Python 8 1 53
arpitgoswami/webautomation
This repository includes simple traffic automation for Blogger websites that anybody can use by configuring and executing the PHP file. This automation can be used for website penetration testing and for various other ethical processes. The code uses many different types of user agents to make itself completely anonymous.
Language: PHP 7 3 09
jasniec/WebsiteParser
Simple library that parses web pages into objects using attributes
Language: C# 7 1 01
thenurhabib/linkext
A Python script for automatically collecting links from a web page.
Language: Python 7 1 01
austinjoyal/website-scraper
Scrapes any website to retrieve all hyperlinks from it in a matter of seconds. Scraping made easy!
Language: Python 6 1 11
Sachinart/alexa-rank-checker
Alexa Bulk Website Rank Checker PHP Script 2020 Latest! You can grab the website rankings of 200+ URLs at once!
Language: CSS 6 2 12
HXCKYR/Osint-Eye
OSINT Tool for Email id (breached) | Website Scraper | Instagram id | Linkedin Profile or .... ; )
Language: Python 5 1 10
