website-scraper

There are 108 repositories under the website-scraper topic.

  • website-scraper/node-website-scraper

    Download website to local directory (including all css, images, js, etc.)

    Language:JavaScript1.6k46199280
  • imthaghost/goclone

    Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.

    Language:Go1.4k2938295
  • josephlimtech/linkedin-profile-scraper-api

    🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

    Language:TypeScript5781038156
  • z0m31en7/Uscrapper

    Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.

    Language:Python5255753
  • website-scraper/website-scraper-puppeteer

    Plugin for website-scraper which returns HTML for dynamic websites using Puppeteer.

    Language:JavaScript326141980
  • Kooboo/Kooboo

    CMS, WebSite, Application and Ecommerce Development Tool Using JavaScript

    Language:C#3172861100
  • html2rss/html2rss-web

    🕸 Generates RSS feeds of any website & serves to the web! Automatic scraping. Ready to use configs. Write your own. Rolling Docker releases for speedy updates.

    Language:Ruby9642711
  • erlange/wbm-dl

    Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine.

    Language:C#905716
  • OSINT-TECHNOLOGIES/dpulse

    DPULSE - Tool for a complex approach to domain OSINT

    Language:Python852794
  • xarantolus/Collect

    A server to collect & archive websites that also supports video downloads

    Language:TypeScript8452311
  • LexiestLeszek/scrapeGPT

    ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.

    Language:Python764010
  • MLArtist/WebScraper

    Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.

    Language:Python722018
  • shurco/goClone

    🌱 goClone - clone websites in seconds

    Language:Go66222
  • CRAKZOR/linkedin-post-automator

    Automatically curates and posts content to LinkedIn. It can optionally use web scraping to gather data, which is then fed to ChatGPT to craft engaging LinkedIn posts.

    Language:Python613719
  • website-scraper/node-website-scraper-phantom

    Plugin for website-scraper which returns HTML for dynamic websites using PhantomJS.

    Language:JavaScript595314
  • yuis-ice/jseval

    Evaluate JavaScript on a URL through headless Chrome browser.

    Language:JavaScript25301
  • vlmaier/marvel-snap-scrapr

    Scraper for https://marvelsnapzone.com to retrieve metadata of Marvel SNAP cards.

    Language:Python21278
  • faheel/file-extensions

    JSON collection of scraped file extensions, along with their description and type, from FileInfo.com

    Language:Python18516
  • jeanrauwers/followers-scraper-serverless

    Now you can keep track of your followers from YouTube, Instagram and Twitter accounts - Followers scraper API on AWS serverless

    Language:TypeScript18352
  • Ashwin-op/Email-Extractor

    A spider to crawl webpages

    Language:Python16204
  • cometolearnofficial/WebHawk

    Website Penetration Testing Tool With Dos Attack Feature

    Language:Python16201
  • dtflare/GPTparser

    Use GPTparser with your OpenAI API to scrape & parse files into structured JSON files.

    Language:Python13100
  • orangmuda/SECTOOL

    sᴇᴀʀᴄʜ ᴇɴɢɪɴᴇ sᴄʀᴀᴘᴇʀ ᴛᴏᴏʟ (ʙᴀsʜ)

    Language:Shell12104
  • website-scraper/website-scraper-existing-directory

    Plugin for website-scraper which allows saving resources to an existing directory

    Language:JavaScript11425
  • dann1/ndown

    Bandwidth efficient scheduled downloads

    Language:Shell10310
  • dsc8x/node-scraper

    Scraping websites made easy! A minimalistic yet powerful tool for collecting data from websites.

    Language:JavaScript90
  • nigeld3v/Tumblr_Image_scrape

    Download ALL the images (JPEG/GIF/PNG) from any Tumblr website! This project employs Python3 and BeautifulSoup4 to scrape a Tumblr site (with the URL provided by the user) and download, page by page, all the images from the Tumblr site's posts. Ideal for archiving other people's Tumblrs <3

    Language:Python9004
  • codassassin/website-url-scraper

    This is a website URL scraper built using Python.

    Language:Python8101
  • methylDragon/news-anaCrawler

    Article Dataset Generator for Internet News Sites. Crawls news sites, analyses them with NLP (sentiment analysis), and pushes to a database.

    Language:Jupyter Notebook8501
  • SamuraiPolix/openbible-verse-scraper

    This script scrapes the verses and references from an openbible.info page into a JSON file - if needed, we use bible-api.com to translate to another bible version.

    Language:Python8153
  • jasniec/WebsiteParser

    Simple library which parses web pages into objects using attributes

    Language:C#7201
  • thenurhabib/linkext

    A Python script that automatically collects links from a web page.

    Language:Python7101
  • austinjoyal/website-scraper

    Scrapes any website to retrieve all hyperlinks from it in a matter of seconds. Scraping made easy!

    Language:Python6211
  • Sachinart/alexa-rank-checker

    Alexa Bulk Website Rank Checker PHP Script 2020 Latest! You can grab the website rankings of 200+ URLs at once!

    Language:CSS6312
  • ajaygithub2/yellow-pages-scraper

    A script for scraping the yellowpages.com website for name, contact, address and link

    Language:Python5200
  • Hatim315/Manhua-Manga-Manhwa_Downloader

    This script downloads manhua, manga or manhwa and saves them in a directory of the same name.

    Language:Python5101