webscraping
There are 8827 repositories under the webscraping topic.
huginn/huginn
Create agents that monitor and act on your behalf. Your agents are standing by!
mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
assafelovic/gpt-researcher
LLM-based autonomous agent that conducts local and web research on any topic and generates a comprehensive report with citations.
pystardust/ani-cli
A CLI tool to browse and play anime
lorien/awesome-web-scraping
List of libraries, tools and APIs for web scraping and data processing.
alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
getmaxun/maxun
🔥 Open-source no-code web data extraction platform. Turn websites to APIs and spreadsheets with no-code robots in minutes! [In Beta]
niespodd/browser-fingerprinting
Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
anaskhan96/soup
Web Scraper in Go, similar to BeautifulSoup
scrapoxy/scrapoxy
Scrapoxy is a super proxies manager that orchestrates all your proxies into one place, rather than spreading management across multiple scrapers. It manages IP rotation and fingerprinting, and smartly routes traffic to avoid bans.
D4Vinci/Scrapling
Undetectable, Lightning-Fast, and Adaptive Web Scraping for Python
TheWebScrapingClub/webscraping-from-0-to-hero
The web scraping open project repository aims to share knowledge and experiences about web scraping with Python
itsOwen/CyberScraper-2077
A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama
reworkd/tarsier
Vision utilities for web interaction agents 👀
jamesturk/scrapeghost
👻 Experimental library for scraping websites using OpenAI's GPT API.
requests-cache/requests-cache
Persistent HTTP cache for Python requests
m8sec/CrossLinked
LinkedIn enumeration tool to extract valid employee names from an organization through search engine scraping
holgerd77/django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
raznem/parsera
Lightweight library for scraping websites with LLMs
daijro/camoufox
🦊 Anti-detect browser
mov-cli/mov-cli
Watch everything from your terminal.
maxhumber/gazpacho
🥫 The simple, fast, and modern web scraping library
Skallwar/suckit
Suck the InTernet
benibela/xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
chris-greening/instascrape
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
z0m31en7/Uscrapper
Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.
wodsuz/EasyApplyJobsBot
A Python bot that automatically applies to all LinkedIn, Glassdoor, etc. Easy Apply jobs based on your preferences. Auto login, auto fill additional questions, apply automatically!
TheCodeMonks/NYTimes-App
🗽 A Simple Demonstration of the New York Times App 📱 using Jsoup web crawler with MVVM Architecture 🔥
adrianhajdin/pricewise
Dive into web scraping and build a Next.js 13 eCommerce price tracker within a single video that teaches you data scraping, cron jobs, sending emails, deployment, and more.
jchao01/TradingView-data-scraper
Extract price and indicator data from TradingView charts to create ML datasets
openaustralia/morph
Take the hassle out of web scraping
EZ-hwh/AutoScraper
Official implementation of the paper "AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation"
ayushi7rawat/Youtube-Projects
This repository contains all the code I use in my YouTube tutorials.
roniemartinez/dude
dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators
openzim/zimit
Make a ZIM file from any Web site and surf offline!