scrape

There are 610 repositories under the scrape topic.

twintproject/twint
An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
Language: Python 16.3k 329 1.2k 2.8k
alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Language: Python 7k 121 67713
d60/twikit
Twitter API Scraper | Without an API key | Twitter Internal API | Free | Twitter scraper | Twitter Bot
Language: Python 3.7k 38 294439
Anorov/cloudflare-scrape
A Python module to bypass Cloudflare's anti-bot page.
Language: Python 3.5k 125 397452
microlinkhq/metascraper
Get unified metadata from websites using Open Graph, Microdata, RDFa, Twitter Cards, JSON-LD, HTML, and more.
Language: HTML 2.6k 15 253182
any4ai/AnyCrawl
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
Language: TypeScript 2.4k 15 25237
trevorhobenshield/twitter-api-client
Implementation of X/Twitter v1, v2, and GraphQL APIs
Language: Python 1.9k 28 239249
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers, user info, images...
Language: Python 1.2k 17 151245
glebarez/cero
Scrape domain names from SSL certificates of arbitrary hosts
Language: Go 686 8 991
markowanga/stweet
Advanced Python library to scrape Twitter (tweets, users) from an unofficial API
Language: Python 614 13 5669
austinoboyle/scrape-linkedin-selenium
`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles & company pages, turning the data into structured JSON.
Language: HTML 508 25 90167
drudge/n8n-nodes-puppeteer
n8n node for browser automation using Puppeteer
Language: TypeScript 435 4 4971
unixfox/pupflare
A webpage proxy that makes requests through Chromium (Puppeteer). It can be used to bypass Cloudflare anti-bot / anti-DDoS checks from any application (like curl).
Language: JavaScript 420 14 3282
ScriptSmith/instamancer
Scrape Instagram's API with Puppeteer
Language: TypeScript 408 19 3761
danieldotnl/ha-multiscrape
Home Assistant custom component for scraping multiple values (HTML, XML or JSON) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.
Language: Python 378 7 17718
yaroslaff/nudecrawler
Crawl telegra.ph searching for nudes!
Language: Python 342 8 726
Anonyfox/elixir-scrape
Scrape any website, article or RSS/Atom Feed with ease!
Language: Elixir 332 15 2041
ultralytics/google-images-download
Google/Bing Images Web Downloader
Language: Python 314 4 1392
andrewstuart/goq
A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the goquery library
Language: Go 268 9 1221
JMousqueton/ransomware.live
🏴‍☠️💰 Another Ransomware gang tracker
Language: Python 265 10 15859
JaredLGillespie/proxyscrape
Python library for retrieving free proxies (HTTP, HTTPS, SOCKS4, SOCKS5).
Language: Python 261 17 1655
evyatarmeged/Humanoid
Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
Language: JavaScript 234 3 1127
essamamdani/search-result-scraper-markdown
This project provides a powerful web scraping tool that fetches search results and converts them into Markdown format using FastAPI, SearXNG, and Browserless. It includes the capability to use proxies for web scraping and handles HTML content conversion to Markdown efficiently.
Language: Python 226 1 113
rocketlaunchr/google-search
Scrape Google search results
Language: Go 180 3 1233
oxylabs/scrape-google-python
In this tutorial, we showcase how to scrape public Google data with Python and Oxylabs API.
178 1 01
tegridydev/auto-md
Convert Files / Folders / GitHub Repos Into AI / LLM-ready Files
Language: Python 162 10 125
Jimut123/jimutmap
API to quickly get an enormous number of high-resolution satellite images from satellites.pro through multi-threading! Create your own map dataset. Bringing data to Humans.
Language: Python 155 5 1218
meetyan/raise
A simple (and unofficial) GitHub Trending client that lives in your menubar.
Language: JavaScript 148 2 01
html2rss/html2rss
📰 Build RSS 2.0 feeds from websites (and JSON APIs) automatically or with a few CSS selectors.
Language: Ruby 134 3 5410
luengwaiban/instagram-python-scraper
An Instagram scraper written in Python. Similar to instagram-php-scraper. Usage examples are in example.py. Enjoy it!
Language: Python 131 7 512
DrKain/scrape-youtube
A lightning fast package to scrape YouTube search results
Language: JavaScript 121 7 5031
fefit/visdom
A library that uses a jQuery-like API for HTML parsing, node selection, and node mutation, suitable for web scraping and HTML confusion.
Language: Rust 113 2 237
badoux/goscraper
Golang pkg to quickly return a preview of a webpage (title/description/images)
Language: Go 110 1 442
jgravelle/groqcrawl
GroqCrawl is a powerful and user-friendly web crawling and scraping application built with Streamlit and powered by PocketGroq. It provides an intuitive interface for extracting LLM-friendly, AI-consumable content from websites, with support for single-page scraping, multi-page crawling, and site mapping.
Language: Python 99 2 130
ndgigliotti/shopify-spy
Extract structured data from Shopify websites.
Language: Python 99 4 350
SilentDemonSD/FZBypassBot
An Elegant, Fast, Multi-Threaded Bypass Bot for Bigger Deeds. Try Now!!
Language: Python 96 2 10166
