Pinned Repositories
actor-whitepaper
This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.
apify-cli
The Apify command-line interface (CLI) helps you create, develop, build, and run Apify Actors, and manage the Apify cloud platform.
apify-mcp-server
Apify MCP server (tools for web scraping, data extraction, and automation)
crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
fingerprint-suite
Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
got-scraping
HTTP client made for scraping, based on got.
impit
Rust library for browser impersonation
mcp-server-rag-web-browser
An MCP server for the RAG Web Browser Actor
proxy-chain
Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
apify's Repositories
apify/super-scraper
Generic REST API for scraping websites. Drop-in replacement for ScrapingBee, ScrapingAnt, and ScraperAPI services. And it is open-source!
apify/xlsx-stream
JavaScript / Node.js library to stream data into an XLSX file
apify/idcac
I Don't Care About Cookies extension compiled for use with Playwright/Puppeteer
apify/apify-aggregator-template
🚀 Boilerplate and Starter for Next.js 12+, Tailwind CSS 3 and TypeScript ⚡️ Made with developer experience first: Next.js + TypeScript + ESLint + Prettier + Husky + Lint-Staged + Jest + Testing Library + Commitlint + VSCode + Netlify + PostCSS + Tailwind CSS
apify/apify-storage-local-js
Local emulation of the apify-client NPM package, which enables local use of the Apify SDK.
apify/langchain
⚡ Building applications with LLMs through composability ⚡
apify/llama-hub
A library of data loaders for LLMs made by the community -- to be used with GPT Index and/or LangChain
apify/actor-html-to-pdf
Apify Actor to convert an HTML string to PDF
apify/llama_index
LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data.
apify/node-html-to-text
Advanced HTML-to-text converter
apify/actor-scrapy-books-example
Example of a Python Scrapy project. It scrapes book data from https://books.toscrape.com/.
apify/phantomjs
Scriptable Headless WebKit
apify/waw-file-specification
Contains specification of the Web Automation Workflow (WAW) file.
apify/airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
apify/better-sqlite3
The fastest and simplest library for SQLite3 in Node.js.
apify/echo-standby-actor
An example Actor using Standby mode
apify/gecko-dev
A fork of the official repository of Mozilla Firefox, which is used for Apify's custom Firefox build.
apify/got
🌐 Human-friendly and powerful HTTP request library for Node.js
apify/langchainjs
apify/make-integrations-scraper
Scrapes the list of available integrations from Make
apify/pull-request-toolkit-action
A GitHub Action that makes sure each PR is correctly set up and has a milestone set.
apify/zapier-integrations-scraper
Scrapes the list of Zapier integrations from the Zapier website
apify/keboola-ex-apify
Apify extractor for Keboola Connection
apify/actor-integration-tests
This Apify Actor is used for integration tests.
apify/algolite
An implementation of Algolia that emulates its REST API
apify/apify-docs-1
This project is the home of Apify's documentation.
apify/apify.github.io
The top-level organization GitHub Pages site.
apify/aws-ecr-action
This Action allows you to create Docker images and push them to an ECR repository.
apify/release-pr-action
This action simplifies creating release PRs.
apify/s3-node-20-bug-repro
A temporary repository with a reproduction of a bug with uploads to AWS S3