scrapegraph-sdk

🕷️ Official Scrapegraph API SDK: Effortlessly extract content from any website. AI-powered. 🤖 Hassle-free web scraping made simple.

Primary language: Jupyter Notebook. License: MIT.

🌐 ScrapeGraph AI SDKs


Official SDKs for the ScrapeGraph AI API - Intelligent web scraping and search powered by AI. Extract structured data from any webpage or perform AI-powered web searches with natural language prompts.

Get your API key!

Features

  • 🤖 SmartScraper: Extract structured data from webpages using natural language prompts
  • 🔍 SearchScraper: AI-powered web search with structured results and reference URLs
  • 📝 Markdownify: Convert any webpage into clean, formatted markdown
  • 🕷️ SmartCrawler: Intelligently crawl and extract data from multiple pages
  • 🤖 AgenticScraper: Perform automated browser actions with AI-powered session management
  • 📄 Scrape: Convert webpages to HTML with JavaScript rendering and custom headers
  • ⏰ Scheduled Jobs: Create and manage automated scraping workflows with cron scheduling
  • 💳 Credits Management: Monitor API usage and credit balance
  • 💬 Feedback System: Provide ratings and feedback to improve service quality

🚀 Quick Links

ScrapeGraphAI integrates with popular frameworks and tools to extend your scraping capabilities. Whether you're building with Python or Node.js, using LLM frameworks, or working with no-code platforms, there is an integration option for you.

You can find more information at the following link:

Integrations:

📦 Installation

Python

pip install scrapegraph-py

JavaScript

npm install scrapegraph-js

🎯 Core Features

  • 🤖 AI-Powered Extraction & Search: Use natural language to extract data or search the web
  • 📊 Structured Output: Get clean, structured data with optional schema validation
  • 🔄 Multiple Formats: Extract data as JSON, Markdown, or custom schemas
  • ⚡ High Performance: Concurrent processing and automatic retries
  • 🔒 Enterprise Ready: Production-grade security and rate limiting
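
To make the "optional schema validation" bullet concrete, here is a minimal sketch of checking an extracted record against a schema before using it. It relies only on standard-library dataclasses so that it runs without extra dependencies, and it is not the SDK's own validation mechanism. The `Product` schema, the `validate` helper, and the sample record are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Product:
    """Hypothetical schema for one extracted record."""
    name: str
    price: float
    in_stock: bool


def validate(raw: dict) -> Product:
    """Build a Product from raw, rejecting missing, extra, or mistyped fields."""
    expected = {f.name: f.type for f in fields(Product)}
    missing = expected.keys() - raw.keys()
    extra = raw.keys() - expected.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for name, typ in expected.items():
        value = raw[name]
        ok = isinstance(value, typ)
        if typ is float:
            # Accept ints where a float is expected, but never a bool.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not ok:
            raise TypeError(
                f"{name}: expected {typ.__name__}, got {type(value).__name__}"
            )
    return Product(
        name=raw["name"], price=float(raw["price"]), in_stock=raw["in_stock"]
    )


if __name__ == "__main__":
    # Sample record invented for this example, not real API output.
    print(validate({"name": "Widget", "price": 10, "in_stock": True}))
```

Failing fast on a malformed record keeps bad data out of downstream code. A schema library such as Pydantic offers the same guarantee with richer error reporting.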

🛠️ Available Endpoints

🤖 SmartScraper

Use AI to extract structured data from any webpage or HTML content with natural language prompts.

🔍 SearchScraper

Perform AI-powered web searches with structured results and reference URLs.

📝 Markdownify

Convert any webpage into clean, formatted markdown.

🕷️ SmartCrawler

Intelligently crawl and extract data from multiple pages with configurable depth and batch processing.
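
What "configurable depth" means can be shown with a small breadth-first traversal over an in-memory link graph. This is only an illustration of depth-limited crawling in general, not the service's implementation. The `LINKS` graph, the `crawl` function, and the `max_depth` and `max_pages` parameters are all hypothetical names.

```python
from collections import deque

# Hypothetical site map: each page lists the pages it links to.
LINKS = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api", "/"],
    "/blog": ["/blog/post-1"],
    "/docs/api": ["/docs/api/v1"],
    "/blog/post-1": [],
    "/docs/api/v1": [],
}


def crawl(start: str, max_depth: int, max_pages: int = 100) -> list[str]:
    """Return pages reachable from start within max_depth link hops, in BFS order."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue and len(order) < max_pages:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


if __name__ == "__main__":
    print(crawl("/", max_depth=1))
```

With `max_depth=1` the traversal stops at pages linked directly from the start page. Raising the limit by one adds the next layer of links, and `max_pages` caps the total regardless of depth.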

🤖 AgenticScraper

Perform automated browser actions on webpages using AI-powered agentic scraping with session management.

📄 Scrape

Convert webpages into HTML format with optional JavaScript rendering and custom headers.

⏰ Scheduled Jobs

Create, manage, and monitor scheduled scraping jobs with cron expressions and execution history.
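
Since scheduled jobs are driven by cron expressions, it can help to check an expression locally before submitting it. The sketch below expands a standard five-field expression (minute, hour, day of month, month, day of week) into the values each field matches. It assumes the common five-field syntax with `*`, lists, ranges, and steps. Which syntax the service actually accepts is not stated here, and `parse_cron` and `expand_field` are hypothetical helper names.

```python
# Allowed range for each field of a standard five-field cron expression.
FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 6),
]


def expand_field(spec: str, low: int, high: int) -> set[int]:
    """Expand one field such as '*', '5', '1-5', '*/15', or '1,3' into its values."""
    values = set()
    for part in spec.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError(f"bad step in {part!r}")
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(base)
        if not (low <= start <= end <= high):
            raise ValueError(f"{part!r} is outside {low}-{high}")
        values.update(range(start, end + 1, step))
    return values


def parse_cron(expr: str) -> dict[str, set[int]]:
    """Map each field name to the set of values the expression matches."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return {
        name: expand_field(spec, low, high)
        for spec, (name, low, high) in zip(parts, FIELDS)
    }


if __name__ == "__main__":
    # Every 15 minutes during the 9 o'clock hour, Monday to Friday.
    schedule = parse_cron("*/15 9 * * 1-5")
    print(sorted(schedule["minute"]), sorted(schedule["day of week"]))
```

Malformed input, such as a value out of range or the wrong number of fields, raises `ValueError`, so a typo is caught before a job is created.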

💳 Credits

Check your API credit balance and usage.

💬 Feedback

Send feedback and ratings for scraping requests to help improve the service.

🌟 Key Benefits

  • 📝 Natural Language Queries: No complex selectors or XPath needed
  • 🎯 Precise Extraction: AI understands context and structure
  • 🔄 Adaptive Processing: Works with both web content and direct HTML
  • 📊 Schema Validation: Ensure data consistency with Pydantic/TypeScript
  • ⚡ Async Support: Handle multiple requests efficiently
  • 🔍 Source Attribution: Get reference URLs for search results
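
As a rough illustration of the async bullet, the sketch below runs several requests concurrently while capping how many are in flight at once. It uses only the standard-library `asyncio` module. `fake_scrape` is a hypothetical stand-in that merely sleeps, so no real API call is made, and `scrape_all` and its `limit` parameter are likewise invented for this example.

```python
import asyncio


async def fake_scrape(url: str) -> dict:
    """Stand-in for a real API call; it only simulates network latency."""
    await asyncio.sleep(0.01)
    return {"url": url, "status": "done"}


async def scrape_all(urls: list[str], limit: int = 3) -> list[dict]:
    """Run fake_scrape over urls with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> dict:
        async with semaphore:
            return await fake_scrape(url)

    # gather returns results in the same order as the input urls.
    return await asyncio.gather(*(bounded(u) for u in urls))


if __name__ == "__main__":
    pages = [f"https://example.com/page/{i}" for i in range(5)]
    for result in asyncio.run(scrape_all(pages)):
        print(result["url"], result["status"])
```

The semaphore keeps a burst of requests from exceeding a rate limit, and replacing `fake_scrape` with a real async client call would leave the rest of the structure unchanged.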

💡 Use Cases

  • 🏢 Business Intelligence: Extract company information and contacts
  • 📊 Market Research: Gather product data and pricing
  • 📰 Content Aggregation: Convert articles to structured formats
  • 🔍 Data Mining: Extract specific information from multiple sources
  • 📱 App Integration: Feed clean data into your applications
  • 🌐 Web Research: Perform AI-powered searches with structured results

📖 Documentation

For detailed documentation and examples, visit:

💬 Support & Feedback

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ by ScrapeGraph AI