SnakyScraper is a lightweight and Pythonic web scraping toolkit built on top of BeautifulSoup and Requests. It provides an elegant interface for extracting structured HTML and metadata from websites with clean, direct outputs.
Fast. Accurate. Snake-style scraping.
- Extract metadata: title, description, keywords, author, and more
- Built-in support for Open Graph, Twitter Card, canonical, and CSRF tags
- Extract HTML structures: h1–h6, p, ul, ol, img, links
- Powerful filter() method with class, ID, and tag-based selectors
- return_html toggle to return clean text or raw HTML
- Simple return values: string, list, or dictionary
- Powered by BeautifulSoup4 and Requests
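
As a rough picture of what the Open Graph and Twitter Card bullets involve, the sketch below collects meta tags into a dictionary using only the standard library. It is not SnakyScraper's implementation (the library builds on BeautifulSoup); the MetaCollector class, the collect_meta helper, and the sample HTML are hypothetical names and data used purely for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> tags whose property/name starts with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Open Graph tags use property="og:...";
        # Twitter Card tags usually use name="twitter:...".
        key = attr_map.get("property") or attr_map.get("name")
        content = attr_map.get("content")
        if key and key.startswith(self.prefix) and content is not None:
            self.found[key] = content


def collect_meta(html, prefix):
    """Return a dict of matching meta keys to their content values."""
    parser = MetaCollector(prefix)
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical sample page, for illustration only.
sample = """
<html><head>
<meta property="og:title" content="Example OG Title">
<meta property="og:description" content="An example page">
<meta name="twitter:title" content="Example Tweet Title">
</head><body></body></html>
"""

og = collect_meta(sample, "og:")
twitter = collect_meta(sample, "twitter:")
print(og)
print(twitter)
```

The dictionary shape mirrors the kind of output the Quick Start shows for open_graph(), but the library's exact keys and behavior may differ.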
pip install snakyscraper

Requires Python 3.7 or later.
from snakyscraper import SnakyScraper
scraper = SnakyScraper("https://example.com")
# Get the page title
print(scraper.title()) # "Welcome to Example.com"
# Get meta description
print(scraper.description()) # "This is the example meta description."
# Get all <h1> elements
print(scraper.h1()) # ["Welcome", "Latest News"]
# Extract Open Graph metadata
print(scraper.open_graph()) # {"og:title": "...", "og:description": "...", ...}
# Custom filter: find all div.card elements and extract child tags
print(scraper.filter(
    element="div",
    attributes={"class": "card"},
    multiple=True,
    extract=["h1", "p", ".title", "#desc"]
))

scraper.title()
scraper.description()
scraper.keywords()
scraper.keyword_string()
scraper.charset()
scraper.canonical()
scraper.content_type()
scraper.author()
scraper.csrf_token()
scraper.image()

scraper.open_graph()
scraper.open_graph("og:title")
scraper.twitter_card()
scraper.twitter_card("twitter:title")

scraper.h1()
scraper.h2()
scraper.h3()
scraper.h4()
scraper.h5()
scraper.h6()
scraper.p()

scraper.ul()
scraper.ol()

scraper.images()
scraper.image_details()

scraper.links()
scraper.link_details()

Use filter() to target specific DOM elements and extract nested content.
scraper.filter(
    element="div",
    attributes={"id": "main"},
    multiple=False,
    extract=[".title", "#description", "p"]
)

scraper.filter(
    element="div",
    attributes={"class": "card"},
    multiple=True,
    extract=["h1", ".subtitle", "#meta"]
)

The extract argument accepts tag names, class selectors (e.g., .title), or ID selectors (e.g., #meta).
Output keys are automatically normalized: .title becomes class__title and #meta becomes id__meta.
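
The key normalization rule can be sketched in a few lines. This is only an illustration of the naming scheme described above, not the library's own code: normalize_key is a hypothetical helper, and leaving plain tag names unchanged is an assumption, since the README only states the class and ID cases.

```python
def normalize_key(selector):
    """Map an extract selector to an output key, following the rule above."""
    if selector.startswith("."):
        # Class selector: ".title" -> "class__title"
        return "class__" + selector[1:]
    if selector.startswith("#"):
        # ID selector: "#meta" -> "id__meta"
        return "id__" + selector[1:]
    # Assumption: plain tag names such as "h1" are used as keys unchanged.
    return selector


for sel in ["h1", ".title", "#meta"]:
    print(sel, "->", normalize_key(sel))
```

Under these assumptions, a result for extract=["h1", ".subtitle", "#meta"] would be keyed by h1, class__subtitle, and id__meta.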
You can also disable raw HTML output:
scraper.filter(
    element="p",
    attributes={"class": "dark-text"},
    multiple=True,
    return_html=False
)

scraper.title()
# "Welcome to Example.com"
scraper.h1()
# ["Main Heading", "Another Title"]
scraper.open_graph("og:title")
# "Example OG Title"

Contributions are welcome! Found a bug or want to request a feature? Please open an issue or submit a pull request.
MIT License © 2025 SnakyScraper
Think of it as your Pythonic sniper, targeting HTML content with precision and elegance.