# A Scraper class for Cloudflare Workers
The Scraper class is a tool for web scraping in a Cloudflare Worker environment. It provides an intuitive API for extracting HTML, text, and attributes from web pages, and supports both single operations and chaining multiple operations together.
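To make the API shape concrete, here is an illustrative sketch of how such a class can look. This is not the real implementation (the actual class runs inside a Worker and parses HTML with Cloudflare's streaming APIs); `MiniScraper` and its naive regex-based `text` extractor are stand-ins so the sketch stays self-contained:

```javascript
// Illustrative sketch of the API shape only — NOT the real internals.
// A static async factory lets construction await a network fetch,
// which a plain constructor cannot do.
class MiniScraper {
  static async create(url, fetchHtml) {
    const scraper = new MiniScraper(fetchHtml);
    await scraper.fetch(url);
    return scraper;
  }

  // fetchHtml is injectable so the class can be exercised without a
  // network; by default it uses the global fetch available in Workers.
  constructor(fetchHtml = async (url) => (await fetch(url)).text()) {
    this.fetchHtml = fetchHtml;
    this.body = '';
  }

  async fetch(url) {
    this.body = await this.fetchHtml(url);
    return this;
  }

  // Extracts the inner text of the first <tag> element. A naive regex
  // stands in for the real CSS-selector support, purely for illustration.
  async text(tag) {
    const match = this.body.match(new RegExp(`<${tag}[^>]*>([^<]*)</${tag}>`));
    return match ? match[1] : null;
  }
}
```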
You can create a Scraper instance using the static create method:
```javascript
const scraper = await Scraper.create('https://example.com');
```

Alternatively, you can use the constructor and the fetch method:
```javascript
const scraper = new Scraper();
await scraper.fetch('https://example.com');
```

To extract the HTML content of an element:
```javascript
const html = await scraper.html('div.content');
```

To extract the text content of an element:
```javascript
const text = await scraper.text('p.description');
```

You can also specify options:
```javascript
const text = await scraper.text('p.description', { spaced: true });
```

To extract an attribute from an element:
```javascript
const href = await scraper.attribute('a.link', 'href');
```

You can chain multiple operations together:
```javascript
const results = await scraper.chain()
  .html('div.content')
  .text('p.description')
  .attribute('a.link', 'href')
  .getResult();
```

For compatibility with the original API, you can use the querySelector method:
```javascript
const text = await scraper.querySelector('p.description').getText();
```

The Scraper class includes built-in error handling for common issues, including Cloudflare-specific errors. It's recommended to wrap your scraping operations in try-catch blocks:
```javascript
try {
  const scraper = await Scraper.create('https://example.com');
  const content = await scraper.text('div.content');
} catch (error) {
  console.error('Scraping failed:', error.message);
}
```
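The chainable API shown earlier can be built by accumulating operations and resolving them together in getResult. Here is a minimal sketch of that pattern; `ScrapeChain` and the result keying are illustrative assumptions, not the class's real internals, and the `scraper` argument is any object exposing html, text, and attribute methods:

```javascript
// Illustrative sketch (not the real internals): each call records an
// operation and returns `this`, which is what makes the calls chainable;
// getResult then runs every recorded operation against the scraper.
class ScrapeChain {
  constructor(scraper) {
    this.scraper = scraper; // object exposing html/text/attribute
    this.ops = [];
  }

  html(selector) {
    this.ops.push({ op: 'html', selector });
    return this;
  }

  text(selector, options) {
    this.ops.push({ op: 'text', selector, options });
    return this;
  }

  attribute(selector, attr) {
    this.ops.push({ op: 'attribute', selector, attr });
    return this;
  }

  // Results are keyed by selector here for simplicity; two operations on
  // the same selector would collide in this sketch.
  async getResult() {
    const results = {};
    for (const { op, selector, options, attr } of this.ops) {
      results[selector] =
        op === 'attribute'
          ? await this.scraper.attribute(selector, attr)
          : await this.scraper[op](selector, options);
    }
    return results;
  }
}
```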