🕵️‍♂️ Scrape search pages using Puppeteer

puppeteer-search-scraper

Scrape Search Engines using Puppeteer

☁️ Installation

# Using npm
npm install --save puppeteer-search-scraper

# Using yarn
yarn add puppeteer-search-scraper

📋 Example

const SearchScraper = require("puppeteer-search-scraper");

// Show the browser window while scraping; set to true to run headless.
const headless = false;
SearchScraper.configure([
    {
        name: "GoogleCom"
      , debugDir: __dirname + "/public/GoogleCom"
      , searchUrl: "http://google.com/webhp?num=100"
      , limit: 100
      , selectors: SearchScraper.Selectors.GOOGLE_MOBILE
      , headless
      , device: "iPhone X"
    }
  , {
        name: "GoogleCoUk"
      , debugDir: __dirname + "/public/GoogleCoUk"
      , searchUrl: "http://google.co.uk/webhp?num=100"
      , limit: 100
      , selectors: SearchScraper.Selectors.GOOGLE
      , headless
    }
  , {
        name: "GoogleCoAu"
      , debugDir: __dirname + "/public/GoogleComAu"
      , searchUrl: "http://google.com.au/webhp?num=100"
      , limit: 100
      , selectors: SearchScraper.Selectors.GOOGLE
      , headless
    }
  , {
        name: "Bing"
      , debugDir: __dirname + "/public/Bing"
      , searchUrl: "http://bing.com/"
      , limit: 30
      , selectors: SearchScraper.Selectors.BING
      , headless
    }
])

const QUERY = "who killed kennedy";

(async () => {
    // Uncomment any of the blocks below to query the other configured engines.
    //console.log(">>>> Google.com");
    //console.log(await SearchScraper.search(QUERY, { engine: "GoogleCom" }));

    //console.log(">>>> Google.co.uk");
    //console.log(await SearchScraper.search(QUERY, { engine: "GoogleCoUk" }));

    //console.log(">>>> Google.com.au");
    //console.log(await SearchScraper.search(QUERY, { engine: "GoogleCoAu" }));

    console.log(">>>> Bing.com");
    console.log(await SearchScraper.search(QUERY, { engine: "Bing" }));
})();
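
To query several configured engines in one run instead of commenting calls in and out, a small loop helper can be used. This is a minimal sketch: `searchAll` is a hypothetical helper, not part of the library. It assumes only the `search(query, { engine })` call shape shown in the example above, and it takes the search function as an argument so it can be tried with a stub.

```javascript
// Hypothetical helper (not part of puppeteer-search-scraper).
// Runs one query against several engines, one after another, and
// collects either the results or the error message for each engine.
async function searchAll(searchFn, query, engines) {
    const out = {};
    for (const engine of engines) {
        try {
            const results = await searchFn(query, { engine });
            out[engine] = { ok: true, results };
        } catch (err) {
            // One failing engine should not abort the remaining ones.
            out[engine] = { ok: false, error: err.message };
        }
    }
    return out;
}
```

With the scrapers configured as above, it could be called as `searchAll((q, o) => SearchScraper.search(q, o), QUERY, ["GoogleCom", "Bing"])`. The engines are queried sequentially on purpose, so that only one search is in flight at a time.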

❓ Get Help

There are a few ways to get help:

  1. Please post questions on Stack Overflow. You can open issues with questions, as long as you add a link to your Stack Overflow question.
  2. For bug reports and feature requests, open issues. 🐛
  3. For direct and quick help, you can use Codementor. 🚀

📝 Documentation

register(c)

Params

  • Object c: The SearchScraper options along with the name of the scraper.

Return

  • SearchScraper The instance of the scraper.

getScraper(name)

Params

  • String name: The name of the scraper

Return

  • SearchScraper The instance of the scraper.
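
The relationship between `register` and `getScraper` can be pictured as a name-keyed registry: one call stores an entry under its name, the other looks it up. The sketch below is a generic illustration of that pattern and is not the library's code; the `ScraperRegistry` class is hypothetical, and it stores the plain options object, whereas the real methods are documented as returning a SearchScraper instance.

```javascript
// Generic name-keyed registry: an illustration only, NOT the library's code.
class ScraperRegistry {
    constructor() {
        this.scrapers = new Map();
    }

    // Store an options object under its `name` and return the stored entry.
    register(c) {
        if (!c || typeof c.name !== "string" || c.name === "") {
            throw new TypeError("register: options must include a non-empty name");
        }
        this.scrapers.set(c.name, c);
        return c;
    }

    // Look up a previously registered entry; fail loudly on unknown names.
    getScraper(name) {
        if (!this.scrapers.has(name)) {
            throw new Error(`No scraper registered under "${name}"`);
        }
        return this.scrapers.get(name);
    }
}
```

Failing loudly on an unknown name is a design choice in this sketch; the source does not say how the library behaves in that case.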

configure(conf)

Params

  • Array conf: An array of scraper option objects, one per engine, each including the name of the scraper (as in the example above).
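
Since each engine is addressed by its name later on, it can help to sanity-check the array before handing it to `configure`. The following is a minimal sketch; `validateConfigs` is a hypothetical helper, not part of the library. The field names (`name`, `searchUrl`, `limit`) come from the example above, but treating `searchUrl` as required and `limit` as optional is an assumption.

```javascript
// Hypothetical pre-flight check (not part of puppeteer-search-scraper).
// Returns a list of human-readable problems; an empty list means the
// array passed these basic checks.
function validateConfigs(conf) {
    if (!Array.isArray(conf)) {
        return ["conf must be an array"];
    }
    const problems = [];
    const seen = new Set();
    conf.forEach((c, i) => {
        if (!c || typeof c.name !== "string" || c.name === "") {
            problems.push(`entry ${i}: missing name`);
            return;
        }
        if (seen.has(c.name)) {
            problems.push(`entry ${i}: duplicate name "${c.name}"`);
        }
        seen.add(c.name);
        // Assumption: every entry needs an absolute http(s) search URL.
        if (typeof c.searchUrl !== "string" || !/^https?:\/\//.test(c.searchUrl)) {
            problems.push(`entry ${i} (${c.name}): searchUrl must be an http(s) URL`);
        }
        // Assumption: limit is optional, but must be a positive integer if set.
        if (c.limit !== undefined && !(Number.isInteger(c.limit) && c.limit > 0)) {
            problems.push(`entry ${i} (${c.name}): limit must be a positive integer`);
        }
    });
    return problems;
}
```

A caller could print the returned problems and skip the `configure` call when the list is not empty.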

puppeteerGoogleScraper(term, options)

Scrape Google using Puppeteer

Params

  • String term: The term to search.
  • Object options: An object containing:
    • limit (Number): The limit of the results (default: 100)
    • headless (Boolean): Whether the browser should be headless or not.

Return

  • Promise A promise resolving with an array of elements containing:
    • title (String)
    • url (String)
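
Because the resolved value is a plain array of `{ title, url }` objects, it can be post-processed with ordinary array code. As a minimal sketch, the hypothetical helper below (not part of the library) reports the position at which a given site first appears in the results; it relies only on the `url` field documented above and on the standard `URL` class.

```javascript
// Hypothetical helper (not part of puppeteer-search-scraper).
// Returns the 1-based position of the first result whose URL host equals
// `hostname` or is a subdomain of it, or -1 if no result matches.
function findRank(results, hostname) {
    const target = hostname.toLowerCase();
    for (let i = 0; i < results.length; i++) {
        let host;
        try {
            host = new URL(results[i].url).hostname.toLowerCase();
        } catch (err) {
            continue; // skip entries whose url is not a valid absolute URL
        }
        if (host === target || host.endsWith("." + target)) {
            return i + 1;
        }
    }
    return -1;
}
```

For example, `findRank(await SearchScraper.search(QUERY, { engine: "Bing" }), "wikipedia.org")` would give the rank of the first Wikipedia result, if any.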

😋 How to contribute

Have an idea? Found a bug? See how to contribute.

💖 Support my projects

I open-source almost everything I can, and I try to reply to everyone who needs help using these projects. Obviously, this takes time. You can integrate and use these projects in your applications for free! You can even change the source code and redistribute it (or even resell it).

However, if you get some profit from this or just want to encourage me to continue creating stuff, there are a few ways you can do it:

  • Starring and sharing the projects you like 🚀

  • Buy me a book: I love books! I will remember you after years if you buy me one. 😁 📖

  • PayPal: You can make one-time donations via PayPal. I'll probably buy a tea. 🍵

  • Support me on Patreon: Set up a recurring monthly donation and you will get interesting news about what I'm doing (things that I don't share with everyone).

  • Bitcoin: You can send me bitcoins at this address: 1P9BRsmazNQcuyTxEqveUsnf5CERdq35V6

Thanks! ❤️

📜 License

MIT © Ionică Bizău