Automate website data scraping with Ruby

import.io Ruby scraper

This script uses the JSON config file generated by the import.io app to automate website data scraping with Ruby.

Usage

Prepare import.io config

  1. Set up your extractor
  2. Save the extractor configuration to a JSON file
  3. Scrape
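
Before scraping, it can help to confirm that the file saved in step 2 exists and parses as JSON. The following is a minimal sketch; `load_extractor_config` is a hypothetical helper, not part of this script, and it makes no assumptions about the keys import.io writes into the file.

```ruby
require 'json'

# Hypothetical helper (not provided by this script): reads the saved
# extractor config and fails early with a readable message.
def load_extractor_config(path)
  raise ArgumentError, "config file not found: #{path}" unless File.file?(path)

  JSON.parse(File.read(path))
rescue JSON::ParserError => e
  # Re-raise parse failures with the file name included.
  raise ArgumentError, "#{path} is not valid JSON: #{e.message}"
end

# Usage (the file name is the one used in the Scrape section):
# config = load_extractor_config('my_import_io_extractor_config.json')
```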

Scrape

require 'json'
require 'importer'

# Load the extractor configuration saved from import.io
config = JSON.parse(File.read('my_import_io_extractor_config.json'))

# Pages to scrape; the query string starts with "?"
url_set = [
  "http://www.example.com/search",
  "http://www.example.com/search?page=2",
  "http://www.example.com/search?page=3",
  "http://www.example.com/search?page=4",
]

# Date stamp for the output file name, e.g. 2015-01-31
date = Time.now.strftime('%Y-%m-%d')
Importer.scrape config: config, url_set: url_set, write_to: "com.example.www_#{date}.csv"
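
For longer result sets, the URL list can be generated rather than typed out. This is a sketch only: `paged_urls` is a hypothetical helper, and it assumes the target site paginates with a `page` query parameter and accepts `page=1` for the first page.

```ruby
require 'uri'

# Hypothetical helper (not part of this script): builds one URL per result
# page by appending a `page` query parameter to the base URL.
def paged_urls(base_url, pages)
  pages.map do |page|
    uri = URI.parse(base_url)
    # Keep any query parameters already present on the base URL.
    params = URI.decode_www_form(uri.query || '') << ['page', page.to_s]
    uri.query = URI.encode_www_form(params)
    uri.to_s
  end
end

url_set = paged_urls('http://www.example.com/search', 1..4)
```

The resulting `url_set` can be passed to `Importer.scrape` in place of the hand-written array.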