This script uses the JSON config file generated by the import.io app to automate website data scraping with Ruby.
- Set up your extractor
- Save the extractor configuration to a JSON file
- Scrape
require 'json'      # needed for JSON.parse below
require 'importer'
config = JSON.parse(File.read('my_import_io_extractor_config.json'))
url_set = [
"http://www.example.com/search",
"http://www.example.com/search?page=2",
"http://www.example.com/search?page=3",
"http://www.example.com/search?page=4",
]
# Today's date, used to stamp the output file name
date = Time.now.strftime('%Y-%m-%d')
Importer.scrape config: config, url_set: url_set, write_to: "com.example.www_#{date}.csv"
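
For longer result sets, the URL list above could be generated instead of typed out by hand. The following is only a sketch: `paginated_urls` is a hypothetical helper (not part of `importer`), and it assumes the target site pages its results with a `page` query parameter and accepts `page=1` for the first page.

```ruby
require 'uri'

# Hypothetical helper, not part of the importer library.
# Builds one URL per page number by setting a "page" query parameter,
# keeping any other query parameters already present in base_url.
def paginated_urls(base_url, pages)
  pages.map do |page|
    uri = URI.parse(base_url)
    params = URI.decode_www_form(uri.query || '')
    params.reject! { |key, _| key == 'page' } # drop any existing page value
    params << ['page', page.to_s]
    uri.query = URI.encode_www_form(params)
    uri.to_s
  end
end

url_set = paginated_urls('http://www.example.com/search', 1..4)
```

The resulting `url_set` can be passed to `Importer.scrape` in the same way as the handwritten list.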