importSQL

A configurable and re-usable python script to import data from an import.io extractor into an SQL database

Dependencies

You will need to install the following:

  1. python
  2. python-mysqldb (used as an SQL client only; see the quick check below)
  3. A running SQL server (SQLite, MySQL, etc.)
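
To verify the client library is installed, a quick check like the following should connect and print your server version (just a sketch; substitute your own credentials):

import MySQLdb

# Connect with the same credentials the script will use; an ImportError
# or connection error here means the dependencies above are not ready yet
conn = MySQLdb.connect(host="localhost", port=3306, user="root", passwd="root")
print(conn.get_server_info())
conn.close()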

Running it

To run it you have a few options, but first you need to have your table and database ready.

Run the following commands to set up the demo table:

mysql>

CREATE DATABASE ILOVEOWLS;
USE ILOVEOWLS;

CREATE TABLE OMFGowls
(
field_image varchar(255),
field_name varchar(255),
field_price varchar(255),
field_size varchar(255),
field_colour varchar(255)
);
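
If you would rather do this setup from Python than the mysql shell, the following sketch creates the same database and table (assuming root/root credentials on localhost):

import MySQLdb

# Create the demo database and table used throughout this README
conn = MySQLdb.connect(host="localhost", user="root", passwd="root")
cur = conn.cursor()
cur.execute("CREATE DATABASE IF NOT EXISTS ILOVEOWLS")
cur.execute("""
    CREATE TABLE IF NOT EXISTS ILOVEOWLS.OMFGowls (
        field_image varchar(255),
        field_name varchar(255),
        field_price varchar(255),
        field_size varchar(255),
        field_colour varchar(255)
    )
""")
conn.close()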

Using a config file

  • Using a config file, you can set up your configuration once and re-use it to get the latest data into your database table

  • Make sure you know your import.io user credentials

  • Create a config.json file that follows this pattern (the // comments are for illustration only; strip them out, since standard JSON does not allow comments):

{
	// Config for import.io
	"sourceUUID": "94cdc938-c24e-42db-b94f-3fb852c450a9",
	"inputUrl": "http://owlkingdom.com",
	"ioUserID": "[your User ID]",
	"ioAPIKey": "[your API key]",

	// Config for mysql
	"table": "OMFGowls",
	"database": "ILOVEOWLS",
	"host": "localhost",
	"port": 3306,
	"username": "root",
	"password": "root",
 
	"mapping": {
		"image": "field_image",
		"name": "field_name",
		"price": "field_price",
		"size": "field_size",
		"colour": "field_colour"
	}
}
  • Put it in the same directory as your importSQL script.
  • RUN IT! importSQL [optional: -U [sql username] -P [sql password] -u [io user ID] -p [io API key]]

The JSON file above will grab the owls from Owl Kingdom and put them into your SQL table.
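
For reference, a config file like this can be loaded with a few lines of Python; the sketch below strips the illustrative // comments first, since standard JSON does not allow them (the real script's internals may differ):

import json
import re

def load_config(path="config.json"):
    with open(path) as f:
        raw = f.read()
    # Drop whole-line // comments, which are not valid JSON
    cleaned = re.sub(r"^\s*//.*$", "", raw, flags=re.MULTILINE)
    return json.loads(cleaned)

config = load_config()
print(config["table"], config["mapping"])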

mapping

The mapping field defines the mapping between the column names in import.io and the column names in your MySQL database:

"mapping": {
   	// import.io	// MySQL
	"image": 	"field_image",
	"name": 	"field_name",
	"price": 	"field_price",
	"size": 	"field_size",
	"colour": 	"field_colour"
}
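
To illustrate what the mapping does, here is a hypothetical sketch of how it could drive the INSERT statement, translating import.io keys into MySQL column names (not necessarily the script's exact implementation):

import MySQLdb

mapping = {
    "image": "field_image",
    "name": "field_name",
    "price": "field_price",
    "size": "field_size",
    "colour": "field_colour",
}

def insert_rows(conn, table, rows):
    io_cols = list(mapping.keys())
    columns = ", ".join(mapping[c] for c in io_cols)
    placeholders = ", ".join(["%s"] * len(io_cols))
    sql = "INSERT INTO %s (%s) VALUES (%s)" % (table, columns, placeholders)
    cur = conn.cursor()
    # Each row is a dict keyed by the import.io column names
    for row in rows:
        cur.execute(sql, [row.get(c) for c in io_cols])
    conn.commit()

conn = MySQLdb.connect(host="localhost", user="root", passwd="root", db="ILOVEOWLS")
insert_rows(conn, "OMFGowls", [
    {"image": "owl.jpg", "name": "Barn Owl", "price": "9.99",
     "size": "M", "colour": "brown"},
])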

Getting Data from Crawlers

To get all the data from your import.io crawl into your SQL database, you can use the -c option to turn on crawler mode. This gets the data from the last crawl snapshot rather than directly querying the crawler using an input URL.

(If you have settings in a config file, they will be loaded, but overridden by anything you supply on the command line.)

importSQL -c -s "your crawler guid" [optional: -U [sql username] -P [sql password] -u [io user ID] -p [io API key]]
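
As a rough illustration, the two fetch modes could look like the sketch below, using the requests library; the endpoint URLs are assumptions modelled on the legacy import.io REST API, not a documented guarantee:

import requests

API_BASE = "https://api.import.io/store"

def fetch_rows(cfg, crawler_mode=False):
    auth = {"_user": cfg["ioUserID"], "_apikey": cfg["ioAPIKey"]}
    if crawler_mode:
        # -c: read the last crawl snapshot instead of querying live (assumed path)
        url = "%s/crawler/%s/snapshot" % (API_BASE, cfg["sourceUUID"])
        resp = requests.get(url, params=auth)
    else:
        # Live query of the source against the configured input URL
        url = "%s/connector/%s/_query" % (API_BASE, cfg["sourceUUID"])
        params = dict(auth, input="webpage/url:" + cfg["inputUrl"])
        resp = requests.get(url, params=params)
    resp.raise_for_status()
    return resp.json().get("results", [])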

Using commandline options

  • When using just commandline options, be aware that the script will assume the column names from import.io match the column names in MySQL

Here is the list of commandline options you can use (a hypothetical parsing sketch follows the list):

  • -U mysql username (default: root)

  • -P mysql password (default: root)

  • -H mysql host name (default: localhost)

  • -E mysql port number (default: 3306)

  • -t mysql table name

  • -d mysql database name

  • -u import.io userID

  • -p import.io APIKey

  • -s source UUID

  • -i input URL for data source

  • -c flag indicating that the source you want data from is an uploaded crawl
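
For clarity, here is a hypothetical sketch of how these flags map onto an argument parser (the actual script may parse them differently):

import argparse

parser = argparse.ArgumentParser(prog="importSQL")
parser.add_argument("-U", default="root", help="mysql username")
parser.add_argument("-P", default="root", help="mysql password")
parser.add_argument("-H", default="localhost", help="mysql host name")
parser.add_argument("-E", type=int, default=3306, help="mysql port number")
parser.add_argument("-t", help="mysql table name")
parser.add_argument("-d", help="mysql database name")
parser.add_argument("-u", help="import.io userID")
parser.add_argument("-p", help="import.io APIKey")
parser.add_argument("-s", help="source UUID")
parser.add_argument("-i", help="input URL for data source")
parser.add_argument("-c", action="store_true",
                    help="the source you want data from is an uploaded crawl")
args = parser.parse_args()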