Neo4j ETL Components

Data import from relational databases to Neo4j.

Download & Install

Download and unzip the latest neo4j-etl-components.zip. Once unzipped, download the JDBC driver for your database (see the vendor URLs under Prerequisites) and add it to the lib folder.
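The driver step can be scripted. The following is a minimal sketch, not part of the tool: the helper name install_jdbc_driver and the paths in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# Copy a vendor JDBC driver jar into the lib folder of an unzipped
# neo4j-etl-components distribution.
# install_jdbc_driver is a hypothetical helper, not shipped with the tool.
install_jdbc_driver() {
    etl_home=$1      # directory created by unzipping neo4j-etl-components.zip
    driver_jar=$2    # JDBC driver jar downloaded from the vendor

    if [ ! -d "$etl_home/lib" ]; then
        echo "lib folder not found under $etl_home" >&2
        return 1
    fi
    if [ ! -f "$driver_jar" ]; then
        echo "driver jar not found: $driver_jar" >&2
        return 1
    fi
    cp "$driver_jar" "$etl_home/lib/"
}

# Example (both paths are placeholders):
#   install_jdbc_driver ./neo4j-etl-components ./downloads/driver.jar
```

Checking for the lib folder first catches the common mistake of pointing the helper at the wrong directory before anything is copied.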

Examples of command usage:

Minimal command line
./bin/neo4j-etl { mysql | postgresql | oracle } export --url <url> --user <user> --password <pwd> \
 --destination $NEO4J_HOME/data/databases/graph.db/ --import-tool $NEO4J_HOME/bin  \
 --csv-directory /tmp/northwind
Full set of command line options
./bin/neo4j-etl { mysql | postgresql | oracle } export --url <url> --user <user> --password <pwd> \
--database <database> --destination $NEO4J_HOME/data/databases/graph.db/ \
--import-tool $NEO4J_HOME/bin --csv-directory /tmp  \
--options-file ./import-tool-options.json \
--force --debug

For detailed usage, see also the tool documentation.
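When the same export is run repeatedly, it can help to assemble the minimal command from variables and inspect it before running it. The sketch below only prints the command. The helper name build_export_cmd is hypothetical, and the JDBC-style URL in the usage comment is a placeholder that assumes --url takes a JDBC connection URL.

```shell
#!/bin/sh
# Print the minimal neo4j-etl export command for a given vendor.
# build_export_cmd is a hypothetical helper; it never runs the tool.
# It expects NEO4J_HOME to be set in the environment.
build_export_cmd() {
    vendor=$1; url=$2; user=$3; password=$4; csv_dir=$5

    # Accept only the vendors listed in the usage line above.
    case "$vendor" in
        mysql|postgresql|oracle) ;;
        *) echo "unsupported vendor: $vendor" >&2; return 1 ;;
    esac

    cmd="./bin/neo4j-etl $vendor export --url $url --user $user --password $password"
    cmd="$cmd --destination $NEO4J_HOME/data/databases/graph.db/"
    cmd="$cmd --import-tool $NEO4J_HOME/bin --csv-directory $csv_dir"
    printf '%s\n' "$cmd"
}

# Example (URL and credentials are placeholders):
#   build_export_cmd mysql jdbc:mysql://localhost:3306/northwind neo4j secret /tmp/northwind
```

Printing rather than executing keeps the sketch safe to try; values containing spaces would need extra quoting before the printed line is pasted into a shell.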

Quick Build Instructions

To run the command line tool, clone the project and build it locally:

git clone https://github.com/neo4j-contrib/neo4j-etl-components
cd neo4j-etl-components
mvn clean package -DskipTests

Once the build has completed successfully, the scripts are located in the neo4j-etl-cli/bin directory.
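Before building, it may be worth confirming that the required tools are on the PATH. This is a small sketch under the assumption that the build needs git, mvn, and java; the helper name missing_tools is hypothetical.

```shell
#!/bin/sh
# Print the name of every given tool that is not found on PATH,
# one per line. Prints nothing when all tools are available.
# missing_tools is a hypothetical helper, not part of the project.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (the tool list is an assumption about the build):
#   missing=$(missing_tools git mvn java)
#   [ -z "$missing" ] || echo "please install: $missing"
```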

License

This tool is licensed under the GPLv3.

Issues & Feedback & Contributions

We welcome all contributions.

  • Please first raise an issue so that we can discuss your ideas and avoid duplicate work

  • Send your changes as a pull request

  • Make sure to provide tests with your changes

Prerequisites

Vendor       JDBC Driver URL
MySQL        http://dev.mysql.com/downloads/connector/j/
PostgreSQL   https://jdbc.postgresql.org/download.html
Oracle       http://www.oracle.com/technetwork/database/features/jdbc/default-2280470.html

Tests

Northwind database scripts are downloaded from https://code.google.com/archive/p/northwindextended/downloads

Integration Tests

You can run the tests with Maven against a local instance, Vagrant, or AWS; the commands are listed below.

You will need an RDBMS user neo4j with password neo4j and admin privileges to run the tests.

To run the tests in AWS, you’ll need an AWS IAM user.

To run the tests using a local instance:

Grant all privileges to the user neo4j, identified by the password neo4j:

MySQL:

CREATE USER 'neo4j'@'localhost' IDENTIFIED BY 'neo4j';
GRANT ALL PRIVILEGES ON *.* TO 'neo4j'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;

PostgreSQL:

CREATE USER neo4j WITH SUPERUSER ENCRYPTED PASSWORD 'neo4j';

Oracle:

CREATE USER neo4j IDENTIFIED BY neo4j;
GRANT DBA TO neo4j;

For Oracle, you’ll need to add the Oracle Driver to your local Maven repository manually:

mvn install:install-file -Dfile={Path/to/your/ojdbc7.jar} \
      -DgroupId=com.oracle -DartifactId=ojdbc7 -Dversion=12.1.0.2 -Dpackaging=jar

Microsoft SQL Server:

Currently the integration tests use the default user "sa".

To run the tests locally:

mvn -P integration-test clean test

To run the tests using Vagrant:

mvn -P integration-test -D PLATFORM=vagrant clean test

To run the tests using AWS:

Note: You need to create an AWS key pair and have an AWS credentials file in place to do this.

mvn -P integration-test -D PLATFORM=aws -D EC2_SSH_KEY=<name of your EC2 SSH key> clean test
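The three variants above differ only in the PLATFORM and EC2_SSH_KEY properties, so the choice can be wrapped in one function. The sketch below prints the matching Maven command; the helper name integration_test_cmd is hypothetical, and the key name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Print the Maven command that runs the integration tests on a platform.
# integration_test_cmd is a hypothetical helper; the commands it prints
# are the ones listed above.
integration_test_cmd() {
    platform=${1:-local}   # local (default), vagrant, or aws
    ssh_key=${2:-}         # name of the EC2 SSH key, required for aws

    case "$platform" in
        local)
            printf '%s\n' "mvn -P integration-test clean test" ;;
        vagrant)
            printf '%s\n' "mvn -P integration-test -D PLATFORM=vagrant clean test" ;;
        aws)
            if [ -z "$ssh_key" ]; then
                echo "aws requires the name of an EC2 SSH key" >&2
                return 1
            fi
            printf '%s\n' "mvn -P integration-test -D PLATFORM=aws -D EC2_SSH_KEY=$ssh_key clean test" ;;
        *)
            echo "unknown platform: $platform" >&2
            return 1 ;;
    esac
}

# Example (the key name is a placeholder):
#   integration_test_cmd aws my-ec2-key
```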

Performance Tests

The performance tests are part of the neo4j-etl-it module.

They are skipped by default when you run the integration-test profile. You can run them separately as their own test suite.

To run performance tests locally:

mvn -P performance-test clean dependency:copy-dependencies test -D failIfNoTests=false -D EC2_SSH_KEY=<name of your EC2 SSH key>

To run performance tests in AWS:

mvn -P performance-test clean dependency:copy-dependencies test -D PLATFORM=aws -D failIfNoTests=false -D EC2_SSH_KEY=<name of your EC2 SSH key>