ETLToolSuite-EntityGenerator

This tool will generate Entity Files that can be used to load into Datasources.

Primary language: Java. License: Apache License 2.0 (Apache-2.0).

Author: Thomas DeSain


Example Prerequisites


Admin rights on the machine hosting Docker
Quick Start docker stack
ETL Client Docker
Data file and Mapping file from MappingGenerator Example
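
Before starting, it can help to confirm that Docker is available and that the ETL client container is up. The following is a minimal sketch, not part of this project: the function name check_prereqs is hypothetical, and it assumes the container is named etl-client, as in step 1 of the steps below.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with EntityGenerator.
# Assumes the ETL client container is named "etl-client" unless a
# different name is passed as the first argument.
check_prereqs() {
  local container="${1:-etl-client}"

  # Docker must be installed and on the PATH.
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found on PATH"
    return 1
  fi

  # The container must appear in the list of running containers.
  if ! docker ps --format '{{.Names}}' | grep -qx "$container"; then
    echo "container '$container' is not running"
    return 1
  fi

  echo "prerequisites look OK"
}
```

Run it on the Docker host as check_prereqs, or as check_prereqs my-container if your container has a different name.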


Steps:
This example was validated in Mac and AMI Linux terminals.
To see it in action, follow the guide for loading the NHANES dataset.

  1. Open a bash session in your ETL Client Docker container:
    docker exec -e COLUMNS="`tput cols`" -e LINES="`tput lines`" -ti etl-client bash
  2. Use git to clone this project into a directory of your choosing:
    git clone https://github.com/hms-dbmi/ETLToolSuite-EntityGenerator
  3. Navigate to the project's root directory:
    cd ETLToolSuite-EntityGenerator
  4. Make a directory to store your data:
    mkdir data
  5. Make a directory to store your mapping file:
    mkdir mappings
  6. Make a directory to store your processed data files:
    mkdir completed
  7. Copy the data file and mapping files generated in the MappingGenerator example
    (the commands below assume the MappingGenerator git project from that example was cloned next to this one; adjust the ../ETLToolSuite-MappingGenerator path if you cloned it elsewhere):
    cp ../ETLToolSuite-MappingGenerator/example/Asthma_Misior_GSE13168.txt data/
    cp ../ETLToolSuite-MappingGenerator/example/mapping.csv mappings/mapping.csv
    cp ../ETLToolSuite-MappingGenerator/example/mapping.csv.patient mappings/PatientMapping.csv
  8. Execute the following command to generate your I2B2 entities:
    java -jar EntityGenerator.jar -jobtype CSVToI2b2TM
  9. Navigate to the completed directory:
    cd completed
  10. List the directory's contents:
    ls -la
  11. Once the job has finished processing, this folder will contain the following files:
    I2B2.csv
    ConceptDimension.csv
    ObservationFact.csv
    ConceptCounts.csv
    TableAccess.csv
    PatientDimension.csv
    PatientTrial.csv
    PatientMapping.csv
  12. Exit the container:
    exit
  13. If the files listed in step 11 exist, you can now move on to loading the entity files into your database by following the loading readme.
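
Steps 4 through 7 and the check in step 11 can also be scripted. The sketch below is illustrative only: the function names stage_inputs and missing_entities are hypothetical and not part of EntityGenerator. The file names are taken from the steps above, and the default source path assumes the MappingGenerator project was cloned next to this one.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; the function names are hypothetical.

# Steps 4-7: create the working directories and copy the example inputs.
# $1 = path to the MappingGenerator example directory (assumed default below).
stage_inputs() {
  local src="${1:-../ETLToolSuite-MappingGenerator/example}"
  local f

  # Stop early if any expected input file is missing.
  for f in Asthma_Misior_GSE13168.txt mapping.csv mapping.csv.patient; do
    if [ ! -f "$src/$f" ]; then
      echo "missing input: $src/$f" >&2
      return 1
    fi
  done

  mkdir -p data mappings completed
  cp "$src/Asthma_Misior_GSE13168.txt" data/
  cp "$src/mapping.csv" mappings/mapping.csv
  cp "$src/mapping.csv.patient" mappings/PatientMapping.csv
}

# Step 11: print the name of every expected entity file that is absent.
# $1 = output directory (defaults to "completed"). Prints nothing if all exist.
missing_entities() {
  local dir="${1:-completed}"
  local f
  for f in I2B2.csv ConceptDimension.csv ObservationFact.csv ConceptCounts.csv \
           TableAccess.csv PatientDimension.csv PatientTrial.csv PatientMapping.csv; do
    [ -f "$dir/$f" ] || echo "$f"
  done
  return 0
}
```

From the project root, call stage_inputs before running the jar in step 8, then missing_entities completed afterwards. Empty output from the second call means all eight files from step 11 are present.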