Welcome. This project is nearing publication. Please follow this project for announcements, or e-mail me (guhli007 at umn.edu) and ask to be notified when the initial release is out.
This program takes genome files and the output of various programs and builds a graph database for querying genomic data across domains. The program is stable and the database structure is well defined.
Feel free to ask questions or make suggestions.
Please see the manual for more information.
An installation video is available. It also covers the initial steps of preparing the database pipeline. Setup is similar across all platforms. A quick-start PDF / PPT guide is currently being prepared.
Please see the FAQ for additional information if you are having trouble.
Download and run the appropriate programs to create the output. Please see the documentation for the types of files that are required.
File formats, and the attributes within them, are the most important factor in turning flat two-dimensional files into a connected graph database. The manual details this, and you can look at the example files if you have additional questions. We have striven to make ODG flexible and able to work with the output of several programs. Many output files (especially GFF and FASTA) can be converted to work with ODG relatively easily, since the conversion usually amounts to renaming a field. Many programs can do this, including nano on Linux and Mac OS X and Notepad2 on Windows, and any text editor with Find/Replace functionality will work. Windows' own Notepad can also work, although it often struggles to open larger text files such as GFF or FASTA files.
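For files too large to edit comfortably by hand, the field renaming described above can also be scripted. The following is a minimal sketch, not part of ODG: it renames one attribute key in the ninth column of a GFF3 file. The function names and the example keys (`gene_name`, `Name`) are hypothetical; the attribute names ODG actually expects are described in the manual.

```python
def rename_gff_attribute(line, old_key, new_key):
    """Rename one attribute key in column 9 of a single GFF3 line."""
    # Comment, directive, and blank lines pass through unchanged.
    if line.startswith("#") or not line.strip():
        return line
    columns = line.rstrip("\n").split("\t")
    # Anything that is not a 9-column feature line (for example, sequence
    # lines in an embedded FASTA section) is left untouched.
    if len(columns) != 9:
        return line
    attributes = []
    for pair in columns[8].split(";"):
        key, sep, value = pair.partition("=")
        if sep and key.strip() == old_key:
            pair = f"{new_key}={value}"
        attributes.append(pair)
    columns[8] = ";".join(attributes)
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(columns) + newline


def rename_in_file(src_path, dst_path, old_key, new_key):
    """Stream src_path to dst_path, renaming old_key to new_key on each line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(rename_gff_attribute(line, old_key, new_key))
```

A call such as `rename_in_file("genes_original.gff3", "genes.gff3", "gene_name", "Name")` writes a converted copy and leaves the original file untouched. Because the file is processed line by line, its size is not a concern.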
Download and extract one of the available releases.
Place the files for each of your genomes in the data/ directory, with one directory for each genome and genome version or accession (strain). For example:
data/
    Arabidopsis_thaliana_10/
        assembly.fasta
        genes.gff3
        proteins.fasta
        ipscan_results.tsv
    Oryza_sativa_2/
        assembly.fasta
        genes.gff3
        proteins.fasta
        ipscan_results.tsv
    E_coli_1/
        assembly.fasta
        genes.gff3
        proteins.fasta
        ipscan_results.tsv
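Before generating the database, it can help to confirm that every genome directory follows this layout. The sketch below is not part of ODG. It assumes the four file names from the example above are the ones you intend to supply; whether ODG requires all four for every genome is an assumption, so adjust `EXPECTED_FILES` to match the manual.

```python
from pathlib import Path

# File names copied from the example layout. Treating all four as
# mandatory is an assumption, not a documented ODG requirement.
EXPECTED_FILES = (
    "assembly.fasta",
    "genes.gff3",
    "proteins.fasta",
    "ipscan_results.tsv",
)


def find_missing_files(data_dir, expected=EXPECTED_FILES):
    """Map each genome directory name to the expected files it lacks."""
    missing = {}
    for genome_dir in sorted(Path(data_dir).iterdir()):
        if not genome_dir.is_dir():
            continue  # ignore stray files placed directly in data/
        absent = [name for name in expected
                  if not (genome_dir / name).is_file()]
        if absent:
            missing[genome_dir.name] = absent
    return missing
```

Calling `find_missing_files("data")` returns an empty dictionary when every genome directory is complete, and otherwise lists the missing files per directory.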
Example input files are found in the example-input-files subdirectory of this GitHub repository. They include a configuration file. You may place the configuration file in your ODG installation directory and the data files in the data directory, and then proceed with database generation (create-db.bat or create-db.sh).
Example output files (and input files, when necessary to generate output files) are provided here to give you an idea of what output you can expect from most command-line activities. The web interface is available in the Test Drive version of ODG: simply download it, run query-server.bat or query-server.sh, and connect to http://localhost:6789/ on your computer once it has started.
Command-line queries can also be run from the Test Drive version of ODG using odg.sh or odg.bat as the initial command. Please see the manual for more information.
On Windows, start the configuration program by double-clicking "odg-config.bat", then point your browser to http://localhost:33333 once the server has started, usually a few minutes after launch. To compile the database, once all the required programs have been run, run "odg-generate.bat" and wait; this usually takes from half an hour to a few hours, depending on the size of your dataset. To run the query server, run "odg-query.bat" and point your web browser to http://localhost:6789
On Mac and *nix systems, start the configuration program by running "./odg-config.sh", then point your browser to http://localhost:33333 once the server has started, usually a few minutes after launch. To compile the database, once all the required programs have been run, run "./odg-generate.sh" and wait; this usually takes from half an hour to a few hours, depending on the size of your dataset. To run the query server, run "./odg-query.sh" and point your web browser to http://localhost:6789
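Since the servers can take a few minutes to come up, a small polling helper can tell you when it is worth opening the browser. This is a sketch, not part of ODG: `wait_for_port` is a hypothetical helper that only checks whether something accepts TCP connections on the given port (33333 for configuration, 6789 for queries, as stated above). It does not verify that the web interface has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout_seconds=300.0, poll_interval=2.0):
    """Return True once host:port accepts a TCP connection, else False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            # A successful connect means a server is listening on the port.
            with socket.create_connection((host, port), timeout=poll_interval):
                return True
        except OSError:
            # Refused or timed out: give up if the deadline has passed,
            # otherwise pause briefly and try again.
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
```

For example, after launching the configuration script, `wait_for_port("localhost", 33333)` blocks for up to five minutes and returns `True` as soon as the port is reachable.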
Please see the user guide, available here
ODG is offered as two releases and should work on Windows, Mac, and *nix systems. *nix systems are preferred, but ODG is developed and tested primarily on Windows.
Java 1.8 is required. You likely have Java installed on your system. If not, you may download it (http://www.oracle.com/technetwork/java/javase/downloads/). ODG is expected to work with OpenJDK versions equivalent to Java 1.8, but has not been tested on them. ODG is memory-intensive; at least 8 GB of memory is recommended.
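If you are unsure which Java version is installed, the output of `java -version` can be checked with a short script. The sketch below is not part of ODG, and its function names are hypothetical. It handles both the legacy numbering scheme, where "1.8.0_181" means Java 8, and the newer scheme, where "11.0.2" means Java 11. Only Java 1.8 is stated as required above; whether later versions work is not covered here.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: "1.8.0_181" is Java 8. Newer scheme: "11.0.2" is Java 11.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(["java", "-version"],
                                capture_output=True, text=True, check=False)
    except OSError:
        return None  # java is not on the PATH or could not be started
    # The JVM normally prints its version banner to stderr.
    return parse_java_major(result.stderr + result.stdout)
```

A result of `8` from `installed_java_major()` corresponds to the Java 1.8 requirement; `None` means Java could not be found or its output was not recognized.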
You can view our releases here.
This version contains a completed database built from publicly available data and compiled information for Arabidopsis thaliana, Glycine max, Medicago truncatula, and Phaseolus vulgaris. It is provided so that you can try out the features of ODG and see whether it is right for you.
This version contains no test data and is ready to be populated with your genomes and data of choice. As of version 1.1.0, ODG is bundled as an installer. To download the installer, click on this link: odg_installer_1.1.0.jar
If you have issues, please check your Java version first (see above).
Copyright © 2017 Joseph Guhlin, Licensed under GPL v3 (see LICENSE file)