PeTaL (Periodic Table of Life)

The Periodic Table of Life (PeTaL, pronounced "petal") is a design tool that aims to let users move seamlessly from ideas (from nature or other sources) to design. PeTaL is built around a graph database, machine learning tools, and a website.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Clone PeTaL:

git clone https://github.com/nasa/PeTaL
cd PeTaL

First, install poetry:

curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python

Then use poetry to install the dependencies:

poetry install

PyTorch is packaged in a non-standard way and may need to be installed separately:

poetry run pip install torch==1.5.1+cpu torchvision==0.6.1+cpu -f https://download.pytorch.org/whl/torch_stable.html

Even so, an entry for it in pyproject.toml is still needed.

To run the PeTaL pipeline, use ./run config/default.json, or choose one of the other config files, such as airfoil_training.json or mock_ml_config.json (an integrated test).
Afterwards, the neo4j browser can be used to verify that the pipeline has run.
Pipelines must be cancelled manually, because they are designed to run as a server that constantly receives incoming data.
To run the PeTaL website, enter the site directory and run python manage.py runserver.
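
Because a pipeline keeps running until it is cancelled, it can be convenient to confirm that a config file at least parses before launching it. The following is a minimal sketch, assuming the config files are plain JSON as their extension suggests. The helper name load_config is hypothetical and not part of PeTaL, and the sketch makes no assumptions about which keys a config contains.

```python
import json
from pathlib import Path


def load_config(path):
    """Parse a pipeline config file and return its contents.

    Hypothetical helper, not part of PeTaL. Raises ValueError naming
    the offending file if the JSON is malformed.
    """
    path = Path(path)
    try:
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    except json.JSONDecodeError as error:
        raise ValueError(f"{path} is not valid JSON: {error}") from error


# Example (run from the repository root):
#   config = load_config("config/default.json")
```

A check like this only catches syntax errors. Whether the modules named in the config exist is still decided by the pipeline itself.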

Steps to get started if you are running it from scratch:

./run config/mock_species_articles.json # Step 1
# Press Ctrl-C once the database has enough articles for testing
# This creates ../data/index.html (see https://github.com/nasa/PeTaL/issues/28#issuecomment-649080792)
./run config/search.json # Step 2
# Wait until the index has been generated in the PeTaL/data directory
cd site
python manage.py runserver
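
The "wait until the index has been generated" step above can be scripted rather than checked by hand. This is a rough sketch using only the Python standard library. The function name wait_for_file and its default timings are assumptions made for illustration, and the index path is taken from the comment in the steps above.

```python
import time
from pathlib import Path


def wait_for_file(path, timeout=600.0, poll_interval=1.0):
    """Return True once `path` exists and is non-empty, False on timeout.

    Hypothetical helper, not part of PeTaL.
    """
    path = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        if path.is_file() and path.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Example (run from the site directory while the search pipeline runs):
#   if wait_for_file("../data/index.html"):
#       print("Index found; the website can be started.")
```

A non-empty file does not guarantee that the indexer has finished writing, so treat a True result as a hint rather than proof that the index is complete.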

Note that running the PeTaL pipeline requires an actively running neo4j server, with its URL and login information entered into a config file. The default configuration expects a neo4j bolt server running on port 7687, with the username "neo4j" and password "life". These passwords and URLs are stored in plain text, so it would be wise not to commit config files containing production passwords.
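
Before starting a pipeline, a quick check that something is listening on the bolt port can save a confusing failure. The sketch below uses only the standard library and assumes the default host and port described above. The function name bolt_port_open is hypothetical. It tests TCP reachability only and does not verify that the listener is neo4j or that the credentials are accepted.

```python
import socket


def bolt_port_open(host="localhost", port=7687, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper, not part of PeTaL. The defaults mirror the
    default configuration described in this README.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Example:
#   if not bolt_port_open():
#       print("No server reachable on localhost:7687; start neo4j first.")
```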

For specific instructions on extending PeTaL through pipeline modules, read this documentation.

Guides

Setting up the development environment

Deployment

Deploying PeTaL is as simple as setting up a neo4j server, running the pipeline backend to populate the database, and starting the website, potentially on a separate server.
Automating this with Docker would be desirable.

Layout

The following section describes the directories of the PeTaL repository.

site
- The Django code for running the actual PeTaL website, which uses the neo4j database created by the pipeline.
modules
- Modules for PeTaL's data pipeline:
- mining
  - Holds taxon catalogers, article scrapers, and image downloaders.
- search
  - Holds multiple indexer modules, which, unsurprisingly, create the search index used by the PeTaL website.
- taxon_classifier
  - An image classifier at the taxon level
- airfoils
  - Experimental machine learning models related to airfoils and wing design
- directory_2.0
  - Parsing code for the "Directory 2.0" project, mostly outdated.
- mock
  - Modules intended for use in integrated testing of the data pipeline.
data
- Various data for PeTaL's data pipeline, notably the search index and lexicon, images, and all machine learning modules.
pipeline
- Backend code for the data pipeline, which is not PeTaL-specific and has been separated into another GitHub repository. Some code is overly complicated, so don't hesitate to contact Lucas Saldyt if it breaks.
config
- Configurations for the data pipeline. Specifies a number of modules that should be run and settings related to running them, such as the maximum number of processes.
tests
- PeTaL's tests. Don't worry. Things will break in production no matter how many of these you write.
docs
- Documentation. If nothing gets added to this directory, you're doing it right.

Legacy Code

For the Flask version of PeTaL as it existed in 2019, see the /legacy/ directory. Since the HTML/CSS/JavaScript is similar to what is currently used, and some code is shared, it is kept close by as a potential reference for future developers to build upon.

Authors

Vik Shyam - Principal Investigator
Herb Schilling - Data Science Lead
Calvin Robinson - Data Science/management Lead
Paht Juangphanich - Technical Support
Brandon Ruffridge - Technical Lead
Colleen Unsworth - Workflow Lead

PeTaL 1.0 - R

Nicholas Bense - Backend development
Allie Calvin - UI development and data collection
Victoria Kravets - Backend development

PeTaL 2.0

Angeera Naser - Front end development and intern team lead Summer 2018
Allie Calvin - UI development and data collection
Bishoy Boktor - Backend development
Brian Whiteaker - Text Classification
Connor Baumler - Backend development
Courtney Schiebout - Backend development
Isaias Reyes - 3D visualizations, interactions & full redesign
Jonathan Dowdall - Backend development
Kaylee Gabus - Front end development
Kei Kojima - Front end development
Lauren Friend - Backend development
Manju Johny - Backend development

PeTaL 3.0

Lucas Saldyt - Backend development, machine learning 01-2020 - 04-2020
Olakunle Akinpelu - Backend development 01-2020 - 06-2020
Kei Kojima
Elliot Hill
Benjamin Huynh
