harvdev_utils Python package

Common Python functions and classes used by FlyBase developers at Harvard.

Installation

pip install -e git+https://github.com/FlyBase/harvdev-utils.git@master#egg=harvdev_utils

... and don't forget to install the requirements for this module before use.

pip install -r requirements.txt

Documentation

Detailed information for some functions can be found in the Read the Docs documentation. This documentation does not include information regarding SQLAlchemy classes and functions (see below).

SQLAlchemy Classes

harvdev_utils contains two sets of SQLAlchemy classes for use with FlyBase Harvard's production and reporting databases. The class names correspond to tables within the Chado database and serve as an integral part of writing SQLAlchemy code.
To use these classes, include the appropriate imports at the top of your Python module:
- When using production or reporting individually (the classes share overlapping names, so only use this approach if production / reporting queries are not written together in the same module):
  - from harvdev_utils.production import *
  - from harvdev_utils.reporting import *
- When using both production and reporting within the same module:
  - from harvdev_utils import production as prod
  - from harvdev_utils import reporting as rep
  - Code can then be written by prefixing the classes as appropriate when calling them, e.g. prod.Feature, rep.Feature, rep.Pub, prod.Cvterm, etc.

SQLAlchemy Functions

harvdev_utils contains a set of commonly used Chado-SQLAlchemy functions:
- harvdev_utils.chado_functions.get_or_create
  - This function allows for values to be inserted into a specific Chado table. If the values already exist in the table, nothing is inserted. If the table uses rank, the rank value is automatically incremented and the values are always inserted.
  - Example import: from harvdev_utils.chado_functions import get_or_create
  - The function as defined in the module: def get_or_create(session, model, **kwargs)
  - Required fields:
    - session: Your SQLAlchemy session.
    - model: The model (aka table) where you'd like to insert data.
    - **kwargs: Column names and values, passed as keyword arguments, used to look up the appropriate row of the table and, if no such row exists, to insert it. Please see the example below.
  - The function always returns two values: the first is a SQLAlchemy object (equivalent to a row in a table) and the second is True (if a new entry was created) or False (if an existing entry was retrieved).
  - Column values in the returned SQLAlchemy object can be accessed as attributes, e.g. uniquename = my_returned_object.uniquename
  - The logging level is set to INFO by default and can be changed to DEBUG by using the following line in your script where appropriate:
    - logging.getLogger('harvdev_utils.chado_functions.get_or_create').setLevel(logging.DEBUG)
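
As a rough sketch of how a call might look, the snippet below looks up (or creates) a row by name and turns on DEBUG logging for the function. It assumes that harvdev_utils.production provides a Cv class for Chado's cv table with a name column, and that session is a SQLAlchemy session you have already opened against the production database. find_or_add_cv is a hypothetical helper written for this example, not part of harvdev_utils; the imports sit inside it only so that the snippet loads on a machine where the package is not installed.

```python
import logging

# Logger name as given above; configuring it does not require harvdev_utils.
logger = logging.getLogger('harvdev_utils.chado_functions.get_or_create')
logger.setLevel(logging.DEBUG)


def find_or_add_cv(session, cv_name):
    """Return (row, created) for the cv named cv_name.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this file can be loaded without the package present.
    from harvdev_utils.chado_functions import get_or_create
    from harvdev_utils.production import Cv  # assumed class for the cv table

    # Keyword arguments name the columns used for the lookup / insert.
    cv, created = get_or_create(session, Cv, name=cv_name)
    if created:
        logger.debug('Created new cv row: %s', cv.name)
    else:
        logger.debug('Found existing cv row: %s', cv.name)
    return cv, created
```

A caller would unpack the result in the same way, e.g. cv, created = find_or_add_cv(session, 'my_cv_name'), and then read column values such as cv.name from the returned object.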

General Development

The dev_readme.md file contains instructions for regenerating SQLAlchemy classes.
Please use PEP8 whenever possible.
Docstrings should follow Google's style guides (Sphinx guide, additional example 1, additional example 2) and are used to generate Read the Docs documentation.
Tests should be written for each non-trivial function. Please see the tests folder for examples. We're using pytest via Travis CI for testing. Tests can be run locally with the command python -m pytest -v from the root directory of the repository (-v flag is optional).
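
As an illustration of the docstring and testing conventions above, the sketch below pairs a Google-style docstring with a pytest-style test. collapse_whitespace is a hypothetical function invented for this example and is not part of harvdev_utils; in the repository the function and its test would normally live in separate files.

```python
def collapse_whitespace(text):
    """Collapse runs of whitespace into single spaces.

    Hypothetical example function; not part of harvdev_utils.

    Args:
        text (str): The string to clean up.

    Returns:
        str: The input with leading and trailing whitespace removed and
        internal runs of whitespace replaced by one space.
    """
    # str.split() with no arguments splits on any run of whitespace.
    return ' '.join(text.split())


def test_collapse_whitespace():
    """Check typical and empty inputs; pytest collects test_* functions."""
    assert collapse_whitespace('  wild \t type\n') == 'wild type'
    assert collapse_whitespace('') == ''
```

With the test saved in a file whose name starts with test_, running python -m pytest -v from the root directory of the repository should pick it up.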

Git Branching

Please branch from develop and merge back via pull request.
Merges from develop into master should coincide with a new release tag on master and a version increment.

Writing Documentation

The file docs/index.rst should be updated after a new module is added. The automodule command will automatically pull in information for specified modules once the code is pushed to GitHub. Please see the automodule documentation for help.

Example Development Workflow

Clone the repository and branch off develop.
Navigate to the directory harvdev_utils and use an existing folder (e.g. char_conversions) or create a new folder based on the goal of your module.
Create a single Python file containing the function to be used. Feel free to add multiple functions to a single Python file if you feel it's appropriate.
Be sure to add an entry to the __init__.py file in the folder where you're working.
- e.g. from .unicode_to_plain_text import unicode_to_plain_text
Update the file __init__.py in harvdev_utils and add your function to the list of default loaded functions. If the folder you are using is not already imported at the top of the file, be sure to import it.
- e.g. from .char_conversions import *
Navigate to the tests folder and create a new sub-folder if one does not already exist for the folder you're using (e.g. if you're using char_conversions, the sub-folder already exists).
Create your test Python file with the prefix test_.
- e.g. test_sgml_to_plain_text.py
Tests can be run locally with python -m pytest from the root directory of the repository.
Edit the file docs/index.rst and be sure the folder that you're using is listed as an automodule.
- e.g. the directive .. automodule:: harvdev_utils, with the option :members: indented on the line below it.
Additional text can be added to docs/index.rst as necessary. We can restructure this file if it becomes too long / complex.
Push your branch to GitHub and open a PR to develop when ready.
A subsequent merge to master and tagged release can be coordinated with other devs when appropriate.