jdx-api

API to convert job descriptions to JobSchema+ format.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

This application should be run alongside competensor. The frontend for walking through the workflow is reference-app-ui.

How to use

  1. Install Docker by following the official instructions, https://docs.docker.com/install/

  2. Download this repo by using the following commands,

$ git clone https://github.com/brighthive/jdx-api.git
$ cd jdx-api
  3. To start the server, use the following command,
docker-compose up
  4. At this point the server should spin up and will eventually report the address where you can access it, typically http://0.0.0.0:8000. You should also be able to access it at http://localhost:8000.
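
Because the server can take a while to come up, it may help to poll the address rather than refresh by hand. The following is a minimal sketch, not part of this repo: the helper names `http_probe` and `wait_for_server` are hypothetical, and it assumes the server is reachable at http://localhost:8000 as described above.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status; it is up.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: not up yet.
        return False


def wait_for_server(url, attempts=30, delay=2.0, probe=http_probe, sleep=time.sleep):
    """Poll `url` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_server("http://localhost:8000")` returns True as soon as the server responds and False if it never does within the allotted attempts. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.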

Setting up for Development

Mac specific

(If you are on a Mac and have trouble installing PostgreSQL-related libraries such as psycopg2 or pgcli, try the following.)

Step 1.

brew install openssl

Step 2.

export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/opt/openssl/lib/

General instructions

As a quick overview, to set up a development environment you will need to clone the repository and set up a virtual environment with Pipenv.

The following instructions are for Ubuntu.

  1. Ensure you have Python and PostgreSQL development libraries installed,
sudo apt-get install libpq-dev python-dev
  2. Install pipenv by following its installation instructions.

  3. Clone the repo,

$ git clone https://github.com/brighthive/jdx-api.git
$ cd jdx-api
  4. Install the system dependencies for textract. (The Pipfile contains a fork of textract that works for this repo.)

4a. Install the prereqs (https://textract.readthedocs.io/en/latest/installation.html)

apt-get install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr \
flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev

4b. Install a fake dependency for textract. (Why do I need to do this? deanmalmgren/textract#178). Run the following command,

pipenv shell
curl https://raw.githubusercontent.com/OriHoch/textract/fake-pocketsphinx-for-swig-dependency/provision/fake-pocketsphinx.sh | bash -
exit
  5. Install the Python production and development dependencies,
$ pipenv install --dev
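
If a later step fails, one quick check is whether the command-line tools that textract relies on are actually on your PATH. The sketch below is illustrative only: `TEXTRACT_TOOLS` and `missing_tools` are hypothetical names, and the list assumes the usual executable names shipped by the apt packages in step 4a (for example, `pdftotext` from poppler-utils and `tesseract` from tesseract-ocr).

```python
import shutil

# Executables assumed to be provided by the apt packages listed in step 4a.
# Library-only packages (libxml2-dev, libjpeg-dev, ...) cannot be checked this way.
TEXTRACT_TOOLS = [
    "antiword", "unrtf", "pdftotext", "pstotext", "tesseract",
    "flac", "ffmpeg", "lame", "sox", "swig",
]


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

`missing_tools(TEXTRACT_TOOLS)` returns an empty list when every tool is found; any names it does return point to packages worth reinstalling.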

To run the tests, see the Running tests section.

To run the application, see the How to use section.

Have fun developing!

Commonly used development commands

Please see /scripts/jdx-cli.sh for commonly used commands.

Running tests

To run the test suite, you must first stand up a database,

$ docker-compose -f docker-compose-test.yml down && sudo docker-compose -f docker-compose-test.yml build && docker-compose -f docker-compose-test.yml up

Then run the tests with,

$ pipenv run pocha tests
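
The database command above chains three docker-compose steps with `&&`. As a rough sketch of the same sequence driven from Python (the function names `compose_commands` and `run_in_order` are hypothetical, and the `sudo` on the build step is left out, so add it back if your Docker setup needs it):

```python
import subprocess

COMPOSE_FILE = "docker-compose-test.yml"


def compose_commands(compose_file=COMPOSE_FILE):
    """Return the down / build / up commands as argv lists."""
    base = ["docker-compose", "-f", compose_file]
    return [base + ["down"], base + ["build"], base + ["up"]]


def run_in_order(commands, run=subprocess.run):
    """Run each command in turn, stopping at the first failure (like `&&`)."""
    for argv in commands:
        # check=True raises CalledProcessError on a non-zero exit status.
        run(argv, check=True)
```

Note that `up` runs in the foreground, so `run_in_order(compose_commands())` blocks while the database is running; the tests themselves would then be started from a second terminal.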

API Reference

The Swagger / OpenAPI specification is located at https://app.swaggerhub.com/apis/loganripplinger/JDX-reference-backend-application-real

Output files

As users go through the workflow, the JDX API produces a number of files.

These files, along with API logs, are saved within the /logs folder and its subfolders.
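
To see what has been written so far, a short script can walk the logs folder recursively. This is only a sketch: `list_output_files` is a hypothetical helper, and it assumes you pass it the path to the logs folder as it exists on your machine (for example, `logs` relative to the repository root, which is an assumption).

```python
from pathlib import Path


def list_output_files(log_root):
    """Return every regular file under `log_root`, as sorted relative paths."""
    root = Path(log_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
```

For example, `list_output_files("logs")` returns the relative paths of all files in that folder and its subfolders, skipping the directories themselves.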

To enable these files to be pushed to an S3 bucket, edit the bucket name, access key, and secret key within jdxapi/s3_config.py.

Related repositories