/code-gov-api

API powering the code.gov source code harvester

Code.gov API - Unlocking the potential of the Federal Government’s software

Our backend API. This project is an Express.js application backed by Elasticsearch. Its primary function is to index and make America's source code discoverable and searchable.

Introduction

What is Code.gov?

Code.gov is a website promoting good practices in code development, collaboration, and reuse across the U.S. Federal Government. Code.gov will provide tools and guidance to help agencies implement the Federal Source Code Policy. It will include an inventory of the government's custom code to promote reuse between agencies. Code.gov will also provide tools to help government and the public collaborate on open source projects.

Looking for more general information about Code.gov and all of its projects? We have a repo for that! code-gov is the main place to find out more general information about Code.gov as a platform and program.

If you have any general feedback or do not know where to file a particular issue, please feel free to use code-gov to create new issues.

Installation

Please install the following dependencies before running this project:

Once Node.js is installed, install the local npm dependencies:

cd code-gov-api && npm install

Running

Environment Variables

Before running any of the commands included in the package.json file there are some environment variables that need to be set:

  • NODE_ENV: The node environment the project is running under. Valid environments are:

    • production or prod
    • staging or stag
    • development or dev
  • LOGGER_LEVEL: The output level of all the logs produced by the application. This extends to the Elasticsearch library. Defaults to info.

    We use Bunyan for our logging. More info on logger levels can be found at https://github.com/trentm/node-bunyan#levels

  • NEW_RELIC_KEY (optional): Your New Relic key. You will need a New Relic account to get one. For more information, visit the New Relic docs.

  • USE_HSTS: Enables HTTP Strict Transport Security. The default depends on the value of NODE_ENV: true in production, false otherwise.

  • HSTS_MAX_AGE: The max-age directive that HSTS requires. For more information on what it is used for, see https://tools.ietf.org/html/rfc6797#section-6.1.1. Defaults to 31536000 seconds (one year), the unit the RFC defines for max-age.

  • HSTS_PRELOAD: Whether or not to use the HSTS pre-loaded lists. Defaults to false. More information on HSTS pre-loaded lists can be found at https://tools.ietf.org/html/rfc6797#section-12.3.

  • PORT: Port to be used by the API. Defaults to 3000.

  • ES_HOST: URL for the Elasticsearch host to be used by the API and harvesting process. This URL should also contain the user and password needed to use the Elasticsearch service. Defaults to http://elastic:changeme@localhost:9200

    Elasticsearch has a built-in REST API with its own internal security features. The user elastic with the password changeme is the default superuser. Do not use these default credentials in a production environment.

    For more information about how to configure the authentication for Elasticsearch click here.

  • GITHUB_AUTH_TYPE: The type of authentication mechanism to use with the GitHub API. Defaults to token.

    There are several ways to authenticate with the GitHub API. The most common ones are:

    • basic authentication
    • OAuth2 token based authentication
    • OAuth2 key/secret based authentication

    For more information please click here

  • GITHUB_TOKEN: The token to use for GitHub API access. This variable has no default value and must be provided by you. You can obtain a token in your GitHub profile settings. For more info please click here.
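
As a rough illustration of how these variables and their documented defaults fit together, the sketch below resolves them into a single object. loadConfig and the property names are hypothetical and are not taken from this project's source. The fallback to development when NODE_ENV is unset is also an assumption, since no default is documented for it.

```javascript
// Hypothetical helper: resolves the documented environment variables with
// their documented defaults. This is NOT the project's actual config code.
function loadConfig(env = process.env) {
  // Map the documented short aliases onto one canonical name.
  const aliases = { prod: 'production', stag: 'staging', dev: 'development' };
  // Assumption: fall back to development when NODE_ENV is not set.
  const rawEnv = env.NODE_ENV || 'development';
  const nodeEnv = aliases[rawEnv] || rawEnv;
  const isProd = nodeEnv === 'production';

  return {
    nodeEnv,
    loggerLevel: env.LOGGER_LEVEL || 'info',
    // USE_HSTS follows NODE_ENV unless it is set explicitly.
    useHsts: env.USE_HSTS !== undefined ? env.USE_HSTS === 'true' : isProd,
    hstsMaxAge: Number(env.HSTS_MAX_AGE || 31536000),
    hstsPreload: env.HSTS_PRELOAD === 'true',
    port: Number(env.PORT || 3000),
    esHost: env.ES_HOST || 'http://elastic:changeme@localhost:9200',
    githubAuthType: env.GITHUB_AUTH_TYPE || 'token',
    githubToken: env.GITHUB_TOKEN, // no default; must be supplied
    newRelicKey: env.NEW_RELIC_KEY, // optional
  };
}
```

Calling loadConfig({ NODE_ENV: 'prod' }) in this sketch yields nodeEnv 'production' with useHsts enabled, while an empty environment leaves HSTS off and the port at 3000.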

Data Harvesting

This project uses Elasticsearch to store code repository metadata. As such, you need to run an indexing process that populates the necessary indexes in Elasticsearch.

Make sure that Elasticsearch is running and is accessible.

To install Elasticsearch on your machine please follow the instructions found here.

We have found that using Elasticsearch within a Docker container is one of the simplest ways to get up and running. We have included a Docker compose file in this project that can help you get on your way.

Please take a look at the Getting Started and Set up Elasticsearch sections in the Elastic documentation.
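
One way to check from Node.js that Elasticsearch is reachable is sketched below. It assumes Node.js 18 or later (for the global fetch) and an ES_HOST value in the format described above. buildEsRequest and isElasticsearchUp are hypothetical helper names, not part of this project. The credentials are moved out of the URL and into an Authorization header because fetch rejects URLs that embed them.

```javascript
// Hypothetical helper: splits an ES_HOST value into a credential-free URL
// and a Basic auth header.
function buildEsRequest(esHost) {
  const url = new URL(esHost);
  const headers = {};
  if (url.username) {
    const user = decodeURIComponent(url.username);
    const pass = decodeURIComponent(url.password);
    headers.Authorization =
      'Basic ' + Buffer.from(`${user}:${pass}`).toString('base64');
    // Strip the credentials so the URL is acceptable to fetch().
    url.username = '';
    url.password = '';
  }
  return { url: url.toString(), headers };
}

// Hypothetical helper: resolves to true when the Elasticsearch root endpoint
// answers with a 2xx status, and to false on any error or other status.
async function isElasticsearchUp(
  esHost = process.env.ES_HOST || 'http://elastic:changeme@localhost:9200'
) {
  const { url, headers } = buildEsRequest(esHost);
  try {
    const res = await fetch(url, { headers });
    return res.ok;
  } catch (err) {
    return false; // connection refused, DNS failure, and so on
  }
}

// Usage (performs a network request, so it is left commented out):
// isElasticsearchUp().then((up) => console.log(up ? 'up' : 'not reachable'));
```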

Once you have verified that Elasticsearch is up, execute:

npm run index

This will start the harvesting and indexing process. Once this process has finished, all data should be available to the API.

Starting the API

After the indexing process runs, you can fire up the server by running:

npm start

The API should now be accessible via the browser (or curl) at http://localhost:3000/api/.

curl http://localhost:3000/api/
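
The same check can be scripted from Node.js. This sketch assumes Node.js 18 or later for the global fetch. apiBaseUrl and checkApi are hypothetical names, and the response body is deliberately not inspected because its shape is not described here.

```javascript
// Hypothetical helper: builds the local API URL from PORT, falling back to
// the documented default of 3000.
function apiBaseUrl(env = process.env) {
  const port = env.PORT || 3000;
  return `http://localhost:${port}/api/`;
}

// Hypothetical helper: requests the API root and reports the HTTP status.
async function checkApi(url = apiBaseUrl()) {
  const res = await fetch(url);
  return { status: res.status, ok: res.ok };
}

// Usage (performs a network request, so it is left commented out):
// checkApi().then(console.log).catch((err) => console.error(err.message));
```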

Docker

For more detailed documentation on Docker and its components please visit their documentation site.

Build

To run a container, you first have to build an image. To do so, execute:

docker build -t <name_and_tag_for_your_image> .

For us, Code.gov, the command would be:

docker build -t codegov/code-gov-api .

To verify that the image was created, execute:

docker images

Look for the name_and_tag_for_your_image that you used to build the image.

Run a container

To create and run a container execute:

docker run -p 3000:3000 codegov/code-gov-api

If you want the container to run in the background (detached) pass the -d flag to the docker run command.

For example:

docker run -d -p 3000:3000 codegov/code-gov-api

To mount the project's source directory as a volume in the container, execute docker run -d -p 3000:3000 -v <path_to_project>:/usr/src/app codegov/code-gov-api

For example:

docker run -d -p 3000:3000 -v /home/user/code-gov-api:/usr/src/app codegov/code-gov-api

For more information on how to use Docker volumes take a look at:

Container Env

The code-gov-api container accepts a number of environment variables. They are the same variables described in the Environment Variables section above.

For example:
docker run -p 3000:3000 \
  -e NODE_ENV=dev \
  -e ES_HOST="http://elastic:changeme@localhost:9200" \
  codegov/code-gov-api
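
If you launch the container from a Node.js script instead of the shell, the same flags can be assembled programmatically. The sketch below is one possible approach. dockerRunArgs is a hypothetical helper, not part of this project, and it only covers the -d, -p, and -e flags shown in this section.

```javascript
// Hypothetical helper: builds the argument list for `docker run` from an
// image name, a port, and a map of environment variables.
function dockerRunArgs({ image, port = 3000, env = {}, detached = false }) {
  const args = ['run'];
  if (detached) args.push('-d');
  args.push('-p', `${port}:${port}`);
  // Each environment variable becomes its own -e NAME=value pair.
  for (const [name, value] of Object.entries(env)) {
    args.push('-e', `${name}=${value}`);
  }
  args.push(image);
  return args;
}

const exampleArgs = dockerRunArgs({
  image: 'codegov/code-gov-api',
  env: { NODE_ENV: 'dev', ES_HOST: 'http://elastic:changeme@localhost:9200' },
});

// To start the container (left commented out because it requires Docker):
// require('child_process').spawn('docker', exampleArgs, { stdio: 'inherit' });
```

Passing the arguments as an array rather than a single command string avoids shell quoting issues with values such as the ES_HOST URL.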

Docker compose

Docker Compose lets you recreate a complete environment for the code.gov API. The docker-compose.yml file defines how these services are stood up and how they relate to each other, and it handles other low-level details. For more detailed information on Docker Compose take a look at https://docs.docker.com/compose/.

To stand up a Code.gov API environment execute from the root of the project:

docker compose up

This command will build a new code-gov-api image, download an Elasticsearch image, and will run all containers in the correct order. You will see the output of each container in your terminal.

Once everything is up and running you can access the API in your browser at: http://localhost:3001/api. If you only want to build the code-gov-api image you can execute docker-compose build.

Contributing

Here’s how you can help contribute to code.gov API:

Questions

If you have questions, please feel free to open an issue here or send us an email at code@gsa.gov.

Public domain

As stated in our contributing document:

This project is in the worldwide public domain (in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication).

All contributions to this project will be released under the CC0-1.0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.