Behind the scenes of our API, we need to issue an HTTP request modelled after the following:
GET http://www.gabipd.org/services/rest/mapman/bin?request=[{"agi":"At4g25530"}]
We expect a JSON response resembling this one:
[
{"request":
{"agi":"At4g25530"},
"result":[
{"code":"27.3.22",
"name":"RNA.regulation of transcription.HB,Homeobox transcription factor family",
"description":"no description",
"parent":
{"code":"27.3",
"name":"RNA.regulation of transcription",
"description":"no description",
"parent":
{"code":"27",
"name":"RNA",
"description":"no description",
"parent":null}}}]}]
This JSON response, while not conforming to the emergent AIP data schema, is perfectly valid JSON. So, we will go ahead and return it to consumers. However, we will normalize the input parameters to use the AIP "locus" parameter name instead of accepting a JSON object. We will implement a "generic" type API to accomplish these objectives.
Open up workshop_tutorial_api/generic_demo/main.py
in your text editor. You will see that the stubbed in search()
function has been replaced with a lot more text, which has all been commented out. We will gradually remove comments, testing along the way, to demonstrate key concepts in building a generic Data API for Araport.
# arg contains a dict with a single key:value
# locus is AGI identifier and is mandatory
# Return nothing if client didn't pass in a locus parameter
# if not ('locus' in arg):
# return
# Validate against a regular expression
# locus = arg['locus']
# locus = locus.upper()
# p = re.compile('AT[1-5MC]G[0-9]{5,5}', re.IGNORECASE)
# if not p.search(locus):
# return
# param = '[{%22agi%22:%22' + locus + '%22}]'
# r = requests.get('http://www.gabipd.org/services/rest/mapman/bin?request=' + param)
# print r.headers['Content-Type']
# print r.text
# if r.ok:
# return r.headers['Content-Type'], r.content
# else:
# return 'text/plaintext; charset=UTF-8', 'An error occurred on the remote server'
Let's implement handling of the 'locus' parameter. The ADAMA service that powers our community APIs will eventually take on validation of parameters but for now, if you want robust web services you need to add in some error handling.
Uncomment the following code blocks:
# Return nothing if client didn't pass in a locus parameter
if not ('locus' in arg):
return
# Validate against a regular expression
locus = arg['locus']
locus = locus.upper()
p = re.compile('AT[1-5MC]G[0-9]{5,5}', re.IGNORECASE)
if not p.search(locus):
return
Now, let's go test it out in a Python environment. Remember that you should be in a virtual environment with the requests
module available because we imported it at the start of main.py
.
Python 2.7.6 (default, Sep 9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import main
>>> main.search({'locus':'AT4G25530'})
AT4G25530
>>> main.search()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: search() takes exactly 1 argument (0 given)
>>> main.search({'foo':'AT4G25530'})
>>> main.search({'locus':'bar'})
>>>
Notable observations:
- We invoke the main's search method with a python dict. This emulates the way the ADAMA master server sends parameters to your code
- Our code validation first checks for the existence of the locus key and returns if not present
- The code validation next checks for a valid locus name via regex and returns if not found
- If we fail to pass in a dict to
search()
, there is also a failure response
We're ready to wire up an HTTP GET request to the MapMan web service. A simple form of the Python requests
module is demonstrated here - if you're not familiar with it you may want to check out its documentation in detail.
Uncomment the following code blocks:
# Construct an access URL for the MapMan API
param = '[{%22agi%22:%22' + locus + '%22}]'
r = requests.get('http://www.gabipd.org/services/rest/mapman/bin?request=' + param)
# Debug: Print the headers and text response
print r.headers['Content-Type']
print r.text
Test out the code in your Python environment. If you have the previous Python interpreter still active, type reload(main)
instead of import main below.
>>> import main
>>> main.search({'locus':'AT4G25530'})
application/json
[{"request":{"agi":"AT4G25530"},"result":[{"code":"27.3.22","name":"RNA.regulation of transcription.HB,Homeobox transcription factor family","description":"no description","parent":{"code":"27.3","name":"RNA.regulation of transcription","description":"no description","parent":{"code":"27","name":"RNA","description":"no description","parent":null}}}]}]
We're now ready to return the response to the client. In a generic Araport API, we accomplish this by returning two obejcts: a valid content-type and the content of the remote server's HTTP response. You can also generate your own data to transmit to the client (for instance, GFF records or comma-separated value rows).
Re-comment and uncomment the following code blocks
#print r.headers['Content-Type']
#print r.text
if r.ok:
return r.headers['Content-Type'], r.content
else:
return 'text/plaintext; charset=UTF-8', 'An error occurred on the remote server'
Technically, for this use case we could just return the header provided by the remote service and its content, but we are being clever and checking for 200 OK. We return a plaintext error if we don't see see a server code of 200 in the response from the MapMan remote API.
We can test out the service locally in your Python interpreter now (an exercise left to the reader). When we are sure it's working, it's time to register the service under the Araport /community
API.
Every Araport service is described using a metadata file written in YAML format. Our service's metadata file is pasted in below:
---
description: "Returns MapMan bin information for a given AGI locus identifier using the generic type of Araport web service"
main_module: generic_demo/main.py
name: generic_mapman_bin_by_locus
type: generic
url: "http://www.gabipd.org/"
version: 0.1
whitelist:
- www.gabipd.org
- name is a lexically unique, lowercase-only identifier for the service
- version is the version of the Araport service
- description is a human-readable description
- main_module defines where, relative to the root of the Git repo, we find the main file for the service
- type is one of the supported Araport service types
- url is optional in this case, but points to the top level URL of the service
- whitelist is a set of addresses that our service can communicate with. If you forget to specify this, your service will fail to work
If you want to save your current work, please do the following in your local workshop_tutorial_api
directory
git add .
git commit -m "In progress!"
Checkout the next branch to migrate to a fully functional code repo that we can be sure will work as a web services. Follow along with the instructions in the README.md file under that branch.
git checkout "tutorial/2"