



Python wrapper for ENCODE (Encyclopedia of DNA Elements) API and (eventually) tools for working with the data.


Clone this repository and change into the repo directory. Run

python setup.py install

to install the package, or

python setup.py develop

if you want to work on the source code.


Starting with an ENCODE identifier

If you have an ENCODE identifier, such as the accession ENCBS000AAA, you can query the ENCODE API as follows:

import pyencodetools as pet
res = pet.fetch('ENCBS000AAA')

fetch can take as input different identifiers and query the ENCODE API. fetch returns an ENCODERecord object with attributes that correspond to the keys of the JSON response dict from ENCODE. You can also instantiate an object directly:

res = pet.ENCODERecord('56e94f2b-25ac-4c58-9828-f63b66220999')
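To make the attribute mapping concrete, here is a minimal sketch of how a record object could expose the keys of a JSON response dict as attributes. RecordSketch is a hypothetical stand-in written for illustration, not the actual ENCODERecord implementation, and the sample dict is made up rather than taken from a real ENCODE response.

```python
class RecordSketch:
    """Hypothetical stand-in for ENCODERecord; not the package's real code."""

    def __init__(self, response):
        # Copy each top-level key of the JSON response dict onto the object.
        for key, value in response.items():
            setattr(self, key, value)


# Illustrative dict only; a real ENCODE response has many more keys.
sample = {"accession": "ENCBS000AAA", "status": "released"}
rec = RecordSketch(sample)
print(rec.accession)
```

With an object like this, a key such as "accession" in the response is read as rec.accession instead of sample["accession"].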


You can search the ENCODE API as follows:

search_res = pet.search('bone chip')

search_res will be a list of ENCODERecord objects that match the search term. There is a default limit on the number of results returned, which can be changed:

search_res = pet.search('bone chip', limit=10)

Advanced Search

If you want to search using ENCODE's search parameters, begin your search term with '?'. For instance, the following query returns all DNase-seq experiments in human (up to a limit of 1,000 experiments):

search_res = pet.search('?type=experiment&assay_term_name=DNase-seq&replicates.library.biosample.donor.organism.scientific_name=Homo%20sapiens', limit=1000)

You can see how these types of queries are constructed by looking at the ENCODE API documentation or by going to one of the ENCODE data pages (such as the Assays page) and applying some of the filters on the left side of the page. As you apply the filters, the URL will update with the appropriate search term.
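Rather than writing the query string by hand, it can be assembled from a dict of filters with the standard library. This is a sketch only: build_search_term is a hypothetical helper that is not part of pyencodetools, and it simply reproduces the DNase-seq query shown above.

```python
from urllib.parse import quote, urlencode


def build_search_term(filters):
    """Hypothetical helper: turn a dict of ENCODE filters into a '?...' term."""
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    return "?" + urlencode(filters, quote_via=quote)


filters = {
    "type": "experiment",
    "assay_term_name": "DNase-seq",
    "replicates.library.biosample.donor.organism.scientific_name": "Homo sapiens",
}
term = build_search_term(filters)
print(term)
```

The resulting string starts with '?' and could then be passed to pet.search in the same way as the hand-written term, for example pet.search(term, limit=1000).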


If you want to work on the source code, start by initializing and updating the ENCODE submodule:

git submodule init
git submodule update


My overall goal for this package is to provide a nice Python interface for the ENCODE API. I have created a general class ENCODERecord that wraps results from GET requests to the API. My plan is to add parsers for various types of data returned from ENCODE to make the information more useful. For instance, rather than just setting a "culture_harvest_date" as a string, I can convert it to a datetime object that is more useful within Python.
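As a rough illustration of the kind of parser described above, the sketch below converts a date string into a datetime object. It assumes the value arrives in "YYYY-MM-DD" form, which is an assumption rather than something confirmed here; parse_harvest_date is a hypothetical name and the sample date is invented.

```python
from datetime import datetime


def parse_harvest_date(value):
    """Hypothetical parser; assumes the date string looks like 'YYYY-MM-DD'."""
    # strptime raises ValueError if the string does not match the format.
    return datetime.strptime(value, "%Y-%m-%d")


# Invented example value, not taken from a real ENCODE record.
parsed = parse_harvest_date("2012-03-15")
print(parsed.year, parsed.month, parsed.day)
```

Once the value is a datetime, it can be compared, sorted, and used in date arithmetic, which is not practical with the raw string.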