
Access to Biological Web Services from Python.


BIOSERVICES: access to biological web services programmatically

Python version available: BioServices is tested with Python 3.7, 3.8, and 3.9.
Contributions: Please join https://github.com/cokelaer/bioservices
Issues: Please use https://github.com/cokelaer/bioservices/issues
How to cite: Cokelaer et al. BioServices: a common Python package to access biological Web Services programmatically. Bioinformatics (2013) 29 (24): 3241-3242.
Documentation: RTD documentation.

BioServices is a Python package that provides access to many bioinformatics Web Services (e.g., UniProt) and a framework for easily implementing Web Service wrappers (based on WSDL/SOAP or REST protocols).

The primary goal of BioServices is to use Python as a glue language that provides programmatic access to several bioinformatics Web Services. This makes it easier to build new applications that combine several of the wrapped Web Services.

One of the guiding principles of BioServices is to make use of existing biological databases (rather than re-inventing new ones) and to reduce the Web Services expertise required of developers and users.

BioServices provides access to about 40 Web Services.

Contributors

Maintaining BioServices would not have been possible without users and contributors. Each contribution has been an encouragement to pursue this project. Thanks to all:

https://contrib.rocks/image?repo=cokelaer/bioservices

Quick example

Here is a small example using the UniProt Web Service to search for ZAP70 entries in human (taxonomy ID 9606):

>>> from bioservices import UniProt
>>> u = UniProt(verbose=False)
>>> data = u.search("zap70+and+taxonomy_id:9606", frmt="tsv", limit=3,
...                 columns="id,length,accession, gene_names")
>>> print(data)
Entry name   Length  Entry   Gene names
ZAP70_HUMAN  619     P43403  ZAP70 SRK
B4E0E2_HUMAN 185     B4E0E2
RHOH_HUMAN   191     Q15669  RHOH ARHH TTF

Note

A major update of the UniProt API in June 2022 changed all column names. The u.search call shown in the quick example is valid for bioservices versions >1.10. Earlier versions used:

>>> data = u.search("zap70+and+taxonomy:9606", frmt="tab", limit=3,
...                 columns="entry name,length,id, genes")

Note that the column names have changed, the frmt value changed from tab to tsv, and taxonomy is now taxonomy_id. The correspondence between old and new names can be found in:

u._legacy_names
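
The three changes listed in this note can be applied mechanically to an old call. The sketch below is only an illustration: the mapping is a small hand-written subset inferred from the two u.search calls shown here, and the helper names (LEGACY_TO_CURRENT, translate_columns, translate_search_args) are hypothetical, not part of bioservices. Whether u._legacy_names has the same dictionary shape is an assumption to check against your installed version.

```python
# Legacy column name -> current column name.
# Hypothetical subset, inferred only from the two u.search calls in this README.
LEGACY_TO_CURRENT = {
    "entry name": "id",
    "length": "length",
    "id": "accession",
    "genes": "gene_names",
}


def translate_columns(columns, mapping=LEGACY_TO_CURRENT):
    """Translate a comma-separated string of legacy column names.

    Names missing from the mapping are passed through unchanged.
    """
    names = [name.strip() for name in columns.split(",")]
    return ",".join(mapping.get(name, name) for name in names)


def translate_search_args(query, frmt, columns, mapping=LEGACY_TO_CURRENT):
    """Return (query, frmt, columns) rewritten for the post-June-2022 API."""
    new_query = query.replace("taxonomy:", "taxonomy_id:")
    new_frmt = "tsv" if frmt == "tab" else frmt
    return new_query, new_frmt, translate_columns(columns, mapping)


print(translate_search_args(
    "zap70+and+taxonomy:9606", "tab", "entry name,length,id, genes"
))
```

The returned tuple can then be passed to u.search as the query, frmt, and columns arguments.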

More examples and tutorials are available in the online documentation.
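
With frmt="tsv", the search in the quick example returns a single tab-separated string. As a minimal sketch, that string can be turned into a list of dictionaries with the standard csv module. The parse_tsv helper is hypothetical (not part of bioservices), and the sample text simply mirrors the output shown in the quick example; the real header names returned by UniProt may differ.

```python
import csv
import io


def parse_tsv(text):
    """Parse tab-separated text (header row first) into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Sample shaped like the quick-example output, with real tab separators.
sample = (
    "Entry name\tLength\tEntry\tGene names\n"
    "ZAP70_HUMAN\t619\tP43403\tZAP70 SRK\n"
    "B4E0E2_HUMAN\t185\tB4E0E2\t\n"
    "RHOH_HUMAN\t191\tQ15669\tRHOH ARHH TTF\n"
)

rows = parse_tsv(sample)
for row in rows:
    # Every value is a string; convert numeric columns explicitly.
    print(row["Entry"], int(row["Length"]))
```

In real use, replace sample with the data string returned by u.search.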

Current services

Here is the list of services available and their testing status.

Service CI testing
arrayexpress
bigg
biocontainers
biodbnet
biogrid
biomart
biomodels
chebi
chembl
cog
dbfetch
ena
ensembl
eutils
eva
hgnc
intact_complex
kegg
muscle
mygeneinfo
ncbiblast
omicsdi
omnipath
panther
pathwaycommons
pdb
pdbe
pfam
pride
psicquic
pubchem
quickgo
reactome
rhea
rnaseq_ebi
seqret
unichem
uniprot
wikipathway

Note

Contributions implementing new wrappers are more than welcome. See the BioServices GitHub page to join the development, and the Developer guide for how to implement new wrappers.

Bioservices command

Version 1.8.2 introduced a bioservices command. For now it has a single subcommand, which downloads an NCBI accession and, optionally, its GenBank or GFF file (if available):

bioservices download-accession --accession K01711.1 --with-gbk
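
To fetch several accessions in one go, the command can be driven from Python. This is a sketch under stated assumptions: it uses only the subcommand and the two flags shown above (--accession and --with-gbk), the helper names are hypothetical, and it assumes the bioservices executable is on your PATH.

```python
import subprocess


def build_download_command(accession, with_gbk=False):
    """Build the argument list for `bioservices download-accession`."""
    cmd = ["bioservices", "download-accession", "--accession", accession]
    if with_gbk:
        cmd.append("--with-gbk")
    return cmd


def download_accessions(accessions, with_gbk=False):
    """Run the command once per accession; raises if any call fails."""
    for accession in accessions:
        subprocess.run(build_download_command(accession, with_gbk), check=True)


# Only builds and prints the command; nothing is downloaded here.
print(" ".join(build_download_command("K01711.1", with_gbk=True)))
```

Calling download_accessions(["K01711.1"], with_gbk=True) would run the same command shown above.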

Changelog

Version Description
1.10.3
  • Update pdb service to use v2 API
  • remove biocarta (website not accessible anymore)
1.10.2
  • Fix #226 and apply the PR from @GianArauz (cokelaer#232) about a UniProt error
  • Update MANIFEST to fix #232
1.10.1
  • allow command line to download genbank and GFF
  • update pride module to use new PRIDE API (July 2022)
  • Fixed KEGG bug #225
1.10.0
  • Update uniprot to use the new API (June 2022)
1.9.0
  • Update unichem to reflect new API
1.8.4
  • biomodels. Fix #208
  • KEGG: fixed #204 #202 and #203
1.8.3
  • Eutils: remove warning due to unreachable URL. Set REST as an attribute rather than via inheritance.
  • NEW biocontainers module
  • KEGG: add save_pathway method. Fix parsing of structure/pdb entry
  • remove deprecated function from Reactome
1.8.2
  • Fix suds package in code and requirements
1.8.1
  • Integrated a change made in KEGG service (DEFINITION was changed to ORG_CODE)
  • for developers: applied black on all modules
  • switch suds-jurko to new suds community
1.8.0
  • add main standalone application.
  • moved chemspider and clinvitae to the attic
  • removed picr service, not active anymore
1.4.X
  • NEW RNAseq from EBI in rnaseq_ebi module
  • Replaced deprecated HGNC with the official web service from genenames.org
  • Fully updated EUtils since WSDL is now down; implementation uses REST now.
  • Removed the apps/taxonomy module now part of http://github.com/biokit.
1.3.X
  • CACHE files are now stored in a general directory in the home
  • New REST class to use requests package instead of urllib2.
  • Creation of a global configuration file in .config/bioservice/bioservices.cfg
  • NEW services: Reactome, Readseq, Ensembl, EUtils
1.2.X
  • NEW services: BioDBNet, MUSCLE, PathwayCommons, GeneProf
1.1.X
  • NEW services: biocarta, pfam, ChEBI, UniChem
1.0.0
  • first stable release
0.9.X
  • NEW services: BioModels, Kegg, Reactome, Chembl, PICR, QuickGO, Rhea, UniProt, WSDbfetch, NCBIblast, PSICQUIC, Wikipath