Code used to create PLAbDab available?

Question


Closed this issue 5 months ago · 4 comments

shatz01 commented 5 months ago

Hi, first off, thanks for the amazing database!

It seems like this repo only contains code to query the already-built database, right?

Is the code used to create it by querying NCBI via Entrez available anywhere?

Thanks!

shatz01 commented 5 months ago

?


Answer 1 · 2024-04-08T13:11:20.000Z

Hi Daniel,

Thank you for your kind words about PLAbDab.

You are right, this repo only contains the code to query the database.

The code used to create the database is not currently publicly available. However, to extract data from the NCBI database using Entrez you can use the following code:

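from Bio import Entrez

# NCBI asks for a contact email with E-utilities requests (placeholder; use your own)
Entrez.email = "your.name@example.com"

plabdab_ID = "AKW39254"

# Fetch the GenBank record as XML and parse it into a list of dict-like entries
with Entrez.efetch(db="protein", id=plabdab_ID, rettype="gb", retmode="xml") as handle:
    entries = Entrez.read(handle)

# Extract the amino-acid sequence of the first record and convert it to uppercase
sequence = entries[0]["GBSeq_sequence"].upper()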
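
If Biopython is not installed, the same record can also be requested from the NCBI E-utilities `efetch` endpoint directly. The following is only a minimal sketch using the Python standard library, not the code used to build PLAbDab. The helper names `build_efetch_url`, `extract_sequences` and `fetch_sequences` are hypothetical, and the parser assumes the GenBank XML layout (`GBSeq_sequence` elements) returned for `rettype="gb"` with `retmode="xml"`.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Public E-utilities endpoint for fetching records
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accessions, email=None):
    """Build an efetch URL for one or more protein accessions (hypothetical helper)."""
    params = {
        "db": "protein",
        "id": ",".join(accessions),  # several IDs can go into one request
        "rettype": "gb",
        "retmode": "xml",
    }
    if email:
        params["email"] = email  # contact address, as NCBI requests
    return EFETCH_URL + "?" + urllib.parse.urlencode(params)


def extract_sequences(xml_bytes):
    """Return the upper-cased sequences found in GenBank XML, in document order."""
    root = ET.fromstring(xml_bytes)
    return [el.text.upper() for el in root.iter("GBSeq_sequence") if el.text]


def fetch_sequences(accessions, email=None):
    """Download the records and return their sequences (needs network access)."""
    with urllib.request.urlopen(build_efetch_url(accessions, email)) as response:
        return extract_sequences(response.read())
```

For example, `fetch_sequences(["AKW39254"], email="your.name@example.com")` should return a one-element list with the same sequence as the Biopython snippet, assuming the record is still available at NCBI.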

I hope that helps!!

All the best,

Brennan

Answer 2 · 2024-04-08T13:14:57.000Z

Gotcha, thanks! A bit sad that the code to create the database is not public, but it's OK.

What is plabdab_ID? Does it literally correspond to the same PLAbDab database that we can query using this repo?

Answer 3 · 2024-04-08T13:26:32.000Z

Hi Daniel,

For entries scraped from the NCBI database, the PLAbDab ID corresponds to the LOCUS code used in NCBI (or a combination of two codes if the heavy and light chains come from different NCBI entries). For example, the entry with PLAbDab ID QTW11010 was scraped from here: https://www.ncbi.nlm.nih.gov/protein/QTW11010.
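
To illustrate the mapping, here is a small sketch that turns NCBI accessions into the corresponding protein page URLs, following the pattern of the link above. The function name `ncbi_protein_urls` is hypothetical, and it takes the accessions as a list because how a combined heavy/light ID should be split is not specified here.

```python
from urllib.parse import quote

# URL prefix taken from the example link above
NCBI_PROTEIN_URL = "https://www.ncbi.nlm.nih.gov/protein/"


def ncbi_protein_urls(accessions):
    """Return the NCBI protein page URL for each accession (hypothetical helper)."""
    return [NCBI_PROTEIN_URL + quote(acc.strip()) for acc in accessions]


# A paired entry would pass both accessions, e.g. ["<heavy>", "<light>"]
print(ncbi_protein_urls(["QTW11010"]))
```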

All the best,

Brennan