IEDB/arborist

Missing species in the old protein tree test build

Closed this issue · 3 comments

dmx2 commented

Examples:

Gallus gallus
Fagopyrum tataricum
Hepatitis C virus
Alnus glutinosa

This is likely related to how the molecule-tree.owl file is built from the protein_tree_old table in nanobot.db.

I don't think the conversion from "protein_tree_old" to molecule-tree.owl is the problem. On arborist-dev.lji.org in /mnt/data/arborist/build/arborist/nanobot.db for "protein_tree_old", I see a parent node:

sqlite3 build/arborist/nanobot.db "SELECT * FROM protein_tree_old WHERE subject = 'iedb-protein:9031'"

but I don't see any child nodes, which should be returned by:

sqlite3 build/arborist/nanobot.db "SELECT * FROM protein_tree_old WHERE object = 'iedb-protein:9031'"

So I think the necessary rows are just missing from "protein_tree_old".
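
The parent/child check above can be sketched as a small script, which makes it easier to run the same test over several nodes. This is only a sketch under assumptions: the toy in-memory table has just the `subject` and `object` columns used in the queries above (the real `protein_tree_old` table may have more), and the rows, including `iedb-protein:root` and `iedb-protein:9606`, are made up for illustration. The function names `has_node` and `count_children` are hypothetical helpers, not part of arborist.

```python
import sqlite3


def has_node(conn, node):
    """True if at least one row has `node` as its subject."""
    row = conn.execute(
        "SELECT COUNT(*) FROM protein_tree_old WHERE subject = ?", (node,)
    ).fetchone()
    return row[0] > 0


def count_children(conn, node):
    """Count rows whose object is `node`, i.e. rows pointing at it as parent."""
    row = conn.execute(
        "SELECT COUNT(*) FROM protein_tree_old WHERE object = ?", (node,)
    ).fetchone()
    return row[0]


def childless(conn, nodes):
    """Return the nodes that exist as subjects but have no child rows."""
    return [n for n in nodes if has_node(conn, n) and count_children(conn, n) == 0]


# Toy stand-in for build/arborist/nanobot.db (assumed schema, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE protein_tree_old (subject TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO protein_tree_old VALUES (?, ?)",
    [
        ("iedb-protein:9031", "iedb-protein:root"),  # parent node, no children
        ("iedb-protein:9606", "iedb-protein:root"),  # hypothetical human node
        ("UP:P01308", "iedb-protein:9606"),          # hypothetical child row
    ],
)

print(childless(conn, ["iedb-protein:9031", "iedb-protein:9606"]))
```

To try this against the real build, replace the in-memory connection with `sqlite3.connect("build/arborist/nanobot.db")` and drop the `CREATE TABLE` and `INSERT` statements. Note that genuine leaf nodes will also be reported as childless, so the list passed in should contain only nodes that are expected to have children.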

I would also expect to see rows for particular chicken proteins in "protein_tree_old", but I don't:

sqlite3 build/arborist/nanobot.db "SELECT * FROM protein_tree_old WHERE subject = 'UP:P01012'"

In contrast, I do get results for human insulin:

sqlite3 build/arborist/nanobot.db "SELECT * FROM protein_tree_old WHERE subject = 'UP:P01308'"

dmx2 commented

Fixed with b1fc87e