BaranziniLab/KG_RAG

disease_with_relation_to_genes.pickle file

Closed this issue · 5 comments

Thanks for your work. What data does the disease_with_relation_to_genes.pickle file contain, and how was it created?

Hi @xuzhaoyang-svg
disease_with_relation_to_genes.pickle holds the names of the diseases whose vectors are computed and stored in the vector DB. We chose only those diseases that have a relationship with genes in the biomedical knowledge graph SPOKE, hence the name disease_with_relation_to_genes.
The file was created by running the following Cypher query against the SPOKE graph (in neo4j):

MATCH(d:Disease)-[r:ASSOCIATES_DaG]->(g:Gene)
RETURN d.name AS d_name

Let me know if this answers your question.
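
For readers who want to reproduce a file like this, here is a minimal sketch of the step described above. It is not the repository's actual script. It assumes the official neo4j Python driver, and the function names, connection details, and on-disk layout (a sorted list of unique name strings) are all hypothetical choices made for illustration.

```python
import pickle

# Query text taken from the comment above; only whitespace was adjusted.
CYPHER = """
MATCH (d:Disease)-[r:ASSOCIATES_DaG]->(g:Gene)
RETURN d.name AS d_name
"""


def fetch_disease_names(uri, user, password):
    """Run the query and return the raw list of disease names.

    Hypothetical helper: uri/user/password are placeholders for your own
    SPOKE (neo4j) instance. The driver is imported lazily so the pickle
    helpers below work even when it is not installed.
    """
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            result = session.run(CYPHER)
            return [record["d_name"] for record in result]
    finally:
        driver.close()


def save_names(names, path):
    """Deduplicate the names, drop empty values, and pickle them as a list.

    Deduplication is an assumption: a disease linked to several genes is
    returned once per gene by the query above.
    """
    unique = sorted({name for name in names if name})
    with open(path, "wb") as f:
        pickle.dump(unique, f)
    return unique


def load_names(path):
    """Read the pickled list of disease names back."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Usage would look like `save_names(fetch_disease_names("bolt://localhost:7687", "neo4j", "<password>"), "disease_with_relation_to_genes.pickle")`, where the URI and credentials are placeholders. The resulting file contains names only; no vectors are stored in it.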

So the results of the Cypher query are stored in .pickle file format by calculating vectors?

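The query results are stored in pickle file format 'for computing vectors', NOT 'by calculating vectors'. The file holds only the disease names; the vectors are computed from those names afterwards. Let me know if that clarifies your question.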

Thank you, I understand now.

Awesome! I am closing this issue then.