hubmapconsortium/ccf-asct-reporter

missing terms in graph json export

bherr2 opened this issue · 2 comments

I am a 4th-year Heriot-Watt student and I am working on a biomedical spatial data integration problem, using Neo4j, as part of my dissertation project. This is in relation to the Gut Cell Atlas project (https://www.ed.ac.uk/comparative-pathology/the-gut-cell-atlas-project/gut-cell-atlas-project).

I exported the data in the graph data format and set up a Neo4j instance of the Large and Small intestine database. When I started querying the database, I realised that some connections are missing in comparison with the reporter tool. For example, if you export the graph data for large and small intestine combined, node id 69 (serosa, connected to terminal ileum, id 59) should have connections going to 3 cell types: endothelial (id 365, Ontology ID CL:0000115), fibroblast (id 367, Ontology ID CL:0000057) and mesothelial (id 384, Ontology ID CL:0000077). However, there is only one entry (see attached) in the generated JSON file, in the "edges" object, and it connects ids 59 and 69.
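For anyone who wants to reproduce this check outside Neo4j, here is a minimal Python sketch. It assumes the export is a JSON object with an "edges" list whose entries carry `source` and `target` node ids; those two key names, the helper names `outgoing_targets` and `missing_targets`, and the inline sample data are assumptions for illustration and may not match the actual export schema.

```python
import json

def outgoing_targets(graph, node_id):
    """Return the sorted target ids of every edge that starts at node_id."""
    return sorted(
        edge["target"]
        for edge in graph.get("edges", [])
        if edge.get("source") == node_id and "target" in edge
    )

def missing_targets(graph, node_id, expected):
    """Return the expected target ids that have no edge from node_id."""
    present = set(outgoing_targets(graph, node_id))
    return sorted(set(expected) - present)

# Hypothetical stand-in for the exported file: only the 59 -> 69 edge exists.
sample = json.loads("""
{
  "nodes": [{"id": 59}, {"id": 69}, {"id": 365}, {"id": 367}, {"id": 384}],
  "edges": [{"source": 59, "target": 69}]
}
""")

# Cell types that serosa (id 69) is expected to link to, per the reporter tool.
print(missing_targets(sample, 69, [365, 367, 384]))
```

With a real export, the `sample` object would instead be loaded from the downloaded file with `json.load`, and the key names adjusted to whatever the file actually uses.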

Is this deliberate, and have I misunderstood how I should set up the Neo4j database, or is there potentially a bug in the code used to generate the graph data file?

Are you able to help or should I get in touch with somebody else in relation to this matter?

Best wishes,
Lukasz Szmulkiewicz

I have an email with updates from the Gut Cell Atlas team (Mark Arends & Derek Houghton), and I am working with John Hickey to get these all updated in the 5th release Large and Small intestine tables. There are also issues flagged in the CCF-Tools Validation from David Osumi-Sutherland's group, plus more cell types in the OMAP that are not covered.

Ultimately, it looks like the issues are in the tables themselves, and these would need to be ironed out before we can explore whether there is a problem in the code. Closing for now.