monarch-initiative/biolink-api

Text annotator data needs significant cleaning

The Monarch UI 2.0 has a giant block of code to "clean" the annotations it receives from biolink.

I can't see a good reason why such a thing should be hard-coded into the frontend. We should be doing this on the backend instead, or better yet, tracking down the root cause of the issue.

This is, of course, a common thread I've found while examining the UI 2.0 code. A lot of it consists of hard-coded exceptions, lists, and mappings that seem to ultimately stem from upstream data quality issues.

I see there is a working version of the text annotator in the current preview of the new app.
At this point I would like to deprioritize any further work on this and focus on the rest of the app.
We will be replacing the text annotator backend in the future and the node page should take priority now.

Just to be clear, none of these issues that I've been making on biolink and monarch-new are show-stoppers. I'm creating issues to track everything that I would like to see fixed for the app to be the best it can be.

In cases where I really needed the issue resolved before moving forward, I simply copied the "big blocks of code" from 2.0 into 3.0 (with simplifications where possible). Where I didn't need them, I left them out of 3.0. In both cases, though, I put comments in the code with a link to the GitHub issue tracking it.

In this particular case, the text annotator has been complete for about a month now and I've moved well beyond it, so this issue isn't holding anything up.

If an issue ever is completely blocking me, I will make explicit note of it, both in the issue and in Slack/email/etc.