Should bivalirudin and bivalirudin trifluoroacetate be cliqued together?
gaurav commented
We currently consider them to be two separate cliques, both of which have the preferred label "bivalirudin" (https://name-resolution-sri.renci.org/lookup?string=bivalirudin&autocomplete=false&highlighting=false&offset=0&limit=10)
Looking at these cliques (https://nodenormalization-sri.renci.org/1.5/get_normalized_nodes?curie=PUBCHEM.COMPOUND%3A19797045&curie=CHEBI%3A59173&curie=PUBCHEM.COMPOUND%3A78357798&conflate=true&drug_chemical_conflate=false&description=false&individual_types=false), we notice:
- The CHEBI:59173 "Bivalirudin" clique includes PUBCHEM.COMPOUND:16129704 "Bivalirudin" and most of its synonyms are "bivalirudin", although at least one identifier (MESH:C074619) has "" as a synonym.
- The PUBCHEM.COMPOUND:19797045 "Bivalirudin Trifluoacetate" clique ends up with a preferred label of "Bivalirudin" because it includes HMDB:HMDB0249283, whose label we prefer. However, this HMDB ID is no longer available on the website -- we're presumably getting this from PubChem.
- PUBCHEM.COMPOUND:78357798 "Bivalirudin (Trifluoroacetate)" shows up as a molecular mixture with a different InChIKey from the other two cliques.
- These are not currently conflated, even when drug_chemical conflation is turned on.
- As far as I can tell, PUBCHEM.COMPOUND:78357798 and PUBCHEM.COMPOUND:19797045 are being split before the partials are generated.
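
To make the comparison above easy to repeat, here is a minimal sketch that rebuilds the NodeNorm query linked above and groups the three CURIEs by the clique they normalize to. The helper names (`build_url`, `group_by_clique`, `fetch_cliques`) are made up for illustration, and the code assumes each entry in the response carries its clique leader at `["id"]["identifier"]`, with `null` for unknown CURIEs; please check that against an actual response before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and CURIEs are taken from the links in this issue.
NODENORM_URL = "https://nodenormalization-sri.renci.org/1.5/get_normalized_nodes"
CURIES = [
    "PUBCHEM.COMPOUND:19797045",
    "CHEBI:59173",
    "PUBCHEM.COMPOUND:78357798",
]


def build_url(curies, drug_chemical_conflate=False):
    """Build the get_normalized_nodes URL with the same parameters as the link above."""
    params = [("curie", curie) for curie in curies]
    params += [
        ("conflate", "true"),
        ("drug_chemical_conflate", str(drug_chemical_conflate).lower()),
        ("description", "false"),
        ("individual_types", "false"),
    ]
    return NODENORM_URL + "?" + urlencode(params)


def group_by_clique(response):
    """Map each clique leader to the queried CURIEs that normalize to it."""
    cliques = {}
    for curie, node in response.items():
        # Assumption: node["id"]["identifier"] is the clique leader,
        # and node is None when the CURIE could not be normalized.
        leader = node["id"]["identifier"] if node else None
        cliques.setdefault(leader, []).append(curie)
    return cliques


def fetch_cliques(curies, drug_chemical_conflate=False):
    """Query NodeNorm (network call) and group the results by clique leader."""
    with urlopen(build_url(curies, drug_chemical_conflate)) as resp:
        return group_by_clique(json.load(resp))
```

Calling `fetch_cliques(CURIES, False)` and `fetch_cliques(CURIES, True)` and comparing the number of keys in each result should show whether turning on drug_chemical conflation changes how these identifiers are grouped.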
Possible solutions:
- We could combine all three cliques into a single conflated clique. This is probably the simplest, fastest solution.
- We could try to split out a "Bivalirudin" clique and a "Bivalirudin Trifluoacetate" clique, but that would require some investigation into why they aren't currently being combined (presumably because of those different InChIKeys).
- ???