MATCH (gs:GeneSymbol)
WHERE size(gs.sid) = 1
SET gs:OmitLength;
Skip gene symbols that are English words
Match gene symbols against the word list to find symbols that are also common words
Set an additional label (OmitWord) so they can be filtered out later
MATCH (gs:GeneSymbol), (w:Word)
WHERE toLower(gs.sid) = toLower(w.value)
  AND w.match11 = true
SET gs:OmitWord;
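A quick sanity check before the text match, sketched here as an optional step that is not part of the original workflow: count how many gene symbols carry each Omit label. It uses only labels that appear in this script; the column aliases are illustrative.

```cypher
// Count gene symbols per exclusion label (read-only, safe to rerun)
MATCH (gs:GeneSymbol)
RETURN count(gs) AS total,
       sum(CASE WHEN gs:OmitLength THEN 1 ELSE 0 END) AS omitLength,
       sum(CASE WHEN gs:OmitWord THEN 1 ELSE 0 END) AS omitWord,
       sum(CASE WHEN gs:OmitSpecialChar THEN 1 ELSE 0 END) AS omitSpecialChar;
```

If omitWord is unexpectedly large, the word list or the match11 flag may be too permissive for this data.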
Run the text match
Match gene symbols against the :Fragment fulltext index
Use MERGE so the query can be rerun without creating duplicate relationships
CALL apoc.periodic.iterate(
  "MATCH (gs:GeneSymbol) WHERE NOT gs:OmitWord AND NOT gs:OmitSpecialChar AND NOT gs:OmitLength RETURN gs",
  "CALL db.index.fulltext.queryNodes('fragmentGeneSymbol', gs.sid) YIELD node, score MERGE (gs)<-[r:MENTIONS]-(node) SET r.score = score",
  {batchSize: 10, parallel: false, iterateList: true});
Count the MENTIONS relationships from fragments to gene symbols
MATCH (gs:GeneSymbol)<-[r:MENTIONS]-(:Fragment)
RETURN count(r);
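To see which symbols account for most of the matches, a follow-up query along these lines may help. It is a sketch that assumes the MENTIONS relationships and r.score values written by the text match; the aliases and the LIMIT of 10 are arbitrary choices.

```cypher
// List the gene symbols with the most mentioning fragments
MATCH (gs:GeneSymbol)<-[r:MENTIONS]-(:Fragment)
RETURN gs.sid AS symbol,
       count(r) AS mentions,
       max(r.score) AS bestScore
ORDER BY mentions DESC
LIMIT 10;
```

Symbols that dominate this list are worth checking by hand, since very short or word-like symbols that escaped the Omit labels tend to produce many spurious matches.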