Dealing with changes in the history file
oganm opened this issue · 0 comments
It seems that the gene_history file can gain or lose events retroactively. This means certain paths to new genes occasionally break, so certain genes can become unavailable between different versions, and there is no way to identify what happened without looking at a past version of the file. Of course, such an archive of gene_history files does not exist; to my understanding, the whole point of the gene_history file was to be such an archive anyway. Edits to the past break this idea.
This doesn't happen frequently as far as I can see. Over the course of two months I lost 1 connection between a mouse gene in the original homologene and its future version. I didn't really dig too deep into this though.
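One way to notice this kind of change is to keep your own snapshots of gene_history and diff them. Below is a minimal sketch of that idea in base R, not something the package does. The two snapshots are made-up toy data, the helper names `event_key` and `diff_history` are hypothetical, and I'm assuming the `tax_id`, `GeneID` and `Discontinued_GeneID` columns from NCBI's file are enough to identify an event.

```r
# Toy snapshots of gene_history taken at two different times.
# All IDs here are made up for illustration; "-" means no replacement gene.
old_snapshot <- data.frame(
  tax_id = c(10090, 10090, 9606),
  GeneID = c("200", "300", "-"),
  Discontinued_GeneID = c("100", "200", "900"),
  stringsAsFactors = FALSE
)
new_snapshot <- data.frame(
  tax_id = c(10090, 9606, 9606),
  GeneID = c("300", "-", "950"),
  Discontinued_GeneID = c("200", "900", "940"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one string key per event (species, old ID, new ID).
event_key <- function(history) {
  paste(history$tax_id, history$Discontinued_GeneID, history$GeneID, sep = ":")
}

# Hypothetical helper: events present in only one of the two snapshots.
diff_history <- function(old, new) {
  old_keys <- event_key(old)
  new_keys <- event_key(new)
  list(
    lost = old[!old_keys %in% new_keys, , drop = FALSE],
    gained = new[!new_keys %in% old_keys, , drop = FALSE]
  )
}

changes <- diff_history(old_snapshot, new_snapshot)
changes$lost   # events that disappeared from the file
changes$gained # events that appeared, including ordinary new ones
```

In the toy data the 100 -> 200 event is gone from the newer snapshot, so gene 100 can no longer be followed to 300. Note that `gained` also contains regular new events; telling retroactive additions apart would need a check on `Discontinue_Date` against the date of the older snapshot.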
The major implication is that if you use the updateHomologene function with the baseline homologeneData2 included in CRAN, which is the default parameter, you may get a different file from the one I generate here, since I always start from the original homologene. The default was set to homologeneData2 because it is significantly faster to update from a later starting point. Not sure if I should change this default in light of this.