EnquistLab/RTNRS

Improper matching when resolved name should have "-" in specific epithet

Closed this issue · 3 comments

Examples:
TNRS("Pachycereus_pecten_aboriginum") returns "Pachycereus", should be "Pachycereus pecten-aboriginum"
TNRS("Cephalocereus_columna_trajani") returns "Cephalocereus columna" (not a published name), should be "Cephalocereus columna-trajani"

Interesting, it looks like "Cephalocereus columna-trajani" is correctly returned, because "Cephalocereus columna" is listed as a synonym. @ojalaquellueva perhaps the solution is to include similar synonyms for all hyphenated names? Or to change the nomenclature to not do anything as stupid as using non-letter characters in a name. The first solution seems easier.

@mdpillet @bmaitner:

Back in the day, we added silent conversion of underscores to whitespace as a courtesy for users of phylomatic (and other phylogenetic applications) who had gotten into the habit of submitting names formatted for these applications directly to the TNRS without restoring the whitespace. The original sin was committed by the developers of those applications, who should have treated taxonomic names as variables (which can contain whitespace), not as Linux filesystem objects (whose names are awkward to handle when they contain whitespace, unless you quote them or set the shell's IFS to something else).

After the underscores are converted to whitespace, "Pachycereus_pecten_aboriginum" becomes "Pachycereus pecten aboriginum". The TNRS then correctly resolves the latter to "Pachycereus", because there is no Pachycereus species "pecten", much less an infraspecific taxon "pecten var. aboriginum" or "pecten subsp. aboriginum". For the name to be correctly resolved, it should be formatted as "Pachycereus_pecten-aboriginum". Or, better still, "Pachycereus pecten-aboriginum".

"Cephalocereus_columna_trajani" is correctly resolved to "Cephalocereus columna" for the same reason (although that name is treated as a synonym of, and subsequently resolved to, the accepted name "Cephalocereus columna-trajani"). To be matched directly to "Cephalocereus columna-trajani", the submitted name should be formatted as "Cephalocereus_columna-trajani" or "Cephalocereus columna-trajani".
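
To illustrate the conversion described above, here is a minimal Python sketch. It is not the TNRS code; the function names `underscores_to_spaces` and `split_name` are hypothetical, and it assumes only what is stated in this thread: every underscore becomes a space and hyphens are left alone. It shows why an all-underscore name ends up as three tokens, while a name that keeps its hyphen stays at two.

```python
def underscores_to_spaces(name):
    """Replace every underscore with a space; hyphens are left untouched."""
    return name.replace("_", " ")


def split_name(name):
    """Return the whitespace-separated tokens of the converted name."""
    return underscores_to_spaces(name).split()


if __name__ == "__main__":
    for submitted in (
        "Pachycereus_pecten_aboriginum",   # hyphen lost before submission
        "Pachycereus_pecten-aboriginum",   # hyphen preserved
        "Cephalocereus_columna_trajani",   # hyphen lost before submission
        "Cephalocereus_columna-trajani",   # hyphen preserved
    ):
        # Print the submitted string next to the tokens a parser would see.
        print(submitted, "->", split_name(submitted))
```

With the hyphen lost, a parser sees a genus followed by two separate words and can no longer tell that "pecten aboriginum" was meant as a single epithet. A possible workaround on the user side is to restore the hyphen before submitting the name, which requires knowing which epithets are hyphenated.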

Closed as expected behavior.