AtlasOfLivingAustralia/galah-R

search_taxa not handling cases where a taxon is flagged as having a homonym issue

wcornwell opened this issue · 2 comments

Describe the bug
search_taxa does not parse tibble input correctly: when one of the taxa has a homonym issue, it fails to return the correct taxa_id. The help file suggests that tibble input is the right approach for this case, but it's not working for me.

galah version
1.5.2

To Reproduce

search_taxa(tibble(genus="Acanthocladium", class="Equisetopsida"))

Expected behaviour
It should return the taxa_id for "Acanthocladium", which is the current name for a small daisy genus. The homonym issue is with a moss genus that was formerly, but is no longer, also called "Acanthocladium".

I expected including tibble(genus="Acanthocladium", class="Equisetopsida") would resolve the homonym issue and the correct taxa_id would be returned.

Instead of the daisy genus, search_taxa returns the taxa_id for Equisetopsida, which leads to a very large query that then crashes the API.

Apologies about the crashes, it took me a while to work out what was going on.

Additional context
This is related to #168 and #194

Thanks for reaching out. I was able to replicate this error and there does appear to be something wrong with how search_taxa() prioritises higher rank information supplied in a tibble.

I'm not yet sure why this happens, but in the meantime I wanted to offer a workaround:

Adding extra information, such as authorship, to your search can help return the correct result. On the ALA, the authorship of this name is attributed to F.Muell. Adding this to your text search returns the correct match:

library(galah)
library(tibble)

search_taxa("Acanthocladium F.Muell")
#> # A tibble: 1 × 13
#>   search_term      scientific_name scientific_name_auth…¹ taxon_concept_id rank 
#>   <chr>            <chr>           <chr>                  <chr>            <chr>
#> 1 Acanthocladium … Acanthocladium  F.Muell.               https://id.biod… genus
#> # ℹ abbreviated name: ¹​scientific_name_authorship
#> # ℹ 8 more variables: match_type <chr>, kingdom <chr>, phylum <chr>,
#> #   class <chr>, order <chr>, family <chr>, genus <chr>, issues <chr>

And this returns a suitably small count when used in a query too!

taxa <- search_taxa("Acanthocladium F.Muell")

galah_call() |>
  identify(taxa) |>
  atlas_counts()
#> # A tibble: 1 × 1
#>   count
#>   <int>
#> 1   128
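
Until the tibble matching is fixed, one way to avoid accidentally sending a class-level query is to check the matched rank before calling identify(). The sketch below is only an illustration: check_taxa_rank() is a hypothetical helper, not part of galah, and it assumes the result has the rank and scientific_name columns shown in the output above. It runs offline on stand-in data frames.

```r
# Hypothetical helper (not part of galah): stop early when a matched taxon
# is not at the rank that was asked for.
check_taxa_rank <- function(result, expected_rank) {
  # Assumes `result` looks like search_taxa() output, i.e. it has
  # `rank` and `scientific_name` columns.
  if (!all(c("rank", "scientific_name") %in% names(result))) {
    stop("Result has no `rank`/`scientific_name` columns; was anything matched?",
         call. = FALSE)
  }
  mismatched <- is.na(result$rank) | result$rank != expected_rank
  if (any(mismatched)) {
    stop(
      "Expected rank '", expected_rank, "' but matched: ",
      paste0(result$scientific_name[mismatched],
             " (", result$rank[mismatched], ")", collapse = ", "),
      call. = FALSE
    )
  }
  invisible(result)
}

# Stand-in data frames shaped like search_taxa() output (no API call needed).
good <- data.frame(scientific_name = "Acanthocladium", rank = "genus")
bad  <- data.frame(scientific_name = "Equisetopsida", rank = "class")

check_taxa_rank(good, "genus")   # passes silently
# check_taxa_rank(bad, "genus")  # would stop with an error naming Equisetopsida
```

In a real session the same check could be applied to the search_taxa() result before it is passed to identify(), so that an unexpected higher-rank match fails locally instead of reaching the API.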

Great! Thanks for the workaround!