pyinat/pyinaturalist

Feature request: Support community ID votes calculation

synrg opened this issue · 0 comments

Feature description

Please support counting the community ID votes the way the website does. For background, please see this discussion:

https://forum.inaturalist.org/t/how-do-you-compute-cumulative-ids-from-api-response/7386

The website apparently doesn't make direct use of the counts accumulated on the observation, and instead counts votes from the identifications to come up with the "Cumulative IDs: # of #" numbers. To reliably provide those numbers, pyinaturalist would need to implement the same algorithm that the website does.

Use case

I would like to show a simplified observation display in a Dronefly bot command that contains the essential parts of the observation page on the iNaturalist website, including the "Community Taxon" sidebar with its "Cumulative IDs". In fact, I already have such a display, but it does not yet use pure pyinaturalist code to produce the numbers. Instead, it calculates them using the workaround described below.

Workarounds

Here's an example API call, and a link to the same observation on the web, where the counts on the observation from the API don't match the votes computed from the identifications on the web:

https://api.inaturalist.org/v1/observations/10075477
https://www.inaturalist.org/observations/10075477

On the web, it says "Cumulative IDs: 2 of 2" for Genus Gnorimoschema. If you count up the "current" identifications, that is accurate. However, the identifications_count and num_identification_agreements are both -1!

image
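To make the mismatch easy to check, here is a minimal sketch that compares the stored counters with a direct count of "current" identifications. It assumes `result` is one entry of the `results` list in the API response linked above. The helper name `summarize_counts` and the sample data are made up for illustration and are not real API output.

```python
def summarize_counts(result):
    """Compare the stored counters with a count of current identifications.

    `result` is assumed to be one entry of the `results` list returned by
    GET /v1/observations/{id}.
    """
    current = [
        ident
        for ident in result.get("identifications", [])
        if ident.get("current")
    ]
    return {
        "identifications_count": result.get("identifications_count"),
        "num_identification_agreements": result.get("num_identification_agreements"),
        "current_identifications": len(current),
    }


# Hypothetical, hand-written data shaped like an API result (not real output).
sample = {
    "identifications_count": 0,
    "num_identification_agreements": 0,
    "identifications": [
        {"current": True},
        {"current": True},
        {"current": False},
    ],
}
summary = summarize_counts(sample)
print(summary)
```

Running this against the real response for the observation above should show whether the stored counters disagree with the number of current identifications.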

In the bot display shown below, the Cumulative IDs for this observation are represented as: 👥 (2/2)

image

The workaround used to compute the correct counts is based on the approach discussed in the forum thread linked above:

def obs_count_community_id(obs):
    idents_count = 0
    idents_agree = 0

    # Simulate the API's ident_taxon_ids field from the identifications.
    # TODO: when pyinat supports ident_taxon_ids, this can be removed.
    ident_taxon_ids = []
    for ident in obs.identifications:
        if ident.taxon:
            for ancestor_id in ident.taxon.ancestor_ids:
                if ancestor_id not in ident_taxon_ids:
                    ident_taxon_ids.append(ancestor_id)
            if ident.taxon.id not in ident_taxon_ids:
                ident_taxon_ids.append(ident.taxon.id)
    for identification in obs.identifications:
        if identification.current and identification.taxon:
            user_taxon_id = identification.taxon.id
            # Copy the list so the taxon's own ancestor_ids is not mutated.
            user_taxon_ids = list(identification.taxon.ancestor_ids)
            user_taxon_ids.append(user_taxon_id)
            if obs.community_taxon_id in user_taxon_ids:
                if user_taxon_id in ident_taxon_ids:
                    # Count towards total & agree:
                    idents_count += 1
                    idents_agree += 1
                else:
                    # Neither counts for nor against
                    pass
            else:
                # Maverick counts against:
                idents_count += 1

    return (idents_count, idents_agree)

Note: a side issue here is that pyinaturalist doesn't expose result[#]["ident_taxon_ids"] on the Observation object, so I start by simulating that field by collecting every taxon.id and taxon.ancestor_ids from the identifications (honestly, I'm not certain this is the correct way to do it). If I had access to obs.ident_taxon_ids, I would have eliminated that loop and used that attribute instead.

However, my only need for obs.ident_taxon_ids is to do this computation, so there may still be no reason to expose it if this feature request is delivered.
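For comparison, here is a rough sketch of the same count done directly on the raw API result, where `ident_taxon_ids` is already present and the simulation loop is unnecessary. It mirrors the workaround above rather than any confirmed website algorithm. The function name `count_community_id_votes` and the sample taxon IDs are hypothetical.

```python
def count_community_id_votes(result):
    """Return (idents_count, idents_agree) from one raw API result dict."""
    community_taxon_id = result.get("community_taxon_id")
    ident_taxon_ids = set(result.get("ident_taxon_ids") or [])
    idents_count = 0
    idents_agree = 0
    for ident in result.get("identifications", []):
        taxon = ident.get("taxon")
        if not ident.get("current") or not taxon:
            continue
        # Build a new list so the response data is left unmodified.
        user_taxon_ids = [*taxon.get("ancestor_ids", []), taxon["id"]]
        if community_taxon_id in user_taxon_ids:
            if taxon["id"] in ident_taxon_ids:
                # Counts towards total and agreement.
                idents_count += 1
                idents_agree += 1
        else:
            # Maverick counts against.
            idents_count += 1
    return idents_count, idents_agree


# Hypothetical data with made-up taxon IDs, shaped like an API result.
sample = {
    "community_taxon_id": 30,
    "ident_taxon_ids": [10, 20, 30, 40, 50, 60],
    "identifications": [
        {"current": True, "taxon": {"id": 30, "ancestor_ids": [10, 20, 30]}},
        {"current": True, "taxon": {"id": 40, "ancestor_ids": [10, 20, 30, 40]}},
        {"current": False, "taxon": {"id": 50, "ancestor_ids": [10, 50]}},
        {"current": True, "taxon": {"id": 60, "ancestor_ids": [10, 60]}},
    ],
}
votes = count_community_id_votes(sample)
print(votes)
```

In this made-up sample, two current identifications sit at or below the community taxon and one current identification is a maverick, so the function returns a total of 3 with 2 in agreement. The withdrawn identification is ignored.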