make_tree: tree with omitted ranks has out of order children
The problem
When making a taxon tree from a life list with some ranks omitted (i.e. include_ranks does not contain every intermediate rank from the root down to the leaves of the resulting tree), children are still grouped and sorted by the omitted ranks, so they appear out of order.
Expected behavior
The expected order for my example below is:
Family Andrenidae
└── Genus Andrena
    ├── Andrena clarkella
    ├── Andrena crataegi
    ├── Andrena dunningi
    ├── Andrena frigida
    ├── Andrena milwaukeensis
    ├── Andrena nubecula
    └── Andrena wilkella
Steps to reproduce the behavior
>>> from pyinaturalist import iNatClient, make_tree, pprint_tree
>>> client = iNatClient()
>>> life_list = client.observations.life_list(user_id=545640, taxon_id=57669)
>>> tree = make_tree(life_list.data, include_ranks=['family','genus','species'])
>>> pprint_tree(tree)
Family Andrenidae
└── Genus Andrena
    ├── Andrena clarkella
    ├── Andrena frigida
    ├── Andrena milwaukeensis
    ├── Andrena nubecula
    ├── Andrena dunningi
    ├── Andrena crataegi
    └── Andrena wilkella
>>> tree.flatten()
[
TaxonCount(id=57668, iconic_taxon_name='Unknown', is_active=True, name='Andrenidae', parent_id=630955, rank_level=30, rank='family', descendant_obs_count=111),
TaxonCount(id=57669, iconic_taxon_name='Unknown', is_active=True, name='Andrena', parent_id=958234, rank_level=20, rank='genus', count=49, descendant_obs_count=111),
TaxonCount(id=198998, iconic_taxon_name='Unknown', is_active=True, name='Andrena clarkella', parent_id=571358, rank_level=10, rank='species', count=7, descendant_obs_count=7),
TaxonCount(id=198991, iconic_taxon_name='Unknown', is_active=True, name='Andrena frigida', parent_id=571358, rank_level=10, rank='species', count=1, descendant_obs_count=1),
TaxonCount(id=198981, iconic_taxon_name='Unknown', is_active=True, name='Andrena milwaukeensis', parent_id=571358, rank_level=10, rank='species', count=7, descendant_obs_count=7),
TaxonCount(id=198973, iconic_taxon_name='Unknown', is_active=True, name='Andrena nubecula', parent_id=571188, rank_level=10, rank='species', count=5, descendant_obs_count=5),
TaxonCount(id=198997, iconic_taxon_name='Unknown', is_active=True, name='Andrena dunningi', parent_id=571409, rank_level=10, rank='species', count=2, descendant_obs_count=2),
TaxonCount(id=199011, iconic_taxon_name='Unknown', is_active=True, name='Andrena crataegi', parent_id=571426, rank_level=10, rank='species', count=5, descendant_obs_count=5),
TaxonCount(id=127785, iconic_taxon_name='Unknown', is_active=True, name='Andrena wilkella', parent_id=571443, rank_level=10, rank='species', count=12, descendant_obs_count=12)
]
>>>
The problem arises from the subgenera under Andrena not being included. The order of the children is correct with respect to each subgenus, but as those aren't present in the resulting tree, the children are out of order so far as the user can see.
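To illustrate the point, here is a small check using only the name and parent_id values copied from the flatten() output above. It is a sketch for inspecting the data, not the library's actual grouping code. It shows that each run of species sharing a parent_id (i.e. an omitted subgenus) is sorted on its own, while the list as a whole is not.

```python
from itertools import groupby

# (name, parent_id) pairs copied from the flatten() output in this report
species = [
    ("Andrena clarkella", 571358),
    ("Andrena frigida", 571358),
    ("Andrena milwaukeensis", 571358),
    ("Andrena nubecula", 571188),
    ("Andrena dunningi", 571409),
    ("Andrena crataegi", 571426),
    ("Andrena wilkella", 571443),
]

names = [name for name, _parent_id in species]

# Group consecutive species that share the same (omitted) parent
groups = [
    [name for name, _parent_id in members]
    for _key, members in groupby(species, key=lambda item: item[1])
]

# Each hidden group is alphabetical, but the visible list is not
each_group_sorted = all(group == sorted(group) for group in groups)
whole_list_sorted = names == sorted(names)
print(each_group_sorted, whole_list_sorted)
```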
Workarounds
Traverse the tree and rebuild each group with its children sorted in the desired alphabetical order. This comes at the expense of the convenience of pprint_tree() and flatten(), which handle the tree traversal transparently.
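A minimal sketch of this kind of workaround follows. The Node class is a hypothetical stand-in so the example runs without network access; it assumes only that each node has a name and a list of children. The helper names sort_children and flatten_names are also made up for illustration and are not part of pyinaturalist.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a taxon node (assumes only name + children)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def sort_children(node: Node) -> Node:
    """Recursively sort every node's children by name, in place."""
    node.children.sort(key=lambda child: child.name)
    for child in node.children:
        sort_children(child)
    return node


def flatten_names(node: Node) -> List[str]:
    """Depth-first list of names, each parent before its children."""
    names = [node.name]
    for child in node.children:
        names.extend(flatten_names(child))
    return names


# Species in the out-of-order sequence shown in this report
genus = Node("Andrena", [Node(name) for name in [
    "Andrena clarkella", "Andrena frigida", "Andrena milwaukeensis",
    "Andrena nubecula", "Andrena dunningi", "Andrena crataegi",
    "Andrena wilkella",
]])
tree = sort_children(Node("Andrenidae", [genus]))
sorted_names = flatten_names(tree)
print(sorted_names)
```

Sorting in place keeps the tree shape intact, so if the real taxon objects expose a mutable children list in the same way, the sorted tree could presumably still be passed to pprint_tree(); that has not been verified here.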
Environment
- OS & version: Debian 10
- Python version: 3.11
- Pyinaturalist version or branch: main
Thanks for the detailed bug report! I will get that fixed soon.
Fixed!