traitecoevo/traits.build

Add test ensuring a dataset can pivot wider

Closed this issue · 4 comments

ehwenk commented

Add a test to dataset_test() that checks whether the following code returns 0 rows. If it doesn't, there is a source of duplication somewhere.

Note: this code hasn't yet been updated for the newly revamped method_id and method_context_id. Once that is merged in, the variable method_id below needs to become method_context_id, method_id needs to be added to the select() list, and 15:ncol(.) needs to change to 16:ncol(.).

library(dplyr)
library(tidyr)

austraits$traits %>%
  # keep the identifying columns plus trait_name and value
  select(dataset_id, trait_name, value, observation_id, source_id, taxon_name, entity_type, life_stage, basis_of_record, value_type,
         population_id, individual_id, temporal_id, method_id, entity_context_id, original_name) %>%
  # count how many values land in each cell when pivoting wider
  pivot_wider(names_from = trait_name, values_from = value, values_fn = length) %>%
  # the first 14 columns are identifiers, so the trait columns start at column 15
  pivot_longer(cols = 15:ncol(.)) %>%
  rename(trait_name = name, number_of_duplicates = value) %>%
  select(dataset_id, taxon_name, trait_name, number_of_duplicates, observation_id, entity_type, value_type, population_id, everything()) %>%
  # a count above 1 means the identifiers do not uniquely define a value
  filter(number_of_duplicates > 1) %>%
  nrow()
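
For reference, here is a rough sketch of what the check might look like once the method_id branch is merged, following the changes listed in the note above (method_context_id replaces the current method_id slot, method_id is added, and the index becomes 16). The function name count_pivot_duplicates and the toy data are made up for illustration and are not part of traits.build.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not an existing traits.build function): returns the
# number of identifier/trait combinations that hold more than one value.
count_pivot_duplicates <- function(traits) {
  traits %>%
    select(dataset_id, trait_name, value, observation_id, source_id, taxon_name,
           entity_type, life_stage, basis_of_record, value_type, population_id,
           individual_id, temporal_id, method_id, method_context_id,
           entity_context_id, original_name) %>%
    pivot_wider(names_from = trait_name, values_from = value, values_fn = length) %>%
    # 15 identifier columns remain after pivoting, so trait columns start at 16
    pivot_longer(cols = 16:ncol(.)) %>%
    rename(trait_name = name, number_of_duplicates = value) %>%
    filter(number_of_duplicates > 1) %>%
    nrow()
}

# Toy data (invented): rows 1 and 2 share every identifier and the same trait.
toy <- tibble(
  dataset_id = "Test_2024",
  trait_name = c("leaf_area", "leaf_area", "plant_height"),
  value = c("10", "12", "1.5"),
  observation_id = "01",
  source_id = NA_character_,
  taxon_name = "Acacia aneura",
  entity_type = "individual",
  life_stage = "adult",
  basis_of_record = "field",
  value_type = "raw",
  population_id = NA_character_,
  individual_id = "01",
  temporal_id = NA_character_,
  method_id = "01",
  method_context_id = NA_character_,
  entity_context_id = NA_character_,
  original_name = "Acacia aneura"
)

count_pivot_duplicates(toy)        # flags the repeated leaf_area measurement
count_pivot_duplicates(toy[-2, ])  # duplicate row removed
```

With the second leaf_area row present, the toy table cannot pivot wider cleanly, so the count is non-zero. Dropping that row should bring it back to 0.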

Is this a duplicate of issue #8?

ehwenk commented

I see now that the content of issue #8 is very similar, but I'd assumed from the title it referred to something more specific. I'll close issue #8, since I know this issue contains the current code that works and documents the changes needed once the method_id branch is merged in.

I've added this to dataset_test on branch test_for_pivoting_wider, but should it also be added to the GitHub Actions?

ehwenk commented

Eventually yes, but I think it is best to test it first at the dataset level, to ensure it is only triggered when we expect it to be.
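
As a possible starting point for the dataset-level test, here is a minimal sketch of the check written as a testthat-style expectation. It assumes testthat is available, and the name expect_no_pivot_duplicates and its id_cols argument are hypothetical. Instead of the pivot_wider/pivot_longer round trip, it uses count() over the identifying columns plus trait_name, which should flag the same rows without a hard-coded column index.

```r
library(testthat)
library(dplyr)

# Hypothetical expectation (illustrative name and signature only):
# fails if any combination of identifiers and trait_name occurs more than once.
expect_no_pivot_duplicates <- function(traits, id_cols) {
  duplicates <- traits %>%
    count(across(all_of(c(id_cols, "trait_name"))), name = "number_of_duplicates") %>%
    filter(number_of_duplicates > 1)
  expect_equal(nrow(duplicates), 0)
}

# Toy data (invented) with two distinct observations of the same trait
toy <- tibble(
  dataset_id = "Test_2024",
  observation_id = c("01", "02"),
  trait_name = "leaf_area",
  value = c("10", "12")
)

test_that("toy dataset can pivot wider", {
  expect_no_pivot_duplicates(toy, id_cols = c("dataset_id", "observation_id"))
})
```

Passing the identifier columns as an argument means the list only has to be updated in one place when method_context_id is added.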