globaldothealth/monkeypox-report

Code review

Closed this issue · 3 comments

Can't do it in the usual UI as there's no pull request to reference, so here's the points I observed.

  • assert all(url.endswith("csv") for url in build.get_archives_list("csv"))
    consider using monkeypatch to substitute requests.get here, or any problem accessing GitHub will make the test runner exit.
  • logging.error("Failed to get archives list, aborting")
    (and elsewhere this pattern appears) raise an exception and log/exit in if __name__ == '__main__' so that you can also test unhappy paths.
  • def test_counts():
    there are a lot of numbers the same in this comparison: how do I know that n_countries_suspected_only is 1 because it correctly counted those, not because it actually used the same logic for n_countries_discarded? My preference would be to introduce functions that calculate the summary statistics separately and test that they work correctly, which might help with:
  • countries_suspected_or_confirmed = set(
    there's a bit of unnecessary complexity making this function hard to read: sometimes sets are calculated as unions/intersections of other sets; sometimes not; sometimes the same set is calculated multiple times; many of them are assigned values in L166-175 but then others are inlined in the return value or even walrus-assigned in the return value and used again on a subsequent line. Maybe refactoring toward helper functions would make it easier to understand what's going on: this is probably the place where errors like including the wrong status in the calculation of a statistic will occur.
  • def mid_bucket_age(age_interval: str) -> float:
    I don't understand why the mid-bucket-age of "50" isn't 50. I'm happy for that to be correct behaviour, but if so could you write a docstring explaining it please? :)
  • 51 - 60, 61 - 70, 71 - 80, 81 -
    in G.h we have an upper bound of 120 on age, which is not quite the oldest a human ever gets but acts as a QC check against typos. Worth implementing here?
  • the negative (raise ValueError) case isn't tested.
  • return {
    nit: there's a function to calculate occurrences as percentages from data frames that can be extracted here.
  • there is a whole lot of setup code in the first 101 lines of test_build.py that seems to mostly generate dictionaries with random content that are used in a two-line test file. Could that be replaced with a constant dictionary, maybe loaded from json or yaml?
  • genome_data = genome_data[genome_data.date >= "2022-05"]
    this seems like a big inconsistency: if we don't know which cases are associated with the outbreak then we will consider all cases after May 2022, which includes endemic cases (though maybe not all of them; I don't know if the genomic data is only the West African clade).

Thanks @iamleeg! Great suggestion on refactoring counts(). I will add the age upper limit, not sure how to handle the genome data, restricting it to a clade maybe a good approximation.

@abhidg also consider stating which methodology was used for genomic data in the reports containing the figures, so people know whether they can do like-for-like comparisons.

Really insightful comments. I will add the upper bound age to the figure script as well.

It would be worth discussing how to handle the genome data on tomorrow's call.