pha4ge/hAMRonization

Updating version support for staramr, amrfinderplus, kmerresistance - summarize output breaks

Opened this issue · 4 comments

Hi there,

Appreciate the time taken to build and maintain this tool!

Describe the bug
I ran hamronize summarize on hamronize reports from four tools (abricate, staramr, amrfinderplus, kmerresistance). Summarize appears to complete successfully but outputs a file with 100 column headers instead of the 36 I'd expect. Each tool's data lands in a different subset of columns, which suggests that hamronize has failed for three of the tools. For two of these (staramr and amrfinderplus) I'd like to request updates, as newer versions are available. For the third (kmerresistance) the version is supported, but the summarize output looks like something has gone wrong.
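For anyone wanting to confirm the column count on their own summary file, a minimal Python sketch follows. It assumes the summary is tab-delimited with the header on the first line; `count_columns` is a hypothetical helper name and not part of hAMRonization.

```python
import csv


def count_columns(path, delimiter="\t"):
    """Return the number of fields on the first line of a delimited file.

    Hypothetical helper for illustration; assumes the first line is the header.
    """
    with open(path, newline="") as handle:
        # An empty file yields no rows, so fall back to an empty list.
        first_row = next(csv.reader(handle, delimiter=delimiter), [])
    return len(first_row)
```

Calling `count_columns("summarize.txt")` on the attached summary should then report the number of column headers described above.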

Input
Using outputs from hamronize abricate, hamronize staramr, etc. for the following tools and versions:

  • abricate v1.0.1 (although v1.0.0 is supported in the README, I had no issues with this version)
  • staramr v0.10.0 (latest version supported in README: v0.8.0, current version: v0.10.0)
  • amrfinderplus v3.11.14 (latest version supported in README: v3.10.40, current version: v3.12.8)
  • kmerresistance v2.2.0 (although this version is supported in the README, this failed to summarize)

Input file
I've attached the four hamronize reports labeled by tool and the final summary report summarize.txt. The header line is missing from all but the abricate report, suggesting there is something wrong with processing.

hamronize_abricate.txt
hamronize_amrfinder.txt
hamronize_kmerresistance.txt
hamronize_staramr.txt
summarize.txt

Error log
No errors are reported when the commands are run.

hAMRonization Version
v1.1.4

Expected behavior
I expect a concise summary report with 36 columns containing all tool data, not unique columns for each report.

Hi, thanks for flagging this, it seems like there is an issue with generating headers.

The intended behaviour is that when multiple reports are passed to hamronize at once, only the first report writes a header. This lets a user pass in, say, 10 reports and pipe them all to the same output file. Clearly something is going wrong here, though.
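As a rough illustration of that intended behaviour (not the actual hAMRonization code), the header-once pattern could be sketched in Python as below. The function name `write_reports` and the example field names are hypothetical.

```python
import csv
import sys


def write_reports(reports, fieldnames, out=sys.stdout):
    """Write several reports to one TSV stream with a single header.

    `reports` is a list of reports, each a list of dict rows.
    Illustrative sketch only; names here are assumptions.
    """
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    for index, rows in enumerate(reports):
        if index == 0:
            # Only the first report emits the header line.
            writer.writeheader()
        writer.writerows(rows)
```

With two reports of one row each, the output would contain one header line followed by two data lines, so the combined stream can be piped to a single file.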

Just so I fix the right thing, can you clarify exactly what command you are running for hamronize for one of the outputs missing a header?

For sure - those without headers were generated by running a command like the following on a folder of multiple reports:

hamronize amrfinderplus amr_amrfinder/*.tsv --output hamronize_amrfinder --input_file_name amrfinderplus --analysis_software_version v3.11.14 --reference_database_version 2023-11-15.1

I tried running hamronize on a single report from each tool listed above and the summarize output does look as expected (attached)
results_single.txt

Weird, I'm struggling to reproduce the behaviour with the AFP outputs I have to hand.

Could you clone the repository and enter it:

git clone https://github.com/pha4ge/hAMRonization; cd hAMRonization

Then run the following command:

hamronize amrfinderplus test/data/dummy/amrfinderplus/report.tsv test/data/raw_outputs/amrfinderplus/report_nucleotide.tsv test/data/raw_outputs/amrfinderplus/report_protein.tsv --output hamronize_test --input_file_name amrfinderplus --analysis_software_version v3.11.14 --reference_database_version 2023-11-15.1

Does hamronize_test have a header?
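If it helps, one quick way to check is to count how many lines match the first line of the file. This is a small Python sketch, assuming the first line of the output is the header; `count_header_lines` is a hypothetical helper, not a hamronize command.

```python
def count_header_lines(path):
    """Count how many lines in the file are identical to its first line.

    Hypothetical helper; assumes the first line is the header, so a result
    greater than 1 suggests the header was written more than once.
    """
    with open(path) as handle:
        lines = handle.read().splitlines()
    if not lines:
        return 0
    return sum(1 for line in lines if line == lines[0])
```

A result of 1 from `count_header_lines("hamronize_test")` would be consistent with a single header. If the file has no header at all, the first line is a data row and the count says nothing about headers, so it is worth looking at that first line as well.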

Also does the same behaviour happen if you use the json output format and summarize those, e.g.:

hamronize amrfinderplus amr_amrfinder/*.tsv --output hamronize_amrfinder.json --input_file_name amrfinderplus --analysis_software_version v3.11.14 --reference_database_version 2023-11-15.1 --format json

hamronize summarize hamronize_amrfinder.json

Re: hamronize_test - it looks like the header was repeated once for each input, output attached
hamronize_test.txt

Re: json - the json file appears correct as far as I can tell (attached), but the summarize command produces only a header, no data
hamronize_amrfinder.json

Also attached a few of my outputs from amrfinderplus if that's helpful
20231115_sample03_amrfinder.txt
20231128_sample02_amrfinder.txt