/bam2cram-check

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

Build Status codecov.io

bam2cram-check

This package is for checking that the file format convertion between a BAM and a CRAM leaves the data unaffected. The checks rely on the comparison between samtools stats' (CHK field), and on samtools flagstat. In addition to this both files are samtools quickcheck-ed. The stats are generated for both files. However, if there is a .stats or .flagstat file in the directory where each file is, then it will use those.

For running this you need:

python >= 3.5
samtools >=1.3

Usage:

python main.py -b <bam_file> -c <cram_file> -e <err_file> --log <log_file>

Or alternatively, there is also a shell script for checking a full directory of BAMs and CRAMs by submitting as a job to LSF for each pair of files converted:

./run_batch.sh <bam_dir> <cram_dir> <log_dir> <output_dir> <issues_dir>

where each BAM-CRAM conversion to be checked will have its own file in:

  • log_dir - for the logging all the commands ran and their results
  • output_dir - what is sent to stdout by the commands ran
  • issues_dir - what is sent to stderr by the commands ran

There is no need to create these dirs beforehands as the shell script creates them if they don't exist already.