uclahs-cds/package-PipeVal

Add function to get chromosome naming convention


Add a function that returns the chromosome naming convention of a single BED/VCF file. It only checks whether chromosome names are prefixed (chr1) or not (1). I think the convention is related to the assembly used.
This check is used in RecSNV, originally as vcf_checker and bed_checker, where the purpose seems to be to ensure that all the files use the same naming convention and that this convention matches a FASTA file. See the validate_inputs function.
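
A rough sketch of what the single-file check could look like. The function name `detect_chr_convention`, the return labels, and the `max_records` cap are all hypothetical and not existing PipeVal API. It assumes the chromosome is the first whitespace-delimited column (true for both BED and VCF) and that header lines start with `#`, `track`, or `browser`.

```python
import gzip
from pathlib import Path

# Hypothetical return labels; not part of PipeVal's actual API.
CHR_PREFIXED = "chr"
UNPREFIXED = "no_chr"
MIXED = "mixed"
UNKNOWN = "unknown"


def detect_chr_convention(path, max_records=1000):
    """Report whether chromosome names in a BED/VCF file use the 'chr' prefix.

    Only the first `max_records` data lines are inspected (an assumption made
    to keep the check cheap on large files).
    """
    path = Path(path)
    # Transparently support gzip-compressed inputs such as .vcf.gz.
    opener = gzip.open if path.suffix == ".gz" else open
    prefixed_flags = set()
    records = 0
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines, VCF headers, and BED track/browser lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom = line.split(None, 1)[0]
            prefixed_flags.add(chrom.startswith("chr"))
            records += 1
            if records >= max_records:
                break
    if not prefixed_flags:
        return UNKNOWN
    if len(prefixed_flags) == 2:
        return MIXED
    return CHR_PREFIXED if True in prefixed_flags else UNPREFIXED
```

Returning a plain string per file would let the caller (Nextflow or Python) decide how to compare files, rather than baking the comparison into the check itself.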

See code and discussion in https://github.com/uclahs-cds/pipeline-RecSNV/pull/17#discussion_r1082963532

I was thinking of a function that gets the naming convention for a single file. That way, the comparison between files can be done in Nextflow, reusing the same channel instead of creating a new one for a directory.
There could also be a function that compares the per-file outputs, to be called when PipeVal is used outside Nextflow; it would perform the same comparison across all the files in a specified directory.
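
For the comparison step, something like the following might work. It is only a sketch: `group_by_convention` and `conventions_match` are hypothetical names, and the input is assumed to be a mapping from file name to whatever convention label the per-file check returns.

```python
def group_by_convention(conventions):
    """Group file names by their reported naming convention.

    `conventions` maps file name -> convention label (hypothetical labels,
    e.g. "chr" or "no_chr").
    """
    groups = {}
    for name, label in conventions.items():
        groups.setdefault(label, []).append(name)
    return groups


def conventions_match(conventions):
    """Return True when every file reports the same naming convention."""
    return len(set(conventions.values())) <= 1


# Example with made-up file names.
results = {"sample1.vcf": "chr", "sample2.vcf": "chr", "regions.bed": "no_chr"}
if not conventions_match(results):
    print(group_by_convention(results))
```

Outside Nextflow, the mapping could be built by running the per-file check over every BED/VCF file in a directory; inside Nextflow, the same labels could be collected from the existing channel and compared there.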