markrobinsonuzh/DAMEfinder

Cannot read the VCF file

Closed this issue · 5 comments

hcph commented

When I use this Command:
bam_files <- c("7Z-1.test.bam","7Z-2.test.bam")
vcf_files <- c("get.vcf","get.vcf")
sample_names <- c("7Z-1", "7Z-2")
reference_file <- ("/public/home/chaohe/db/CS/CS_genome/CSchr1A.fa")
snp.list <- extract_bams(bam_files, vcf_files, sample_names, reference_file, coverage = 2)
I got the error:
Reading reference file
Running sample 7Z-1
Reading VCF file
Extracting methylation per SNP
Error in character(nchar(a)) : vector size cannot be NA

I do not know what happened, could you please help me? Thanks a lot!

File detail
7Z-1.test.bam :
image
get.vcf:
image
CSchr1A.fa:
image

Hi, looks like your bam files don't have an MD tag. What aligner are you using?

Can you run these steps and let me know what output you get?

#choose the first SNP from your VCF file
loc <- GRanges(seqnames = "chr1A", 
               IRanges(start = 1926, 
                       width = 1))

#select parameters to read in
params <- ScanBamParam(tag = c("MD", "XM", "XR", "XG"), which = loc)

# read in reads overlapping snp
alns <- readGAlignments(bam_files[1], 
                            use.names = TRUE, 
                            param = params)

Please paste the output for alns, just like the first 3 rows is enough.

hcph commented

Hi, looks like your bam files don't have an MD tag. What aligner are you using?

Can you run these steps and let me know what output you get?

#choose the first SNP from your VCF file
loc <- GRanges(seqnames = "chr1A", 
               IRanges(start = 1926, 
                       width = 1))

#select parameters to read in
params <- ScanBamParam(tag = c("MD", "XM", "XR", "XG"), which = loc)

# read in reads overlapping snp
alns <- readGAlignments(bam_files[1], 
                            use.names = TRUE, 
                            param = params)

Please paste the output for alns, just like the first 3 rows is enough.

Thank you for such quickly reply. I use BS-seeker2 to analysis the WGBS datasets, which based on the aligner of Bowtie2. And the output of 'alns' was:
image
The first 1 rows were :
image
If I directly paste the result, it was hard to read, so I upload the picture, hope it will not cause you trouble. Thank you again.

Thanks, yes so indeed there are no MD tags. It's a bit strange that BS-seeker removes them. Maybe there's an option to keep them?

I've only tested the package with bam files from Bismark So you can try to align with that and try again.

hcph commented

Thanks, yes so indeed there are no MD tags. It's a bit strange that BS-seeker removes them. Maybe there's an option to keep them?

I've only tested the package with bam files from Bismark So you can try to align with that and try again.

So sad to hear that. If I rerun the alignment step, maybe It will take about two months. The genome is wheat (16 Gb), crying..... Anyway, thank you very much!

hcph commented