knausb/vcfR

Change genotype (change gap to NA)

Closed this issue · 2 comments

Hello, thank you for the convenient tool. I have a question.

In my VCF file, some alleles are '*', which means a gap.

For example,
REF ALT
A T,*

In the genotype fields, 0 = A, 1 = T, and 2 = * (gap).

I'd like to treat gap sites as missing sites.
So, I'd like to replace 2 with ".".

How could I do this?
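
To make the request concrete, here is a minimal Python sketch of the transformation being asked about. It is not vcfR code and not a recommended workflow; the function names `star_allele_index` and `mask_allele` are hypothetical, and it assumes each genotype is a bare GT string such as `0/2` or `2|1` with no other FORMAT subfields attached.

```python
import re


def star_allele_index(alt):
    """Return the GT index of '*' in a comma-separated ALT field, or None.

    GT index 0 is the REF allele, so ALT alleles are numbered from 1.
    """
    alleles = alt.split(",")
    return alleles.index("*") + 1 if "*" in alleles else None


def mask_allele(gt, allele_index, missing="."):
    """Replace one allele index in a GT string with the missing symbol.

    The string is split on the '/' and '|' separators, which are kept,
    so phasing is preserved and index 2 is not confused with index 12.
    """
    parts = re.split(r"([/|])", gt)
    target = str(allele_index)
    return "".join(missing if part == target else part for part in parts)


# Hypothetical example matching the record above: REF = A, ALT = T,*
idx = star_allele_index("T,*")
masked = [mask_allele(gt, idx) for gt in ["0/1", "0/2", "2|1"]]
```

With the ALT field `T,*`, the asterisk has index 2, so only the calls that carry allele 2 are changed. Because the index of `*` differs between records, it has to be looked up per record rather than hard-coded.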

Best regards,

Hi @snackens, I'm afraid I do not understand what you're trying to accomplish here. When I find myself in this situation I tend to start with the VCF specification (http://samtools.github.io/hts-specs/). We appear to be at v4.4 today, so that's what I'll cite.

I do not believe the VCF specification has a concept of a 'gap'. I searched the document and found no mention of 'gap'. In section 1.1 there is an example that includes a microsatellite, which I think we can interpret as a length polymorphism, insertion, deletion, gap, etc. The complete sequences for the reference and all alternate alleles are presented, which means there is no need for a 'symbol' to represent a gap. Because I feel that 'gap' doesn't exist in the specification, I am concerned that your question does not make sense.

In section 1.6.1 there is mention of using an asterisk as a 'symbolic allele', which I have no experience with. But it does not sound like a gap to me.

Could you please review this information and clarify what you're attempting to accomplish?
Thanks!
Brian

Thank you for your reply.
I have resolved this question. I found out that what I was trying to do was somewhat strange.
Thank you very much for the advice and for this excellent tool.

Best regards,