number of populations outputted by 05_filter_vcf_fast.py doesn't match my popmap file
spencer411 opened this issue · 2 comments
I am attempting to use 05_filter_vcf_fast.py to filter a VCF file I created with Stacks. For the dataset I am working on, I designated only a single population, as I am trying to determine population structure with ADMIXTURE down the line. However, when I run 05_filter_vcf_fast.py it tells me:
-bash-4.2$ python3 05_filter_vcf_fast.py all_no_pops.snps.vcf 1 50 0 2 all_no_pops_filtered_m4_p70_x0_S2.vcf
293 samples in 122 populations
Where is this number 122 coming from, given that the popmap file I used gives all samples the same population designation? Note that if I open something like the structure file I produced with this same run, the population designation is correct for all samples (e.g. 1), so I am guessing it's not an issue with my popmap file used in the stacks portion of the workflow.
Hi. This script requires sample names to be of the following form: population_sample. It does not use the population map file.
You must have 122 different strings before the underscore (_).
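If it helps to check this before filtering, here is a minimal sketch that reads the sample names from the VCF header and counts the distinct prefixes. It is not the code from 05_filter_vcf_fast.py. It assumes the population is whatever comes before the first underscore, and that a name with no underscore counts as its own population. The function names and the example sample names are hypothetical.

```python
from collections import Counter


def sample_names_from_vcf_header(lines):
    """Return the sample names from the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # The first 9 columns are fixed (CHROM ... FORMAT); samples follow.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def count_populations(sample_names):
    """Count samples per prefix, taking the text before the first underscore.

    Assumption: this mirrors the population_sample naming convention;
    the real script may split names differently.
    """
    return Counter(name.split("_")[0] for name in sample_names)


if __name__ == "__main__":
    # Hypothetical header line, for illustration only.
    fixed = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
    header = "\t".join(fixed + ["A12_01", "A12_02", "B7_01", "sampleX"]) + "\n"

    names = sample_names_from_vcf_header([header])
    counts = count_populations(names)
    print(f"{len(names)} samples in {len(counts)} populations")
    for prefix, n in sorted(counts.items()):
        print(prefix, n)
```

To run it on a real file, pass an open file handle instead of the example list, e.g. `sample_names_from_vcf_header(open("all_no_pops.snps.vcf"))`. If every sample name started with the same prefix, this count would come out as 1.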
Thanks, that makes sense!