> Header found! Skipping line...
Hi @fgvieira,
I could not see what went wrong here. The only header in my data is at the first line of my beagle.gz file.
In which file would the supposed headers be?
Many thanks in advance
==> Input Arguments:
geno: /scratch/leuven/356/vsc35633/angsd/file.beagle.gz
probs: true
log_scale: false
n_ind: 100
n_sites: 14657309
pos: file.pos (WITHOUT header)
max_kb_dist (kb): 100
max_snp_dist: 0
min_maf: 0.000000
ignore_miss_data: false
call_geno: false
N_thresh: 0.000000
call_thresh: 0.000000
rnd_sample: 1.000000
seed: 1717663257
extend_out: false
out: /scratch/leuven/356/vsc35633/angsd/ngsld.out
n_threads: 36
verbose: 1
version: 1.2.1 (Jun 5 2024 @ 11:46:45)
==> GZIP input file (not BINARY)
> Reading data from file...
> Header found! Skipping line...
==> Calculating MAF for all sites...
==> Getting sites coordinates
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
It kept printing that line for 10 hours until I stopped it.
~/angsd/angsd $ head file.pos
chrom1_ptg000009l_19308 0
chrom1_ptg000009l_36281 1
chrom1_ptg000009l_36436 2
chrom1_ptg000009l_36448 2
chrom1_ptg000009l_36458 2
chrom1_ptg000009l_36467 3
chrom1_ptg000009l_36469 2
chrom1_ptg000009l_36532 2
chrom1_ptg000009l_36547 3
chrom1_ptg000009l_36577 2
~/angsd/angsd $ cat file.pos | wc -l
14657309
~/angsd/angsd $ zcat file.beagle.gz | head -2
marker allele1 allele2 Ind0 Ind0 Ind0 Ind1 Ind1 Ind1 Ind2 Ind2 Ind2 Ind3 Ind3 Ind3 Ind4 Ind4 Ind4 Ind5 Ind5 Ind5 Ind6 Ind6 Ind6 Ind7 Ind7 Ind7 Ind8 Ind8 Ind8 Ind9 Ind9 Ind9 Ind10 Ind10 Ind10 Ind11 Ind11 Ind11 Ind12 Ind12 Ind12 Ind13 Ind13 Ind13 Ind14 Ind14 Ind14 Ind15 Ind15 Ind15 Ind16 Ind16 Ind16 Ind17 Ind17 Ind17 Ind18 Ind18 Ind18 Ind19 Ind19 Ind19 Ind20 Ind20 Ind20 Ind21 Ind21 Ind21 Ind22 Ind22 Ind22 Ind23 Ind23 Ind23 Ind24 Ind24 Ind24 Ind25 Ind25 Ind25 Ind26 Ind26 Ind26 Ind27 Ind27 Ind27 Ind28 Ind28 Ind28 Ind29 Ind29 Ind29 Ind30 Ind30 Ind30 Ind31 Ind31 Ind31 Ind32 Ind32 Ind32 Ind33 Ind33 Ind33 Ind34 Ind34 Ind34 Ind35 Ind35 Ind35 Ind36 Ind36 Ind36 Ind37 Ind37 Ind37 Ind38 Ind38 Ind38 Ind39 Ind39 Ind39 Ind40 Ind40 Ind40 Ind41 Ind41 Ind41 Ind42 Ind42 Ind42 Ind43 Ind43 Ind43 Ind44 Ind44 Ind44 Ind45 Ind45 Ind45 Ind46 Ind46 Ind46 Ind47 Ind47 Ind47 Ind48 Ind48 Ind48 Ind49 Ind49 Ind49 Ind50 Ind50 Ind50 Ind51 Ind51 Ind51 Ind52 Ind52 Ind52 Ind53 Ind53 Ind53 Ind54 Ind54 Ind54 Ind55 Ind55 Ind55 Ind56 Ind56 Ind56 Ind57 Ind57 Ind57 Ind58 Ind58 Ind58 Ind59 Ind59 Ind59 Ind60 Ind60 Ind60 Ind61 Ind61 Ind61 Ind62 Ind62 Ind62 Ind63 Ind63 Ind63 Ind64 Ind64 Ind64 Ind65 Ind65 Ind65 Ind66 Ind66 Ind66 Ind67 Ind67 Ind67 Ind68 Ind68 Ind68 Ind69 Ind69 Ind69 Ind70 Ind70 Ind70 Ind71 Ind71 Ind71 Ind72 Ind72 Ind72 Ind73 Ind73 Ind73 Ind74 Ind74 Ind74 Ind75 Ind75 Ind75 Ind76 Ind76 Ind76 Ind77 Ind77 Ind77 Ind78 Ind78 Ind78 Ind79 Ind79 Ind79 Ind80 Ind80 Ind80 Ind81 Ind81 Ind81 Ind82 Ind82 Ind82 Ind83 Ind83 Ind83 Ind84 Ind84 Ind84 Ind85 Ind85 Ind85 Ind86 Ind86 Ind86 Ind87 Ind87 Ind87 Ind88 Ind88 Ind88 Ind89 Ind89 Ind89 Ind90 Ind90 Ind90 Ind91 Ind91 Ind91 Ind92 Ind92 Ind92 Ind93 Ind93 Ind93 Ind94 Ind94 Ind94 Ind95 Ind95 Ind95 Ind96 Ind96 Ind96 Ind97 Ind97 Ind97 Ind98 Ind98 Ind98 Ind99 Ind99 Ind99
chrom1_ptg000009l_19308 0 2 0.666492 0.333243 0.000265 0.000000 0.058822 0.941178 0.000000 0.058822 0.941178 0.000000 0.000488 0.999512 0.000000 0.058822 0.941178 0.000000 0.015384 0.984616 0.000000 0.199997 0.800003 0.000000 0.111109 0.888891 0.001593 0.998405 0.000001 0.000000 0.000002 0.999998 0.000000 0.975110 0.024890 0.000000 0.195728 0.804272 0.000000 0.030302 0.969698 0.666580 0.333287 0.000133 0.000000 0.987397 0.012603 0.000000 0.015384 0.984616 0.333333 0.333333 0.333333 0.000000 0.001949 0.998051 0.000000 0.199997 0.800003 0.000133 0.333287 0.666580 0.000000 0.111109 0.888891 0.000000 0.000001 0.999999 0.000000 0.999357 0.000643 0.000000 0.000976 0.999024 0.000000 0.000488 0.999512 0.000133 0.333287 0.666580 0.000000 0.999994 0.000006 0.000000 0.030302 0.969698 0.000000 0.199997 0.800003 0.333333 0.333333 0.333333 0.000000 0.015384 0.984616 0.000000 0.058822 0.941178 0.000000 0.000000 1.000000 0.000000 0.000002 0.999998 0.000000 0.997476 0.002524 0.000000 0.015384 0.984616 0.000000 0.996762 0.003238 0.000034 0.999966 0.000000 0.000000 0.000488 0.999512 0.000000 0.003891 0.996109 0.984616 0.015384 0.000000 0.000000 0.000001 0.999999 0.000001 0.998405 0.001593 0.000001 0.998405 0.001593 0.000000 0.007751 0.992249 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.987397 0.012603 0.000000 0.000976 0.999024 0.000000 0.000000 1.000000 0.333333 0.333333 0.333333 0.000000 0.018789 0.981211 0.002104 0.332630 0.665266 0.333333 0.333333 0.333333 0.000000 0.000000 1.000000 0.000000 0.000015 0.999985 0.000000 0.003891 0.996109 0.000001 0.199997 0.800002 0.000000 0.000031 0.999969 0.000000 0.550532 0.449468 0.000133 0.333287 0.666580 0.333333 0.333333 0.333333 0.000000 1.000000 0.000000 0.333333 0.333333 0.333333 0.000000 1.000000 0.000000 0.666224 0.333109 0.000666 0.000000 0.007751 0.992249 0.000000 0.000976 0.999024 0.333333 0.333333 0.333333 0.000133 0.333287 0.666580 0.000000 0.199997 0.800003 0.000133 0.333287 0.666580 
0.333333 0.333333 0.333333 0.000000 0.030302 0.969698 0.000000 0.199997 0.800002 0.000000 0.987465 0.012535 0.000000 0.007751 0.992249 0.000000 0.000122 0.999878 0.333333 0.333333 0.333333 0.333333 0.333333 0.333333 0.000000 0.001949 0.998051 0.000000 0.003891 0.996109 0.000000 0.000004 0.999996 0.000133 0.333287 0.666580 0.000000 0.000488 0.999512 0.000000 0.057580 0.942420 0.000000 0.999872 0.000128 0.000000 0.071134 0.928866 0.000000 0.001196 0.998804 0.333333 0.333333 0.333333 0.333333 0.333333 0.333333 0.333333 0.333333 0.333333 0.000000 0.939628 0.060372 0.000000 0.999916 0.000084 0.000000 0.886143 0.113857 0.012462 0.986751 0.000787 0.000000 0.984275 0.015725 0.000000 0.111109 0.888891 0.000000 0.993658 0.006342
$ zcat file.beagle.gz | wc -l
14657310
What is the delimiter/separator in your pos file?
Thanks for the speedy reply. The pos file is tab-separated and contains no spaces.
The beagle file, however, used a variable number of spaces as separator. I changed this to single tabs, but the same error remains:
$ zcat file2.beagle.gz | head | grep -P '\t'
marker allele1 allele2 Ind0 Ind0 Ind0 Ind1 Ind1 Ind1 Ind2 Ind2 Ind2 Ind3 Ind3 Ind3 Ind4 Ind4 I
nd4 Ind5 Ind5 Ind5 Ind6 Ind6 Ind6 Ind7 Ind7 Ind7 Ind8 Ind8 Ind8 Ind9 Ind9 Ind9 Ind10 I
nd10 Ind10 Ind11 Ind11 Ind11 Ind12 Ind12 Ind12 Ind13 Ind13 Ind13 Ind14 Ind14 Ind14 Ind15 Ind15 Ind15 I
nd16 Ind16 Ind16 Ind17 Ind17 Ind17 Ind18 Ind18 Ind18 Ind19 Ind19 Ind19 Ind20 Ind20 Ind20 Ind21 Ind21 I
nd21 Ind22 Ind22 Ind22 Ind23 Ind23 Ind23 Ind24 Ind24 Ind24 Ind25 Ind25 Ind25 Ind26 Ind26 Ind26 Ind27 I
nd27 Ind27 Ind28 Ind28 Ind28 Ind29 Ind29 Ind29 Ind30 Ind30 Ind30 Ind31 Ind31 Ind31 Ind32 Ind32 Ind32 I
nd33 Ind33 Ind33 Ind34 Ind34 Ind34 Ind35 Ind35 Ind35 Ind36 Ind36 Ind36 Ind37 Ind37 Ind37 Ind38 Ind38 I
nd38 Ind39 Ind39 Ind39 Ind40 Ind40 Ind40 Ind41 Ind41 Ind41 Ind42 Ind42 Ind42 Ind43 Ind43 Ind43 Ind44 I
nd44 Ind44 Ind45 Ind45 Ind45 Ind46 Ind46 Ind46 Ind47 Ind47 Ind47 Ind48 Ind48 Ind48 Ind49 Ind49 Ind49 I
nd50 Ind50 Ind50 Ind51 Ind51 Ind51 Ind52 Ind52 Ind52 Ind53 Ind53 Ind53 Ind54 Ind54 Ind54 Ind55 Ind55 I
nd55 Ind56 Ind56 Ind56 Ind57 Ind57 Ind57 Ind58 Ind58 Ind58 Ind59 Ind59 Ind59 Ind60 Ind60 Ind60 Ind61 I
nd61 Ind61 Ind62 Ind62 Ind62 Ind63 Ind63 Ind63 Ind64 Ind64 Ind64 Ind65 Ind65 Ind65 Ind66 Ind66 Ind66 I
nd67 Ind67 Ind67 Ind68 Ind68 Ind68 Ind69 Ind69 Ind69 Ind70 Ind70 Ind70 Ind71 Ind71 Ind71 Ind72 Ind72 I
nd72 Ind73 Ind73 Ind73 Ind74 Ind74 Ind74 Ind75 Ind75 Ind75 Ind76 Ind76 Ind76 Ind77 Ind77 Ind77 Ind78 I
nd78 Ind78 Ind79 Ind79 Ind79 Ind80 Ind80 Ind80 Ind81 Ind81 Ind81 Ind82 Ind82 Ind82 Ind83 Ind83 Ind83 I
nd84 Ind84 Ind84 Ind85 Ind85 Ind85 Ind86 Ind86 Ind86 Ind87 Ind87 Ind87 Ind88 Ind88 Ind88 Ind89 Ind89 I
nd89 Ind90 Ind90 Ind90 Ind91 Ind91 Ind91 Ind92 Ind92 Ind92 Ind93 Ind93 Ind93 Ind94 Ind94 Ind94 Ind95 I
nd95 Ind95 Ind96 Ind96 Ind96 Ind97 Ind97 Ind97 Ind98 Ind98 Ind98 Ind99 Ind99 Ind99
chrom1_ptg000009l_19308 0 2 0.666492 0.333243 0.000265 0.000000 0.058822 0.941178 0
.000000 0.058822 0.941178 0.000000 0.000488 0.999512 0.000000 0.058822 0.941178 0
.000000 0.015384 0.984616 0.000000 0.199997 0.800003 0.000000 0.111109 0.888891 0
.001593 0.998405 0.000001 0.000000 0.000002 0.999998 0.000000 0.975110 0.024890 0
.000000 0.195728 0.804272 0.000000 0.030302 0.969698 0.666580 0.333287 0.000133 0
.000000 0.987397 0.012603 0.000000 0.015384 0.984616 0.333333 0.333333 0.333333 0
.000000 0.001949 0.998051 0.000000 0.199997 0.800003 0.000133 0.333287 0.666580 0
.000000 0.111109 0.888891 0.000000 0.000001 0.999999 0.000000 0.999357 0.000643 0
.000000 0.000976 0.999024 0.000000 0.000488 0.999512 0.000133 0.333287 0.666580 0
.000000 0.999994 0.000006 0.000000 0.030302 0.969698 0.000000 0.199997 0.800003 0
.333333 0.333333 0.333333 0.000000 0.015384 0.984616 0.000000 0.058822 0.941178 0
.000000 0.000000 1.000000 0.000000 0.000002 0.999998 0.000000 0.997476 0.002524 0
.000000 0.015384 0.984616 0.000000 0.996762 0.003238 0.000034 0.999966 0.000000 0
.000000 0.000488 0.999512 0.000000 0.003891 0.996109 0.984616 0.015384 0.000000 0
.000000 0.000001 0.999999 0.000001 0.998405 0.001593 0.000001 0.998405 0.001593 0
.000000 0.007751 0.992249 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0
.000000 0.000000 1.000000 0.000000 0.987397 0.012603 0.000000 0.000976 0.999024 0
.000000 0.000000 1.000000 0.333333 0.333333 0.333333 0.000000 0.018789 0.981211 0
.002104 0.332630 0.665266 0.333333 0.333333 0.333333 0.000000 0.000000 1.000000 0
.000000 0.000015 0.999985 0.000000 0.003891 0.996109 0.000001 0.199997 0.800002 0
.000000 0.000031 0.999969 0.000000 0.550532 0.449468 0.000133 0.333287 0.666580 0
.333333 0.333333 0.333333 0.000000 1.000000 0.000000 0.333333 0.333333 0.333333 0
.000000 1.000000 0.000000 0.666224 0.333109 0.000666 0.000000 0.007751 0.992249 0
.000000 0.000976 0.999024 0.333333 0.333333 0.333333 0.000133 0.333287 0.666580 0
.000000 0.199997 0.800003 0.000133 0.333287 0.666580 0.333333 0.333333 0.333333 0
.000000 0.030302 0.969698 0.000000 0.199997 0.800002 0.000000 0.987465 0.012535 0
.000000 0.007751 0.992249 0.000000 0.000122 0.999878 0.333333 0.333333 0.333333 0
.333333 0.333333 0.333333 0.000000 0.001949 0.998051 0.000000 0.003891 0.996109 0
.000000 0.000004 0.999996 0.000133 0.333287 0.666580 0.000000 0.000488 0.999512 0
.000000 0.057580 0.942420 0.000000 0.999872 0.000128 0.000000 0.071134 0.928866 0
.000000 0.001196 0.998804 0.333333 0.333333 0.333333 0.333333 0.333333 0.333333 0
.333333 0.333333 0.333333 0.000000 0.939628 0.060372 0.000000 0.999916 0.000084 0
.000000 0.886143 0.113857 0.012462 0.986751 0.000787 0.000000 0.984275 0.015725 0
.000000 0.111109 0.888891 0.000000 0.993658 0.006342
logs:
==> Input Arguments:
geno: /scratch/leuven/356/vsc35633/angsd/file2.beagle.gz
probs: true
log_scale: false
n_ind: 100
n_sites: 14657309
pos: file.pos (WITHOUT header)
max_kb_dist (kb): 100
max_snp_dist: 0
min_maf: 0.000000
ignore_miss_data: false
call_geno: false
N_thresh: 0.000000
call_thresh: 0.000000
rnd_sample: 1.000000
seed: 1717670231
extend_out: false
out: /scratch/leuven/356/vsc35633/angsd/ngsld.out
n_threads: 36
verbose: 1
version: 1.2.1 (Jun 5 2024 @ 11:46:45)
==> GZIP input file (not BINARY)
> Reading data from file...
> Header found! Skipping line...
==> Calculating MAF for all sites...
==> Getting sites coordinates
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line..
Update: there is something wrong with the contents of the .pos file. I will try to fix it and post another update.
Solved.
For whoever might run into the same problem:
The pos file can be made by extracting the first column from the beagle file (without the header) and then replacing the '_' between the contig/chrom ID and the marker position with a tab.
Just extracting the first two columns of the beagle file won't work, because the second column is the allele, not the marker position.
I should have checked my files better, sorry for bothering.
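For reference, the recipe above can be sketched as a small shell function. This is only a sketch under a few assumptions: `make_pos` is a hypothetical helper name, marker IDs are assumed to have the form `<contig>_<position>`, and the split is made at the last underscore so that contig names such as `chrom1_ptg000009l` stay intact.

```shell
# Hypothetical helper (not part of ngsLD): build a two-column, tab-separated
# pos file from a gzipped beagle file with one header line.
# Assumes marker IDs look like <contig>_<position>.
make_pos() {
    # $1: gzipped beagle file (tab- or space-delimited)
    gzip -dc "$1" |
        tail -n +2 |                        # drop the header line
        awk '{
            id = $1                         # first column: contig_position
            i = match(id, /_[0-9]+$/)       # last underscore plus trailing digits
            if (i == 0) {
                print "unexpected marker ID: " id > "/dev/stderr"
                exit 1
            }
            # contig, TAB, position
            printf "%s\t%s\n", substr(id, 1, i - 1), substr(id, i + 1)
        }'
}

# Usage (file names are placeholders):
# make_pos file.beagle.gz > file.pos
```

Splitting at the last underscore rather than the first is a design choice for contig names that themselves contain underscores; if the marker IDs follow another convention, the regular expression needs adjusting.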
Hi I am having this same issue after going through the steps to adjust any delimiter issues. I have tried both genotype formats (.geno.gz from angsd) and probabilities (.beagle.gz). I have tried pos files from both with a tab and space delimiter. What is the correct delimiter to use and can anyone share the code they used to create the pos file?
Thanks!
The delimiter has to be a tab (\t).
You can split the first column of the beagle file on underscores (easy if your contig/chr names do not have any).
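A quick way to confirm that a pos file really is tab-delimited is to look for the first line that does not have exactly two tab-separated fields with an integer position. The function below is a minimal sketch; `check_pos` is a hypothetical name, and the check only covers the delimiter and the shape of the second column. It cannot tell whether that column holds the marker position or something else, such as the allele code.

```shell
# Hypothetical helper (not part of ngsLD): report the first line of a pos file
# that is not exactly "<contig><TAB><integer position>".
check_pos() {
    # $1: pos file without header
    awk -F'\t' '
        NF != 2 || $2 !~ /^[0-9]+$/ {
            printf "line %d is malformed: %s\n", NR, $0
            bad = 1
            exit                    # jumps to the END rule
        }
        END { exit bad + 0 }        # exit status 1 if a bad line was seen
    ' "$1"
}

# Usage (file name is a placeholder):
# check_pos file.pos && echo "pos file looks tab-delimited"
```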
Hi, I've tried with a tab delimiter, and I've tried with both the beagle and the maf files. Is there anything else that has come up in the past?
Can you paste here the position file you are using and the messages you are getting?
Hi my pos file looks like this:
scaffold1 442
scaffold1 458
scaffold1 816
scaffold1 817
scaffold1 821
scaffold1 823
scaffold1 827
scaffold1 834
scaffold1 842
scaffold1 847
My initial output looks like this:
==> Input Arguments:
geno: wild_cluster.list-am_GATK_238_0.05_plink.beagle.gz
probs: true
log_scale: false
n_ind: 26
n_sites: 11179722
pos: pos (WITHOUT header)
max_kb_dist (kb): 100
max_snp_dist: 0
min_maf: 0.000000
ignore_miss_data: false
call_geno: false
N_thresh: 0.000000
call_thresh: 0.000000
rnd_sample: 1.000000
seed: 1720282292
extend_out: false
out: wild_fulleri_global.LD
n_threads: 1
verbose: 1
version: 1.2.1 (Jun 21 2024 @ 10:41:09)
==> GZIP input file (not BINARY)
> Reading data from file...
And then I get the warning:
Reading data from file...
> Header found! Skipping line...
Which continues for each line.
How many lines in the pos and beagle files?
How many columns in the beagle file?
The beagle has 81 columns, with 25 individuals.
The beagle has 11179723 rows and the pos file has 11179722.
In your ngsLD run you used --n_ind 26.
Sorry there are 26 individuals - the first is encoded as Ind0 so I underestimated.
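The counts discussed above follow from the beagle layout: three leading columns (marker, allele1, allele2) plus three columns per individual, and one header row. So 81 columns give (81 - 3) / 3 = 26 individuals, and 11179723 rows give 11179722 sites. A minimal sketch of that check, where `beagle_dims` is a hypothetical helper name and the beagle file is assumed to be gzipped with a single header line:

```shell
# Hypothetical helper (not part of ngsLD): print the --n_ind and --n_sites
# values implied by a gzipped beagle file and compare with the pos file.
beagle_dims() {
    # $1: gzipped beagle file, $2: pos file without header
    n_cols=$(gzip -dc "$1" | head -n 1 | awk '{ print NF }')
    n_lines=$(( $(gzip -dc "$1" | wc -l) ))
    n_pos=$(( $(wc -l < "$2") ))
    n_ind=$(( (n_cols - 3) / 3 ))   # 3 leading columns, then 3 per individual
    n_sites=$(( n_lines - 1 ))      # one header line
    echo "n_ind=$n_ind n_sites=$n_sites pos_lines=$n_pos"
    [ "$n_sites" -eq "$n_pos" ]     # non-zero exit status on a mismatch
}

# Usage (file names are placeholders):
# beagle_dims file.beagle.gz file.pos
```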
Can you send me a small example (of both beagle and pos) so I can reproduce the error?
Thanks for all the help so far!
Your files have a different number of positions (1000 vs 3000) but, if you re-generate the pos file:
zcat test.beagle.gz | cut -f 1 | tail -n +2 | sed 's/_/\t/' > test.pos.txt
and run:
ngsLD --n_threads 10 --geno test.beagle.gz --probs --n_ind 26 --n_sites 2999 --pos test.pos.txt > /dev/null
it runs fine. Can you send me the command you are using?