fgvieira/ngsLD

> Header found! Skipping line...


Hi @fgvieira,

I could not see what went wrong here. The only header in my data is at the first line of my beagle.gz file.
In which file would the supposed headers be?

Many thanks in advance

==> Input Arguments:
        geno: /scratch/leuven/356/vsc35633/angsd/file.beagle.gz
        probs: true
        log_scale: false
        n_ind: 100
        n_sites: 14657309
        pos: file.pos (WITHOUT header)
        max_kb_dist (kb): 100
        max_snp_dist: 0
        min_maf: 0.000000
        ignore_miss_data: false
        call_geno: false
        N_thresh: 0.000000
        call_thresh: 0.000000
        rnd_sample: 1.000000
        seed: 1717663257
        extend_out: false
        out: /scratch/leuven/356/vsc35633/angsd/ngsld.out
        n_threads: 36
        verbose: 1
        version: 1.2.1 (Jun  5 2024 @ 11:46:45)

==> GZIP input file (not BINARY)
> Reading data from file...
> Header found! Skipping line...
==> Calculating MAF for all sites...
==> Getting sites coordinates
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...

It kept printing that line for 10 hours until I stopped it.

 ~/angsd/angsd $ head file.pos 
chrom1_ptg000009l_19308 0
chrom1_ptg000009l_36281 1
chrom1_ptg000009l_36436 2
chrom1_ptg000009l_36448 2
chrom1_ptg000009l_36458 2
chrom1_ptg000009l_36467 3
chrom1_ptg000009l_36469 2
chrom1_ptg000009l_36532 2
chrom1_ptg000009l_36547 3
chrom1_ptg000009l_36577 2
~/angsd/angsd $ cat file.pos | wc -l
14657309
~/angsd/angsd $ zcat file.beagle.gz | head -2
marker  allele1 allele2 Ind0    Ind0    Ind0    Ind1    Ind1    Ind1    Ind2    Ind2    Ind2    Ind3    Ind3    Ind3    Ind4    Ind4    Ind4    Ind5      Ind5    Ind5    Ind6    Ind6    Ind6    Ind7    Ind7    Ind7    Ind8    Ind8    Ind8    Ind9    Ind9    Ind9    Ind10   Ind10   Ind10   Ind11     Ind11   Ind11   Ind12   Ind12   Ind12   Ind13   Ind13   Ind13   Ind14   Ind14   Ind14   Ind15   Ind15   Ind15   Ind16   Ind16   Ind16   Ind17     Ind17   Ind17   Ind18   Ind18   Ind18   Ind19   Ind19   Ind19   Ind20   Ind20   Ind20   Ind21   Ind21   Ind21   Ind22   Ind22   Ind22   Ind23     Ind23   Ind23   Ind24   Ind24   Ind24   Ind25   Ind25   Ind25   Ind26   Ind26   Ind26   Ind27   Ind27   Ind27   Ind28   Ind28   Ind28   Ind29     Ind29   Ind29   Ind30   Ind30   Ind30   Ind31   Ind31   Ind31   Ind32   Ind32   Ind32   Ind33   Ind33   Ind33   Ind34   Ind34   Ind34   Ind35     Ind35   Ind35   Ind36   Ind36   Ind36   Ind37   Ind37   Ind37   Ind38   Ind38   Ind38   Ind39   Ind39   Ind39   Ind40   Ind40   Ind40   Ind41     Ind41   Ind41   Ind42   Ind42   Ind42   Ind43   Ind43   Ind43   Ind44   Ind44   Ind44   Ind45   Ind45   Ind45   Ind46   Ind46   Ind46   Ind47     Ind47   Ind47   Ind48   Ind48   Ind48   Ind49   Ind49   Ind49   Ind50   Ind50   Ind50   Ind51   Ind51   Ind51   Ind52   Ind52   Ind52   Ind53     Ind53   Ind53   Ind54   Ind54   Ind54   Ind55   Ind55   Ind55   Ind56   Ind56   Ind56   Ind57   Ind57   Ind57   Ind58   Ind58   Ind58   Ind59     Ind59   Ind59   Ind60   Ind60   Ind60   Ind61   Ind61   Ind61   Ind62   Ind62   Ind62   Ind63   Ind63   Ind63   Ind64   Ind64   Ind64   Ind65     Ind65   Ind65   Ind66   Ind66   Ind66   Ind67   Ind67   Ind67   Ind68   Ind68   Ind68   Ind69   Ind69   Ind69   Ind70   Ind70   Ind70   Ind71     Ind71   Ind71   Ind72   Ind72   Ind72   Ind73   Ind73   Ind73   Ind74   Ind74   Ind74   Ind75   Ind75   Ind75   Ind76   Ind76   Ind76   Ind77     Ind77   Ind77   Ind78   Ind78   Ind78   Ind79   Ind79   Ind79   Ind80   Ind80   Ind80   Ind81 
  Ind81   Ind81   Ind82   Ind82   Ind82   Ind83     Ind83   Ind83   Ind84   Ind84   Ind84   Ind85   Ind85   Ind85   Ind86   Ind86   Ind86   Ind87   Ind87   Ind87   Ind88   Ind88   Ind88   Ind89     Ind89   Ind89   Ind90   Ind90   Ind90   Ind91   Ind91   Ind91   Ind92   Ind92   Ind92   Ind93   Ind93   Ind93   Ind94   Ind94   Ind94   Ind95     Ind95   Ind95   Ind96   Ind96   Ind96   Ind97   Ind97   Ind97   Ind98   Ind98   Ind98   Ind99   Ind99   Ind99
chrom1_ptg000009l_19308 0       2       0.666492        0.333243        0.000265        0.000000        0.058822        0.941178        0.000000 0.058822 0.941178        0.000000        0.000488        0.999512        0.000000        0.058822        0.941178        0.000000        0.015384 0.984616 0.000000        0.199997        0.800003        0.000000        0.111109        0.888891        0.001593        0.998405        0.000001 0.000000 0.000002        0.999998        0.000000        0.975110        0.024890        0.000000        0.195728        0.804272        0.000000 0.030302 0.969698        0.666580        0.333287        0.000133        0.000000        0.987397        0.012603        0.000000        0.015384 0.984616 0.333333        0.333333        0.333333        0.000000        0.001949        0.998051        0.000000        0.199997        0.800003 0.000133 0.333287        0.666580        0.000000        0.111109        0.888891        0.000000        0.000001        0.999999        0.000000 0.999357 0.000643        0.000000        0.000976        0.999024        0.000000        0.000488        0.999512        0.000133        0.333287 0.666580 0.000000        0.999994        0.000006        0.000000        0.030302        0.969698        0.000000        0.199997        0.800003 0.333333 0.333333        0.333333        0.000000        0.015384        0.984616        0.000000        0.058822        0.941178        0.000000 0.000000 1.000000        0.000000        0.000002        0.999998        0.000000        0.997476        0.002524        0.000000        0.015384 0.984616 0.000000        0.996762        0.003238        0.000034        0.999966        0.000000        0.000000        0.000488        0.999512 0.000000 0.003891        0.996109        0.984616        0.015384        0.000000        0.000000        0.000001        0.999999        0.000001 0.998405 0.001593        0.000001        0.998405        0.001593        0.000000        0.007751      
  0.992249        0.000000        0.000000 1.000000 0.000000        0.000000        1.000000        0.000000        0.000000        1.000000        0.000000        0.987397        0.012603 0.000000 0.000976        0.999024        0.000000        0.000000        1.000000        0.333333        0.333333        0.333333        0.000000 0.018789 0.981211        0.002104        0.332630        0.665266        0.333333        0.333333        0.333333        0.000000        0.000000 1.000000 0.000000        0.000015        0.999985        0.000000        0.003891        0.996109        0.000001        0.199997        0.800002 0.000000 0.000031        0.999969        0.000000        0.550532        0.449468        0.000133        0.333287        0.666580        0.333333 0.333333 0.333333        0.000000        1.000000        0.000000        0.333333        0.333333        0.333333        0.000000        1.000000 0.000000 0.666224        0.333109        0.000666        0.000000        0.007751        0.992249        0.000000        0.000976        0.999024 0.333333 0.333333        0.333333        0.000133        0.333287        0.666580        0.000000        0.199997        0.800003        0.000133 0.333287 0.666580        0.333333        0.333333        0.333333        0.000000        0.030302        0.969698        0.000000        0.199997 0.800002 0.000000        0.987465        0.012535        0.000000        0.007751        0.992249        0.000000        0.000122        0.999878 0.333333 0.333333        0.333333        0.333333        0.333333        0.333333        0.000000        0.001949        0.998051        0.000000 0.003891 0.996109        0.000000        0.000004        0.999996        0.000133        0.333287        0.666580        0.000000        0.000488 0.999512 0.000000        0.057580        0.942420        0.000000        0.999872        0.000128        0.000000        0.071134        0.928866 0.000000 0.001196        0.998804        0.333333        
0.333333        0.333333        0.333333        0.333333        0.333333        0.333333 0.333333 0.333333        0.000000        0.939628        0.060372        0.000000        0.999916        0.000084        0.000000        0.886143 0.113857 0.012462        0.986751        0.000787        0.000000        0.984275        0.015725        0.000000        0.111109        0.888891 0.000000 0.993658        0.006342
$ zcat file.beagle.gz | wc -l
14657310

What is the delimiter/separator in your pos file?

Thanks for the speedy reply. The pos file is tab-separated and contains no spaces.

The beagle file, however, used a variable number of spaces as the separator. I changed this to single tabs, but the same error remains:

$ zcat file2.beagle.gz | head | grep -P '\t'
marker  allele1 allele2 Ind0    Ind0    Ind0    Ind1    Ind1    Ind1    Ind2    Ind2    Ind2    Ind3    Ind3    Ind3    Ind4    Ind4    I
nd4     Ind5    Ind5    Ind5    Ind6    Ind6    Ind6    Ind7    Ind7    Ind7    Ind8    Ind8    Ind8    Ind9    Ind9    Ind9    Ind10   I
nd10    Ind10   Ind11   Ind11   Ind11   Ind12   Ind12   Ind12   Ind13   Ind13   Ind13   Ind14   Ind14   Ind14   Ind15   Ind15   Ind15   I
nd16    Ind16   Ind16   Ind17   Ind17   Ind17   Ind18   Ind18   Ind18   Ind19   Ind19   Ind19   Ind20   Ind20   Ind20   Ind21   Ind21   I
nd21    Ind22   Ind22   Ind22   Ind23   Ind23   Ind23   Ind24   Ind24   Ind24   Ind25   Ind25   Ind25   Ind26   Ind26   Ind26   Ind27   I
nd27    Ind27   Ind28   Ind28   Ind28   Ind29   Ind29   Ind29   Ind30   Ind30   Ind30   Ind31   Ind31   Ind31   Ind32   Ind32   Ind32   I
nd33    Ind33   Ind33   Ind34   Ind34   Ind34   Ind35   Ind35   Ind35   Ind36   Ind36   Ind36   Ind37   Ind37   Ind37   Ind38   Ind38   I
nd38    Ind39   Ind39   Ind39   Ind40   Ind40   Ind40   Ind41   Ind41   Ind41   Ind42   Ind42   Ind42   Ind43   Ind43   Ind43   Ind44   I
nd44    Ind44   Ind45   Ind45   Ind45   Ind46   Ind46   Ind46   Ind47   Ind47   Ind47   Ind48   Ind48   Ind48   Ind49   Ind49   Ind49   I
nd50    Ind50   Ind50   Ind51   Ind51   Ind51   Ind52   Ind52   Ind52   Ind53   Ind53   Ind53   Ind54   Ind54   Ind54   Ind55   Ind55   I
nd55    Ind56   Ind56   Ind56   Ind57   Ind57   Ind57   Ind58   Ind58   Ind58   Ind59   Ind59   Ind59   Ind60   Ind60   Ind60   Ind61   I
nd61    Ind61   Ind62   Ind62   Ind62   Ind63   Ind63   Ind63   Ind64   Ind64   Ind64   Ind65   Ind65   Ind65   Ind66   Ind66   Ind66   I
nd67    Ind67   Ind67   Ind68   Ind68   Ind68   Ind69   Ind69   Ind69   Ind70   Ind70   Ind70   Ind71   Ind71   Ind71   Ind72   Ind72   I
nd72    Ind73   Ind73   Ind73   Ind74   Ind74   Ind74   Ind75   Ind75   Ind75   Ind76   Ind76   Ind76   Ind77   Ind77   Ind77   Ind78   I
nd78    Ind78   Ind79   Ind79   Ind79   Ind80   Ind80   Ind80   Ind81   Ind81   Ind81   Ind82   Ind82   Ind82   Ind83   Ind83   Ind83   I
nd84    Ind84   Ind84   Ind85   Ind85   Ind85   Ind86   Ind86   Ind86   Ind87   Ind87   Ind87   Ind88   Ind88   Ind88   Ind89   Ind89   I
nd89    Ind90   Ind90   Ind90   Ind91   Ind91   Ind91   Ind92   Ind92   Ind92   Ind93   Ind93   Ind93   Ind94   Ind94   Ind94   Ind95   I
nd95    Ind95   Ind96   Ind96   Ind96   Ind97   Ind97   Ind97   Ind98   Ind98   Ind98   Ind99   Ind99   Ind99
chrom1_ptg000009l_19308 0       2       0.666492        0.333243        0.000265        0.000000        0.058822        0.941178        0
.000000 0.058822        0.941178        0.000000        0.000488        0.999512        0.000000        0.058822        0.941178        0
.000000 0.015384        0.984616        0.000000        0.199997        0.800003        0.000000        0.111109        0.888891        0
.001593 0.998405        0.000001        0.000000        0.000002        0.999998        0.000000        0.975110        0.024890        0
.000000 0.195728        0.804272        0.000000        0.030302        0.969698        0.666580        0.333287        0.000133        0
.000000 0.987397        0.012603        0.000000        0.015384        0.984616        0.333333        0.333333        0.333333        0
.000000 0.001949        0.998051        0.000000        0.199997        0.800003        0.000133        0.333287        0.666580        0
.000000 0.111109        0.888891        0.000000        0.000001        0.999999        0.000000        0.999357        0.000643        0
.000000 0.000976        0.999024        0.000000        0.000488        0.999512        0.000133        0.333287        0.666580        0
.000000 0.999994        0.000006        0.000000        0.030302        0.969698        0.000000        0.199997        0.800003        0
.333333 0.333333        0.333333        0.000000        0.015384        0.984616        0.000000        0.058822        0.941178        0
.000000 0.000000        1.000000        0.000000        0.000002        0.999998        0.000000        0.997476        0.002524        0
.000000 0.015384        0.984616        0.000000        0.996762        0.003238        0.000034        0.999966        0.000000        0
.000000 0.000488        0.999512        0.000000        0.003891        0.996109        0.984616        0.015384        0.000000        0
.000000 0.000001        0.999999        0.000001        0.998405        0.001593        0.000001        0.998405        0.001593        0
.000000 0.007751        0.992249        0.000000        0.000000        1.000000        0.000000        0.000000        1.000000        0
.000000 0.000000        1.000000        0.000000        0.987397        0.012603        0.000000        0.000976        0.999024        0
.000000 0.000000        1.000000        0.333333        0.333333        0.333333        0.000000        0.018789        0.981211        0
.002104 0.332630        0.665266        0.333333        0.333333        0.333333        0.000000        0.000000        1.000000        0
.000000 0.000015        0.999985        0.000000        0.003891        0.996109        0.000001        0.199997        0.800002        0
.000000 0.000031        0.999969        0.000000        0.550532        0.449468        0.000133        0.333287        0.666580        0
.333333 0.333333        0.333333        0.000000        1.000000        0.000000        0.333333        0.333333        0.333333        0
.000000 1.000000        0.000000        0.666224        0.333109        0.000666        0.000000        0.007751        0.992249        0
.000000 0.000976        0.999024        0.333333        0.333333        0.333333        0.000133        0.333287        0.666580        0
.000000 0.199997        0.800003        0.000133        0.333287        0.666580        0.333333        0.333333        0.333333        0
.000000 0.030302        0.969698        0.000000        0.199997        0.800002        0.000000        0.987465        0.012535        0
.000000 0.007751        0.992249        0.000000        0.000122        0.999878        0.333333        0.333333        0.333333        0
.333333 0.333333        0.333333        0.000000        0.001949        0.998051        0.000000        0.003891        0.996109        0
.000000 0.000004        0.999996        0.000133        0.333287        0.666580        0.000000        0.000488        0.999512        0
.000000 0.057580        0.942420        0.000000        0.999872        0.000128        0.000000        0.071134        0.928866        0
.000000 0.001196        0.998804        0.333333        0.333333        0.333333        0.333333        0.333333        0.333333        0
.333333 0.333333        0.333333        0.000000        0.939628        0.060372        0.000000        0.999916        0.000084        0
.000000 0.886143        0.113857        0.012462        0.986751        0.000787        0.000000        0.984275        0.015725        0
.000000 0.111109        0.888891        0.000000        0.993658        0.006342

logs:

==> Input Arguments:
        geno: /scratch/leuven/356/vsc35633/angsd/file2.beagle.gz
        probs: true
        log_scale: false
        n_ind: 100
        n_sites: 14657309
        pos: file.pos (WITHOUT header)
        max_kb_dist (kb): 100
        max_snp_dist: 0
        min_maf: 0.000000
        ignore_miss_data: false
        call_geno: false
        N_thresh: 0.000000
        call_thresh: 0.000000
        rnd_sample: 1.000000
        seed: 1717670231
        extend_out: false
        out: /scratch/leuven/356/vsc35633/angsd/ngsld.out
        n_threads: 36
        verbose: 1
        version: 1.2.1 (Jun  5 2024 @ 11:46:45)

==> GZIP input file (not BINARY)
> Reading data from file...
> Header found! Skipping line...
==> Calculating MAF for all sites...
==> Getting sites coordinates
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line...
> Header found! Skipping line..

Update: there is something wrong with the contents of the .pos file. I will try to fix it and post another update.

Solved.
For anyone who runs into the same problem:
The pos file can be made by extracting the first column from the beagle file (without the header) and then changing the '_' between the contig/chrom ID and the marker position to a tab.
Just extracting the first two columns of the beagle file won't work, because the second column is the allele, not the marker position.
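
For contig names that themselves contain underscores (such as chrom1_ptg000009l above), splitting on the first underscore would leave part of the name in the position column. The following is a minimal sketch that splits on the last underscore instead. It assumes every marker in the first beagle column ends in `_<digits>`; `make_pos` is a hypothetical helper name, not part of ngsLD.

```shell
# Hypothetical helper: print a tab-separated "chrom<TAB>position" line for
# every site in a gzipped beagle file (the header line is skipped).
# Assumption: each marker name ends in "_<digits>".
make_pos() {
    gzip -dc "$1" | awk '
        NR == 1 { next }                    # skip the beagle header line
        {
            p = match($1, /_[0-9]+$/)       # start of the final "_<digits>"
            if (p == 0) {
                print "cannot split marker: " $1 > "/dev/stderr"
                exit 1
            }
            # everything before the last "_" is the contig, the rest the position
            print substr($1, 1, p - 1) "\t" substr($1, p + 1)
        }'
}
```

Usage would be along the lines of `make_pos file.beagle.gz > file.pos`.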

I should have checked my files better, sorry for bothering.

Hi, I am having the same issue, even after going through the steps above to fix any delimiter problems. I have tried both input formats: genotypes (.geno.gz from angsd) and probabilities (.beagle.gz). I have also tried pos files with both tab and space delimiters. What is the correct delimiter, and can anyone share the code they used to create the pos file?

Thanks!

The delimiter has to be a tab (\t).
You can split the first column of the beagle file on underscores (easy if your contig/chr names do not contain any).

Hi, I've tried with a tab delimiter, using both the beagle and the maf files. Is there anything else that has caused this in the past?

Can you paste here the position file you are using and the messages you are getting?

Hi, my pos file looks like this:

scaffold1       442
scaffold1       458
scaffold1       816
scaffold1       817
scaffold1       821
scaffold1       823
scaffold1       827
scaffold1       834
scaffold1       842
scaffold1       847

My initial output looks like this:

==> Input Arguments:
        geno: wild_cluster.list-am_GATK_238_0.05_plink.beagle.gz
        probs: true
        log_scale: false
        n_ind: 26
        n_sites: 11179722
        pos: pos (WITHOUT header)
        max_kb_dist (kb): 100
        max_snp_dist: 0
        min_maf: 0.000000
        ignore_miss_data: false
        call_geno: false
        N_thresh: 0.000000
        call_thresh: 0.000000
        rnd_sample: 1.000000
        seed: 1720282292
        extend_out: false
        out: wild_fulleri_global.LD
        n_threads: 1
        verbose: 1
        version: 1.2.1 (Jun 21 2024 @ 10:41:09)

==> GZIP input file (not BINARY)
> Reading data from file...

And then I get the warning:

 Reading data from file...
> Header found! Skipping line...

Which continues for each line.

How many lines in the pos and beagle files?
How many columns in the beagle file?

The beagle has 81 columns, with 25 individuals.

The beagle has 11179723 rows and the pos file has 11179722

In your ngsLD run you used --n_ind 26.

Sorry, there are 26 individuals. The first is encoded as Ind0, so I undercounted.
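
The row counts above (one more line in the beagle file than in the pos file, because of the header) can be checked together with the contents of the pos file. The sketch below assumes the markers are named `<chrom>_<position>`, so that rejoining the two pos columns with an underscore should reproduce the first beagle column. `check_pos` is a hypothetical helper, not an ngsLD feature.

```shell
# Hypothetical helper: compare a gzipped beagle file with a pos file.
# Assumption: beagle markers are named "<chrom>_<position>".
check_pos() {
    beagle=$1
    pos=$2
    # sites in the beagle file = line count minus the header line
    n_sites=$(( $(gzip -dc "$beagle" | wc -l) - 1 ))
    n_pos=$(( $(wc -l < "$pos") ))
    # pair each marker with its pos line and count the lines where
    # "<col1>_<col2>" of the pos file differs from the marker name
    mismatches=$(gzip -dc "$beagle" | awk 'NR > 1 { print $1 }' | paste - "$pos" |
        awk -F '\t' '$1 != ($2 "_" $3)' | wc -l)
    mismatches=$(( mismatches ))
    echo "beagle_sites=$n_sites pos_lines=$n_pos mismatches=$mismatches"
    [ "$n_sites" -eq "$n_pos" ] && [ "$mismatches" -eq 0 ]
}
```

A pos file that is space-separated, or whose second column holds the allele instead of the position, shows up as mismatches under this check.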

Can you send me a small example (of both beagle and pos) so I can reproduce the error?

test.beagle.gz
test.pos.txt

Thanks for all the help so far!

Your files have a different number of positions (1000 vs 3000) but, if you re-generate the pos file:

zcat test.beagle.gz | cut -f 1 | tail -n +2 | sed 's/_/\t/' > test.pos.txt

and run:

ngsLD --n_threads 10 --geno test.beagle.gz --probs --n_ind 26 --n_sites 2999 --pos test.pos.txt > /dev/null

it runs fine. Can you send me the command you are using?
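
Since --n_ind and --n_sites have to agree with the beagle file, one option is to derive both from the file rather than typing them by hand. The sketch below assumes the beagle layout seen in this thread (three leading columns, then three probability columns per individual, plus one header line). `ngsld_cmd` is a hypothetical helper that only prints a command using the flags from the run above; it does not execute ngsLD.

```shell
# Hypothetical helper: print an ngsLD command with --n_ind and --n_sites
# derived from a gzipped beagle file.
# Assumption: 3 leading columns + 3 columns per individual, 1 header line.
ngsld_cmd() {
    beagle=$1
    pos=$2
    n_cols=$(gzip -dc "$beagle" | head -n 1 | awk '{ print NF }')
    if [ $(( (n_cols - 3) % 3 )) -ne 0 ]; then
        echo "unexpected column count: $n_cols" >&2
        return 1
    fi
    n_ind=$(( (n_cols - 3) / 3 ))                       # e.g. 81 columns -> 26
    n_sites=$(( $(gzip -dc "$beagle" | wc -l) - 1 ))    # lines minus header
    printf 'ngsLD --geno %s --probs --n_ind %s --n_sites %s --pos %s\n' \
        "$beagle" "$n_ind" "$n_sites" "$pos"
}
```

For the 81-column file discussed earlier this gives --n_ind 26, and a 3000-line beagle file gives --n_sites 2999, matching the command above.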