odelaneau/shapeit5

File extension not recognized

Opened this issue · 8 comments

Hi,
I am trying to use shapeit5 for the first time. I tested the "test" script and it works.
I am getting this error for the output file:
ERROR: Filename extension of [/phased_chr/phased_cd4_aging_chr1.vcf.gz] not recognized

My script is the following:
```
SHAPEIT5=~/packages/shapeit5/phase_common_static

for i in {1..22}; do
  # Per-chromosome genetic map, input VCF and output path
  MAP=~/packages/shapeit5/shapeit5/resources/maps/b38/chr${i}.b38.gmap.gz
  INPUT=~/aging_project/scRNAseq/resources/cd4_allsamples_vcf_perchr/cd4_allsamples.chr${i}.tagged.vcf.gz
  OUT=/phased_chr/phased_cd4_aging_chr${i}.vcf.gz

  # Trailing backslashes keep all options on one logical command line
  "$SHAPEIT5" --input "$INPUT" \
    --map "$MAP" \
    --region "chr${i}" \
    --output "$OUT" \
    --thread 16
done
```

I tried modifying the script to pass the output path directly instead of using the variable $OUT, and I still get the same error.

Hi,
It is difficult to say exactly where the problem is here, but vcf.gz files are read and written by shapeit5. I'd suggest trying a single chromosome in an interactive shell; it's likely that the options you pass to the program are somewhat wrong.

CNuge commented

Hello,

I wanted to mention that I am encountering the same error when running SHAPEIT5 on a single chromosome at a time from an interactive shell, as suggested. Initially I thought the .gz extension was my mistake, so I repeated the test with the output specified as .vcf instead, but this did not solve the problem.

The program appears to run correctly (it does not fail on initiation) and only fails at the final stage, after the entire series of MCMC iterations and the finalization step have completed.

original command tested:
SHAPEIT5_phase_common --region chr21 -I resources/tagged_thousand_genomes/chr21.vcf.gz -O resources/phased_thousand_genomes/phased_chr21.vcf.gz

revised to remove the .gz extension:
SHAPEIT5_phase_common --region chr21 -I resources/tagged_thousand_genomes/chr21.vcf.gz -O resources/phased_thousand_genomes/phased_chr21.vcf

run using a conda install of shapeit5 with the following:

channels:
  - bioconda
  - conda-forge
dependencies:
  - shapeit5

Hi all,

I'm experiencing the same. The code is

cd /media/pato/KINGSTON/temp
docker run -v "$(pwd)":/media/pato/KINGSTON/temp -w /media/pato/KINGSTON/temp shapeit5_2023-05-05_d6ce1e2 phase_common_static \
  --input jose_merged_imputation_AC.vcf.gz \
  --map chr22.b37.gmap.gz \
  --region 22 \
  --output phased_chr22.vcf.gz \
  --log phased_chr22.log

At the end of the terminal output:
##############
(many rows)

Finalization:

  • HAP solving (0.09s)
  • HAP update (0.03s)
  • H2V transpose (0.01s)

ERROR: Filename extension of [phased_chr22.vcf.gz] not recognized

##############

I also tried .vcf and no extension, without success.

I would appreciate your help with this issue.

Thanks,
Patricio

Same issue. The only extension it would accept for me is .bcf

Same issue here, using shapeit5 installed from conda

Same issue, do we have a solution here?

The only solution I have found is to output to bcf and then, if you require a different format, do any necessary conversions afterwards.

There is a clue in version 5.1.1. If you set
--output-format vcf.gz
phase_common reports the error message
ERROR: Output format[vcf.gz] unsupported, use [graph, bcf or bh] instead
This suggests, indeed, that the bcf workaround suggested by @CNuge may be the supported approach to getting compressed vcf files.
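
For anyone landing here, the bcf workaround can be sketched roughly as below. This is only a sketch under assumptions: bcftools is installed and on the PATH, the binary name `phase_common_static` and its options are the ones used earlier in this thread, and the helper names `bcf_name_for` and `phase_then_convert` are made up for illustration and are not part of shapeit5.

```shell
#!/bin/sh
# Sketch of the bcf workaround: phase to .bcf, then convert with bcftools.
# Assumptions: bcftools is installed; bcf_name_for and phase_then_convert
# are hypothetical helper names, not part of shapeit5.

# Derive the intermediate .bcf path from the .vcf.gz path you actually want.
bcf_name_for() {
  printf '%s\n' "${1%.vcf.gz}.bcf"
}

# Usage: phase_then_convert INPUT MAP REGION TARGET.vcf.gz
phase_then_convert() {
  input=$1 map=$2 region=$3 target=$4
  bcf=$(bcf_name_for "$target")

  # Write BCF, the only extension reported as accepted in this thread.
  phase_common_static --input "$input" --map "$map" \
    --region "$region" --output "$bcf" --thread 16 || return 1

  # Convert the BCF to bgzip-compressed VCF, then build a tabix index.
  bcftools view -Oz -o "$target" "$bcf" || return 1
  bcftools index -t "$target"
}
```

With the files from the docker example above, a call might look like `phase_then_convert jose_merged_imputation_AC.vcf.gz chr22.b37.gmap.gz 22 phased_chr22.vcf.gz`, which would leave both `phased_chr22.bcf` and `phased_chr22.vcf.gz` in the working directory.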