gymrek-lab/TRTools

Numpy Error with dumpSTR

Closed this issue · 3 comments

Hi,
I'm trying to run dumpSTR on GangSTR output with this command:
dumpSTR --vcf MY.vcf --gangstr-max-call-DP 1000 --gangstr-filter-spanbound-only --gangstr-filter-badCI --gangstr-expansion-prob-het 0.8 --drop-filtered --out dumpSTR.out
I get the following error:
Traceback (most recent call last):
  File "/opt/miniconda3/bin/dumpSTR", line 10, in <module>
    sys.exit(run())
  File "/opt/miniconda3/lib/python3.7/site-packages/trtools/dumpSTR/dumpSTR.py", line 1245, in run
    retcode = main(args)
  File "/opt/miniconda3/lib/python3.7/site-packages/trtools/dumpSTR/dumpSTR.py", line 1183, in main
    record = ApplyCallFilters(record, call_filters, sample_info, invcf.samples)
  File "/opt/miniconda3/lib/python3.7/site-packages/trtools/dumpSTR/dumpSTR.py", line 599, in ApplyCallFilters
    record.vcfrecord.set_format('FILTER', np.char.encode(all_filter_text))
  File "cyvcf2/cyvcf2.pyx", line 1390, in cyvcf2.cyvcf2.Variant.set_format
Exception: ('format: currently only float and int numpy arrays are supported. got %s', dtype('S33'))
Any help would be great. Thanks, Ali
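
For readers following the traceback, here is a minimal sketch of why the reported dtype is a byte-string type. It assumes only NumPy; the filter labels are hypothetical placeholders, since the thread does not show the actual strings dumpSTR builds. np.char.encode turns a text array into a fixed-width bytes array, which is neither float nor int, and that is the kind of array the exception says set_format received.

```python
import numpy as np

# Hypothetical per-sample filter labels (placeholders, not dumpSTR's real text).
# The second label is 33 characters long, so the array is fixed-width at 33.
all_filter_text = np.array(["PASS", "X" * 33])

# Same call as in the traceback: encode each text element to bytes.
encoded = np.char.encode(all_filter_text)

# A bytes array has dtype kind "S" and is not a numeric dtype.
is_numeric = np.issubdtype(encoded.dtype, np.number)

print("dtype:", encoded.dtype)
print("numeric:", is_numeric)
```

Whether a given cyvcf2 build accepts such an array in set_format is not something this sketch checks; it only shows where the byte-string dtype comes from.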

nmmsv commented

Hello,
Thanks for reporting this issue. I'm CCing Jonathan, who has more experience with cyvcf2 and better insight into what may be going on.
Best,
Nima

Hi there,

What version of cyvcf2 are you using? Please run python -c 'import cyvcf2; print(cyvcf2.__version__)'

TRTools requires cyvcf2 >= 0.30.1; perhaps that's the problem?
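
If it helps, the version check can be sketched as a small script. This is only an illustration: the helper names parse_version and meets_minimum are made up for this example, the minimum is the 0.30.1 figure quoted above, and the parsing is deliberately simple (it keeps the leading digits of each dot-separated piece).

```python
# Minimum cyvcf2 version quoted in this thread.
MIN_CYVCF2 = (0, 30, 1)


def parse_version(text):
    """Convert a version string such as '0.30.1' into a tuple of ints.

    Keeps the leading digits of each dot-separated piece and stops at the
    first piece that does not start with a digit.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version_text, minimum=MIN_CYVCF2):
    """Return True if version_text is at least the given minimum."""
    return parse_version(version_text) >= minimum


# cyvcf2 may not be installed where this runs, so guard the import.
try:
    import cyvcf2
except ImportError:
    cyvcf2 = None

if cyvcf2 is None:
    print("cyvcf2 is not installed in this environment")
else:
    installed = cyvcf2.__version__
    status = "OK" if meets_minimum(installed) else "older than 0.30.1"
    print(f"cyvcf2 {installed}: {status}")
```

Run it with the same Python interpreter that dumpSTR uses (in the traceback above, the one under /opt/miniconda3), since a different environment may have a different cyvcf2 installed.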

Also, if it doesn't contain sensitive information, can you share your VCF file with us? I can't reproduce the issue when running your command on my machine with the example VCF example-files/trio_chr21_gangstr.sorted.vcf.gz from our GitHub repository.