alek0991/iSAFE

EmptyDataError while load_vcf_as_df

L-of-IOS opened this issue · 2 comments

I was trying to run iSAFE on a SnpEff-annotated VCF and got the error below. Can you tell me what might have gone wrong?

Reference (REF) allele is considered as ancestral allele and alternative allele (ALT) is considered as derived allele. Strongly recommend to provide an ancestral allele file if it is available.
warnings.warn("Ancestral allele file (--AA) is not specified. Reference (REF) allele is considered as ancestral allele and alternative allele (ALT) is considered as derived allele. Strongly recommend to provide an ancestral allele file if it is available.")
Traceback (most recent call last):
File "/home/yinglu/.local/lib/python3.8/site-packages/isafe/bcftools.py", line 58, in load_vcf_as_df
df = pd.read_csv(StringIO(os.popen(cmd).read()), sep='\t', header=None)
File "/home/yinglu/.local/lib/python3.8/site-packages/pandas/io/parsers.py", line 610, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/yinglu/.local/lib/python3.8/site-packages/pandas/io/parsers.py", line 468, in _read
return parser.read(nrows)
File "/home/yinglu/.local/lib/python3.8/site-packages/pandas/io/parsers.py", line 1057, in read
index, columns, col_dict = self._engine.read(nrows)
File "/home/yinglu/.local/lib/python3.8/site-packages/pandas/io/parsers.py", line 2061, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 756, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 771, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 827, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 814, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas/_libs/parsers.pyx", line 1951, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 43 fields in line 9, saw 53

The number of fields varies as sample size changes.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/yinglu/.local/bin/isafe", line 11, in <module>
load_entry_point('isafe==1.1.1', 'console_scripts', 'isafe')()
File "/home/yinglu/.local/lib/python3.8/site-packages/isafe/isafe.py", line 172, in run
df, dfreq, dfI, Need_Random_Sample = get_snp_matrix(chrom, region_start, region_end, args.input, args.AA,
File "/home/yinglu/.local/lib/python3.8/site-packages/isafe/bcftools.py", line 169, in get_snp_matrix
df = get_combined_vcf(chrom, region_start, region_end, case_vcf, cont_vcf=cont_vcf, case_IDs=case_IDs, cont_IDs=cont_IDs)
File "/home/yinglu/.local/lib/python3.8/site-packages/isafe/bcftools.py", line 83, in get_combined_vcf
df, sample_IDs = load_vcf_as_df(case_vcf, chrom, region_start, region_end, samples=samples)
File "/home/yinglu/.local/lib/python3.8/site-packages/isafe/bcftools.py", line 59, in load_vcf_as_df
except pd.io.common.EmptyDataError:
AttributeError: module 'pandas.io.common' has no attribute 'EmptyDataError'

Could you please send me a VCF file that reproduces these errors? The first error should be related to the VCF format, and the second one is probably related to your Python environment. In case you need my email: ali_akbari@hms.harvard.edu
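For the second error, the traceback shows that `pandas.io.common` in this environment has no `EmptyDataError` attribute. The same exception is exposed as `pandas.errors.EmptyDataError`. The following is a minimal sketch of catching it from that location; `read_tsv_or_none` is a hypothetical helper written for illustration, not the iSAFE code or a patch to it.

```python
from io import StringIO

import pandas as pd
from pandas.errors import EmptyDataError


def read_tsv_or_none(text):
    """Parse tab-separated text; return None if there is nothing to parse.

    Hypothetical helper for illustration only (not part of iSAFE).
    """
    try:
        return pd.read_csv(StringIO(text), sep="\t", header=None)
    except EmptyDataError:
        # Raised by pandas when the input contains no columns to parse,
        # for example when the upstream command produced no output.
        return None


# Empty input is reported as None instead of raising.
print(read_tsv_or_none(""))

# Non-empty input is parsed into a DataFrame as usual.
print(read_tsv_or_none("1\t100\tA\tG\n").shape)
```

Note that this only addresses the `AttributeError`. The underlying `ParserError` about inconsistent field counts would still need to be resolved in the input data.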

These errors were caused by using an unphased VCF as input, or one whose missing genotypes had not been imputed. Running Beagle 5.2 on the data and converting the VCF to a .hap file avoided the error.
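To check whether a VCF has the unphased or missing genotypes mentioned above before running iSAFE, something like the following could be used. It is a rough sketch that assumes an uncompressed, tab-separated VCF with a `GT` entry in the FORMAT column; `summarize_genotypes` and the example records are made up for illustration.

```python
def summarize_genotypes(vcf_text):
    """Count phased, not-phased, and missing genotype calls in VCF text.

    Hypothetical helper for illustration only (not part of iSAFE).
    """
    counts = {"phased": 0, "not_phased": 0, "missing": 0}
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines (## meta lines and #CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A record with genotypes has 9 fixed columns plus the samples.
        if len(fields) < 10:
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_index = fmt.index("GT")
        for sample in fields[9:]:
            parts = sample.split(":")
            gt = parts[gt_index] if gt_index < len(parts) else "."
            if "." in gt:
                # At least one allele is missing, e.g. "./." or ".|.".
                counts["missing"] += 1
            elif "|" in gt:
                counts["phased"] += 1
            else:
                # Unphased calls such as "0/1", or calls without a separator.
                counts["not_phased"] += 1
    return counts


example = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t0/1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT:DP\t./.:0\t1|1:7",
])
print(summarize_genotypes(example))
```

If the `not_phased` or `missing` counts are non-zero, the input probably needs phasing and imputation first, as described in the comment above.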