sqlite3.OperationalError: no such table: main.variants
huanyaogao opened this issue · 2 comments
I've been trying to run my first test dataset with this tool, but an error occurred and no 1000 Genomes data was downloaded. Here is the code I ran in Bash on a Linux system, where [$path] stands for my real path on the server.
#Establish conda environment and export path
conda create -y --prefix [$path]/miniconda3/envs/ldtools python=3.7
conda activate ldtools
conda install -y -c bioconda pysam=0.15.4
conda install -y -c conda-forge tabulate
conda install -y -c conda-forge plotly
conda install -y -c anaconda numpy
export PATH=[$path]/miniconda3/envs/ldtools/:[$path]/miniconda3/envs/ldtools/bin:[$path]/miniconda3/envs/ldtools/lib/python3.7/site-packages:$PATH
alias python='[$path]/miniconda3/envs/ldtools/bin/python3.7'
#My code to run the ld-tools:
INPUT_DIR=[$path]
G1000Genome=[$path]/Refdata/1000genome/
ldtools=[$path]/ldtools/ld-tools-master/
python $ldtools/ld_triangle.py -S $INPUT_DIR/ -D $G1000Genome -t $INPUT_DIR/ -m 1 -e all -z 0.8 -o table
#My small dataset under $INPUT_DIR/try_data.tsv looks like this:
rsID chr Range
rs2252865 chr1 1
rs880315 chr1 1
rs3748817 chr1 1
rs3007421 chr1 1
rs760816 chr1 1
rs2273291 chr1 1
rs17367504 chr1 1
rs709209 chr1 1
rs6685497 chr1 1
rs2843152 chr1 1
And the output looks like this:
samples.txt... OK
conversion.db... OK
samples... OK
urls.txt... OK
id... Traceback (most recent call last):
File "/research/labs/pharmacology/lwrwpharm/m176113/tools/ldtools/ld-tools-master//ld_triangle.py", line 390, in <module>
prep_single_proc = PrepSingleProc(args)
File "/research/labs/pharmacology/lwrwpharm/m176113/tools/ldtools/ld-tools-master//ld_triangle.py", line 31, in __init__
self.intgen_convdb_path = prep_intgen_data(self.intgen_dir_path)
File "/research/labs/pharmacology/lwrwpharm/m176113/tools/ldtools/ld-tools-master/backend/prep_intgen_data.py", line 183, in prep_intgen_data
cursor.execute('CREATE INDEX IF NOT EXISTS "id" ON variants (ID)')
sqlite3.OperationalError: no such table: main.variants
I could not figure out what the problem is. Could you please help me with it? Thank you!
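The traceback shows that the CREATE INDEX statement ran against a database that has no `variants` table. A quick way to confirm this is to list the tables the database actually contains. The sketch below is only a diagnostic, not part of ld-tools: it assumes the database in question is the conversion.db file reported in the output above, and the helper names `list_tables` and `has_table` as well as the example path are hypothetical.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    # Note: sqlite3.connect() creates an empty database file if none exists,
    # so an empty result can also mean the path is wrong.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def has_table(db_path, table):
    """Return True if the database at db_path contains the given table."""
    return table in list_tables(db_path)


# Example call (hypothetical path; substitute the real 1000 Genomes directory):
# print(has_table("/path/to/Refdata/1000genome/conversion.db", "variants"))
```

If `has_table` returns False for `variants`, the database was created but never populated, which is consistent with the 1000 Genomes data not having been downloaded.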
@huanyaogao I have known about this bug for a long time, but because the ld-tools project is not widely used, I never documented it. Fixing it would require rewriting a significant part of the code and possibly abandoning the automatic download and preparation of the 1000 Genomes data. Sadly, I don't have the opportunity to do that right now.
Thanks so much for your response. Could you then suggest an alternative solution? For example, I could download the reference directly from https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ if you would let me know exactly which files to download and how to process them (liftover, I assume?) so that I can run the code to calculate the LD. Thank you!