juliema/aTRAM

sqlite3.OperationalError: disk I/O error


I've been trying to get atram_preprocessor.py to run on a Linux cluster account, but I keep getting this SQLite error: sqlite3.OperationalError: disk I/O error.

I see that this error message was addressed in #216, but I haven't had any luck with the solutions you listed there. Is there anything else I can try?

(aTRAM) [06:11:51@login1]:$ ./atram_preprocessor.py --blast-db=./DATABASE --end-1=doc/data/tutorial_end_1.fasta.gz --end-2=doc/data/tutorial_end_2.fasta.gz --gzip
2020-10-02 18:12:53 INFO : ################################################################################
2020-10-02 18:12:53 INFO : aTRAM version: v2.3.3
2020-10-02 18:12:53 INFO : Python version: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
2020-10-02 18:12:53 INFO : ./atram_preprocessor.py --blast-db=./DATABASE --end-1=doc/data/tutorial_end_1.fasta.gz --end-2=doc/data/tutorial_end_2.fasta.gz --gzip
Traceback (most recent call last):
  File "./atram_preprocessor.py", line 175, in <module>
    preprocess(ARGS)
  File "/scratch/devel/jprograms/aTRAM/lib/core_preprocessor.py", line 30, in preprocess
    with db.connect(args['blast_db'], clean=True) as cxn:
  File "/scratch/devel/programs/aTRAM/lib/db.py", line 33, in connect
    return db_setup(db_name)
  File "/scratch/devel/programs/aTRAM/lib/db.py", line 69, in db_setup
    cxn.execute("PRAGMA journal_mode = WAL")
sqlite3.OperationalError: disk I/O error
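
(For context: the failing statement is SQLite switching the database into write-ahead-log (WAL) mode. WAL relies on shared-memory side files (*-wal, *-shm) and is known to fail with exactly this "disk I/O error" on many network filesystems, e.g. NFS or Lustre, which are common on clusters. A minimal standalone check, independent of aTRAM, is sketched below; the throwaway file name is an assumption, and it should be run from the same directory you pass to --blast-db.)

```python
import sqlite3

# Standalone check (not aTRAM code): does this filesystem support
# SQLite's WAL journal mode? "wal_test.sqlite" is a throwaway file.
cxn = sqlite3.connect("wal_test.sqlite")
try:
    mode = cxn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    print("journal_mode:", mode)  # prints 'wal' when WAL is supported
except sqlite3.OperationalError as err:
    # The same "disk I/O error" as above usually means the filesystem
    # cannot create or memory-map the *-wal and *-shm side files.
    print("WAL unsupported here:", err)
finally:
    cxn.close()
```

If this prints 'wal', the filesystem is fine and the problem lies elsewhere; if it raises the same error, the database needs to live on a filesystem with full POSIX shared-memory support.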

Question: Do you have a copy of the log file?

Yes, but it doesn't go beyond the first four lines.

2020-10-02 19:26:50 INFO : ################################################################################
2020-10-02 19:26:50 INFO : aTRAM version: v2.3.3
2020-10-02 19:26:50 INFO : Python version: 3.7.5 (default, Apr 15 2020, 17:02:31) [GCC 6.3.0]
2020-10-02 19:26:50 INFO : ./atram_preprocessor.py --blast-db=./DATABASE --end-1=doc/data/tutorial_end_1.fasta.gz --end-2=doc/data/tutorial_end_2.fasta.gz --gzip -l TestLog

That was helpful, thanks.

It's probably an environment issue.

It looks like one process is holding the database journal file open while another process tries to open it. Given that you're in single-threaded code at this point (and the tutorial never runs more than one background process), there seems to be a process switch happening on the cluster while the DB is being opened.
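
If you want to verify that, WAL mode leaves *-wal and *-shm side files next to the database, and stale ones after a failed run are a sign that a process still holds (or never released) the journal. A quick look, sketched here assuming the DATABASE prefix from your command:

```python
from pathlib import Path

# Sketch: list SQLite WAL side files left next to the blast-db prefix.
# "DATABASE" matches the --blast-db argument used above.
for pattern in ("DATABASE*-wal", "DATABASE*-shm"):
    for side in Path(".").glob(pattern):
        print(side, side.stat().st_size, "bytes")
```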

I don't want to waste your time having you try a lot of things, but I have one idea that may work for you. I'm trying a patch for this.

Could you cd into the aTRAM directory and then do a git pull? I made a small patch that may help.
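
For reference, the shape of the change (a sketch only, not necessarily the actual commit) is to stop assuming WAL works in db_setup and fall back to a rollback journal when the pragma fails:

```python
import sqlite3

def db_setup(db_name):
    """Sketch of a defensive setup, not the real patch verbatim:
    prefer WAL, but fall back when the filesystem rejects it."""
    cxn = sqlite3.connect(db_name)
    try:
        cxn.execute("PRAGMA journal_mode = WAL")
    except sqlite3.OperationalError:
        # Rollback journals (TRUNCATE/DELETE) only need ordinary file
        # operations, so they work where WAL's shared-memory mapping
        # fails, at some cost in write concurrency.
        cxn.execute("PRAGMA journal_mode = TRUNCATE")
    return cxn
```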

Thanks! I did a git pull, but it said "Already up-to-date".

Mea culpa. I forgot to push. Please try again.

The update loaded, but unfortunately it didn't do the trick:

2020-10-02 21:07:48 INFO : ################################################################################
2020-10-02 21:07:48 INFO : aTRAM version: v2.3.3
2020-10-02 21:07:48 INFO : Python version: 3.7.5 (default, Apr 15 2020, 17:02:31) [GCC 6.3.0]
2020-10-02 21:07:48 INFO : ./atram_preprocessor.py --blast-db=./DATABASE --end-1=doc/data/tutorial_end_1.fasta.gz --end-2=doc/data/tutorial_end_2.fasta.gz --gzip -l TestLog

Let me ask people who use aTRAM on clusters. I'm sorry for the trouble you're having.

Thanks so much, I appreciate it.

After some discussion with my sysadmin, it looks like a disk issue after all. I have it working now. Sorry for the trouble!

Thanks for letting us know.