algbio/themisto

Themisto doesn't work with any input data (Error: Cannot open temporary file tmp/kmc_00250.bin)

karel-brinda opened this issue · 2 comments

With every input FASTA file I'm getting the following error:

$ /Users/karel/github/themisto/build/bin/build_index --k 31 --input-file small.fasta --index-dir index --temp-dir tmp
0.0250 Mon Sep 14 15:05:53 2020 Themisto-v0.2.0-1-gd8e44f5
Input file = small.fasta
Input format = fasta
Index directory = index
Temporary directory = tmp
k = 31
Number of threads = 1
Memory megabytes = 1000
Automatic colors = false
Load BOSS = false
0.0260 Mon Sep 14 15:05:53 2020 Starting
0.0260 Mon Sep 14 15:05:53 2020 Making all characters upper case and replacing non-{A,C,G,T} characters with random characeters from {A,C,G,T}
0.0260 Mon Sep 14 15:05:53 2020 Replaced 0 characters
0.0270 Mon Sep 14 15:05:53 2020 Building BOSS
0.0270 Mon Sep 14 15:05:53 2020 Listing (k+2)-mers
Calling KMC with: kmc -fm -k33 -b -m1 -ci1 -cs1 -cx4294967295 -t1 tmp/seqs-p0cfNR6pGewk8dL4ndt8NusCT tmp/KMCkqL5Ep8uXIfD7yGaYdY16rptL tmp 
**
Error: Cannot open temporary file tmp/kmc_00250.bin

Hi, a couple of really simple debugging questions:

  • Does the directory "tmp" exist? Themisto cannot currently create the directory by itself, so the directory has to exist before running. You'll also need to manually create the "index" directory to write the output files in.
  • Is the directory "tmp" accessible (are the permissions correct so that the user who ran themisto can read and write in the directory)?
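
The two checks above can be sketched in the shell as follows. This is a minimal sketch, assuming the directory names "tmp" and "index" from the command in the report and that it is run from the same working directory; it is not part of themisto itself.

```shell
# Create the temporary and index directories if they are missing.
# The names "tmp" and "index" are taken from the command in the report.
mkdir -p tmp index

# Confirm the current user can read, write, and enter each directory.
for dir in tmp index; do
    if [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "$dir: ok"
    else
        echo "$dir: not accessible, check permissions with: ls -ld $dir"
    fi
done
```

If either directory is reported as not accessible, the output of `ls -ld` on it should show which permissions are missing.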

Also, what OS are you running themisto on? macOS, for example, has a relatively low default limit on the number of files that can be open concurrently, and this conflicts with themisto's use of many temporary files to build the index. To run themisto on macOS you'll need to change the limit by running the following command

ulimit -n 2048

before running themisto. Based on your error message (Error: Cannot open temporary file tmp/kmc_00250.bin), I'd guess this conflict is the problem, since the default limit is 256 and the index construction stops at 250 temporary files. I suppose some Linux distributions might have the same issue. In that case, you can check the limit by running "ulimit -n" and, if necessary, raise it by running the same command with the new limit, as above.
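
The check-then-raise step can be sketched as below. This is a rough sketch, assuming a POSIX-style shell such as bash or sh; the variable names `wanted` and `current` are only illustrative, and 2048 is the value suggested above rather than a requirement documented by themisto.

```shell
# Desired limit, taken from the suggestion above.
wanted=2048

# Current soft limit on open file descriptors for this shell.
current=$(ulimit -n)
echo "current limit: $current"

# Raise the soft limit only if it is a number below the desired value.
# A value of "unlimited" is left alone.
if [ "$current" != "unlimited" ] && [ "$current" -lt "$wanted" ]; then
    ulimit -n "$wanted" || echo "could not raise the limit; the hard limit may be lower (see: ulimit -Hn)"
fi

echo "limit now: $(ulimit -n)"
```

The new limit applies only to the current shell session and the processes started from it, so build_index has to be run from the same shell afterwards.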

Thanks! ulimit -n 2048 solved the issue.