EESI/quikr

Too many open files

haluk opened this issue · 4 comments

Hi, I'm using rdp7_trainset_112011.fa and I have 538 FASTA files. I added their names to fasta_files.txt; some of its contents look like this:

  • PC.NDU0UmVhZHMuMTA0XzEuZmE.fa
  • PC.NDU0UmVhZHMuMTA0XzIuZmE.fa
  • PC.NDU0UmVhZHMuMTA0XzMuZmE.fa

I use the following multifasta_to_otu command:

multifasta_to_otu -f fasta_files.txt -s data/rdp7_sensing_trainset_112011_matrix.gz -o project.table -v

However, after 246 files were processed, I got this error:

processing PC.NDU0UmVhZHMuSFNXMDAwMDQ3XzEwMDI1Ml9CaW9wX0NvbF9SZWN0LmZh.fa
could not open "PC.NDU0UmVhZHMuSFNXMDAwMDQ3XzEwMDI1Ml9CaW9wX0NvbF9SZWN0LmZh.fa"
PC.NDU0UmVhZHMuSFNXMDAwMDQ3XzEwMDI1Ml9CaW9wX0NvbF9SZWN0LmZh.fa has 0 sequences
Error opening PC.NDU0UmVhZHMuSFNXMDAwMDQ3XzEwMDI1Ml9CaW9wX0NvbF9SZWN0LmZh.fa - Too many open files

Then I reduced my data set to 100 files, but this time I got a different error:

PC.NDU0UmVhZHMuMTcxX0VfVFZDX0gxNzY1LmZh.fa has 3658 sequences
there are 4096 values less than 4188
55/100 samples processed
processing
could not open ""
has 0 sequences
Error opening - No such file or directory

Also, if I don't run the command in verbose mode, it doesn't get that far.

I have a MacBook Pro (2.3 GHz Intel Core i7, 8 GB 1600 MHz DDR3).

Do you have any idea what the problem is about?

Thanks.

Hi,

I've figured out that this bug was simply us forgetting to close open file handles. I have pushed a fix. If you clone the latest quikr, this problem should be resolved. Please let me know whether it works.
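
For illustration, here is a minimal sketch of the general pattern behind a fix like this: open each file, use it, and close it before moving on to the next one. It is written in Python and is not quikr's actual code. The function names `count_sequences` and `count_all` and the demo file names are hypothetical.

```python
import os
import tempfile


def count_sequences(path):
    """Count FASTA records (lines starting with '>') in one file."""
    # The with-block closes the handle on exit, even if an error is raised,
    # so handles do not pile up across many input files.
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def count_all(paths):
    """Process files one at a time, so at most one handle is open at once."""
    return {path: count_sequences(path) for path in paths}


if __name__ == "__main__":
    # Hypothetical demo data: a few tiny FASTA files in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = os.path.join(tmp, f"sample_{i}.fa")
            with open(path, "w") as out:
                out.write(">seq1\nACGT\n>seq2\nGGCC\n")
            paths.append(path)
        print(count_all(paths))
```

Without the close (here, the `with` block), each processed file would keep a descriptor open until the process hits its per-process limit, which is when "Too many open files" appears.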

I cloned the repo and compiled it again, but it's still the same: it could only process 246 files. Are you sure that you pushed the fixed code?
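
For reference, the per-process limit on open files can be inspected with a short script like the sketch below (Python, Unix-only, using the standard `resource` module; `open_file_limits` is a hypothetical helper name). The shell equivalent is `ulimit -n`. macOS commonly sets the soft limit to 256 by default, which would be consistent with the run stopping at around 246 files, but that is an assumption and not something measured on this machine.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    # A value equal to resource.RLIM_INFINITY means "no limit".
    print(f"soft limit: {soft}, hard limit: {hard}")
```

Raising the soft limit would only postpone the failure if handles are still being leaked, so this is a diagnostic and not a fix.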

Whoops, I pushed it to another repo :-)

try it now!