oddt/rfscorevs_binary

system unresponsive /tmp/ folder

diegoenry opened this issue · 2 comments

When running rfscore_vs on multiple CPUs with a large compound library, the system becomes unresponsive due to heavy load.

I noticed it writes temporary files to /tmp, and a large library implies lots of writes: a ~227 MB "compiledtrees_XXXXX.so" file, one for each core.

This process slows down system performance over time, even when /tmp is mounted on an SSD. It is more pronounced when non-PRO SSDs are used, after their "fast cache" fills up.

Frequent "large" writes can degrade SSDs, as each NAND cell has a limited number of write cycles.

I think it would be beneficial to write temporary files to RAM (/dev/shm/) as a way to speed things up systematically or at least give this option to the user.

Hi @diegoenry! Thanks for your report. sklearn-compiledtrees is the module responsible for the writes to the /tmp folder; it uses Python's tempfile.NamedTemporaryFile to create the files. As you've noticed, this defaults to the /tmp directory. On many systems (clusters in particular) /tmp is mounted on a RAM disk, which I highly recommend. The solution you propose (/dev/shm) would work only on Linux, and we still support Windows and macOS.
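As a rough illustration of how the default location is chosen: Python's `tempfile` module reads the `TMPDIR`, `TEMP` and `TMP` environment variables before falling back to a platform default, and caches the result in `tempfile.tempdir`. The sketch below redirects it to a RAM-backed directory when one is available. The helper name `redirect_tempdir` is hypothetical, and it only has an effect on code that calls `NamedTemporaryFile` without an explicit `dir` argument, which is assumed (not verified) here for sklearn-compiledtrees.

```python
import os
import tempfile


def redirect_tempdir(preferred="/dev/shm"):
    """Point the tempfile module at `preferred` if it is a writable directory.

    Returns the directory that tempfile will use afterwards. If `preferred`
    is missing or not writable (e.g. on Windows or macOS), nothing changes.
    """
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        # TMPDIR is the first variable tempfile consults; child processes
        # inherit it, so worker processes see the same setting.
        os.environ["TMPDIR"] = preferred
        # Clear the cached value so gettempdir() re-reads the environment.
        tempfile.tempdir = None
    return tempfile.gettempdir()


if __name__ == "__main__":
    print("temporary files will go to:", redirect_tempdir())
```

This has to run before any temporary file is created. Exporting `TMPDIR` in the shell before launching the process should have the same effect, under the same assumption.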

That said, if you spawn enough threads you may run out of memory and observe the symptom you report: an unresponsive system. Could you make sure that you have enough memory?

It would also be worth lowering the number of threads to the physical core count instead of the logical thread count (with HT enabled you have 2x as many threads as cores). You will not lose any performance, and the memory usage will drop two-fold.
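One possible way to look up the physical core count from Python is sketched below. It relies on the third-party `psutil` package when it is installed and otherwise falls back to the logical count from `os.cpu_count()`, so the fallback may overestimate on hyper-threaded machines. The helper name `physical_core_count` is made up for this example; how the resulting number is passed to rfscore_vs is not shown here.

```python
import os


def physical_core_count():
    """Best-effort physical core count, falling back to the logical count."""
    try:
        import psutil  # optional third-party dependency

        # cpu_count(logical=False) can return None if it cannot be determined.
        n = psutil.cpu_count(logical=False)
        if n:
            return n
    except ImportError:
        pass
    # Fallback: logical CPUs (includes hyper-threads), at least 1.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("logical CPUs:  ", os.cpu_count())
    print("physical cores:", physical_core_count())
```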

I will make a note to myself to try to make sklearn-compiledtrees use just one file, although that could be tricky, as you would need to keep track of how many live copies of the model you still have across multiple processes.

Thanks for the answer.
I double-checked: the RAM (32 GB) is not saturated by all 12 threads. Reducing to 6 threads to use only physical cores improved system reliability. "iostat", however, showed that the system was topping out my SSD's available IOPS.

I must confess I didn't double-check this, but when running multithreaded, I have the impression that the tmp files were being written constantly.

Mounting /tmp to RAM helped tremendously, thank you for the tip.