dpwe/audfprint

Reduce memory usage

Closed this issue · 2 comments

Hello
Thank you so much for audfprint. Your implementation of audio recognition works much better and more accurately than the other Python implementations. I experimented with different settings and, to some extent, found a middle ground that gives excellent results, but I noticed a very serious problem that the other implementations do not have: very high memory usage, even with the standard settings. At first I thought the problem was in gzip when saving and opening the saved hashes, but it then appeared that the real problem was in pickle, so I replaced it with HDF5-style storage using the hickle library (which has an interface similar to pickle). That did not solve the problem either. Analyzing even a 1 MB file uses approximately 1.5 GB of memory, which is critical for me (I run Linux on a Celeron with 2.0 GB of RAM).

What are your thoughts? Would it be better to use MySQL, as in this project?
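
As a rough sanity check on figures like this, the footprint of an in-memory hash table can be estimated from its dimensions alone. The sketch below assumes the table is a dense array of `2**hashbits` buckets, each holding `depth` fixed-size entries. The helper names and the example sizes (20 bits, depth 100, 4-byte entries) are illustrative assumptions, not values confirmed in this thread.

```python
def table_bytes(hashbits, depth, itemsize=4):
    """Bytes needed for a dense table of 2**hashbits buckets x depth entries."""
    return (1 << hashbits) * depth * itemsize


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical configurations, chosen only to show how the size scales.
    for hashbits, depth in [(20, 100), (18, 100), (20, 25)]:
        size = table_bytes(hashbits, depth)
        print(f"hashbits={hashbits} depth={depth}: {to_mib(size):.0f} MiB")
```

Under these assumptions the table costs a fixed amount of memory that does not depend on the size of the audio file being analyzed, and any temporary copy made while deserializing it would come on top of that.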

dpwe commented

I'm sorry I haven't answered for so long.

Do you mean RAM or Disk?

I meant RAM. On disk, the gzip-compressed files themselves are, in principle, 1024 times smaller. I experimented with different ways of saving the audio hash table to a file, without much success. The more audio hashes there are, the more memory audfprint uses during recognition. After looking at the code, I came to the conclusion that the mechanism for reading and saving the audio hash table needs to be changed. Unfortunately I'm busy at the moment, but I'm trying to find time to implement what I have in mind.
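
One way to change how the table is read and saved, sketched here purely as an illustration, is to keep it in a memory-mapped file so that the operating system pages in only the buckets that are touched instead of loading the whole table. This uses `numpy.memmap`. The function names, the small table size, and the convention that 0 marks an empty slot are all assumptions made for this example and are not taken from audfprint.

```python
import os
import tempfile

import numpy as np

# Hypothetical sizes, kept small for illustration.
HASHBITS = 16
DEPTH = 8


def open_table(path, hashbits=HASHBITS, depth=DEPTH):
    """Open (or create) a disk-backed table of uint32 entries."""
    shape = (1 << hashbits, depth)
    # "w+" creates a zero-filled file; "r+" reopens an existing one read/write.
    mode = "r+" if os.path.exists(path) else "w+"
    return np.memmap(path, dtype=np.uint32, mode=mode, shape=shape)


def add_entry(table, hash_value, entry):
    """Store a nonzero entry in the first free slot of its bucket."""
    bucket = table[hash_value]          # a view into the mapped file
    free = np.flatnonzero(bucket == 0)  # 0 marks an empty slot
    if free.size == 0:
        return False                    # bucket full: this sketch drops it
    bucket[free[0]] = entry
    return True


def lookup(table, hash_value):
    """Return the stored entries for one hash as a list of ints."""
    bucket = table[hash_value]
    return bucket[bucket != 0].tolist()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "table.bin")
    table = open_table(path)
    add_entry(table, 5, 42)
    add_entry(table, 5, 43)
    table.flush()                       # write dirty pages back to disk
    print(lookup(table, 5))
```

The trade-off is that lookups now depend on disk and page-cache behaviour, and the file on disk is uncompressed, so it is much larger than a gzip-compressed pickle.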