WojciechMula/pyahocorasick

How to calculate automaton memory footprint?

abcdenis opened this issue · 1 comment

Hi,
Is there a way to determine the memory footprint of a prepared Automaton?
This would be helpful for keeping automatons in a size-limited cache.
Thank you.

@abcdenis The get_stats() method returns a dict of statistics, including a total_size entry. From the documentation at https://pyahocorasick.readthedocs.io/en/latest/#other-automaton-methods :

Return a dictionary containing Automaton statistics. Note that the real size occupied by the data structure could be larger because of internal memory fragmentation that can occur in a memory manager.

See https://pyahocorasick.readthedocs.io/en/latest/#get-stats-dict

>>> import ahocorasick
>>> A = ahocorasick.Automaton()
>>> A.add_word("he", None)
True
>>> A.add_word("her", None)
True
>>> A.add_word("hers", None)
True
>>> A.get_stats()
{'nodes_count': 5, 'words_count': 3, 'longest_word': 4, 'links_count': 4, 'sizeof_node': 40, 'total_size': 232}
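For the size-limited cache use case, here is a minimal sketch of how the reported total_size could be used as an eviction weight. `SizeLimitedCache` and `estimate_total_size` are hypothetical helpers written for this illustration and are not part of pyahocorasick. The stats dict is copied from the session above so the snippet runs without the library installed. The numbers in that output happen to be consistent with nodes_count * sizeof_node + links_count * 8, assuming 8-byte pointers; treat that as an observation about this one output, not a guarantee. As the documentation quoted above notes, the real footprint may be larger.

```python
from collections import OrderedDict

POINTER_SIZE = 8  # assumption: 64-bit build with 8-byte pointers


def estimate_total_size(stats, pointer_size=POINTER_SIZE):
    """Recompute the trie size from the other fields of a get_stats() dict."""
    return (stats["nodes_count"] * stats["sizeof_node"]
            + stats["links_count"] * pointer_size)


class SizeLimitedCache:
    """Hypothetical LRU cache bounded by the sum of caller-supplied sizes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> (value, size)

    def put(self, key, value, size):
        # Drop any previous entry stored under the same key.
        if key in self._items:
            self.used_bytes -= self._items.pop(key)[1]
        if size > self.max_bytes:
            return False  # can never fit, so do not evict anything for it
        # Evict least recently used entries until the new one fits.
        while self.used_bytes + size > self.max_bytes:
            _, (_, evicted_size) = self._items.popitem(last=False)
            self.used_bytes -= evicted_size
        self._items[key] = (value, size)
        self.used_bytes += size
        return True

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key][0]


# Stats dict copied from the session above.
stats = {'nodes_count': 5, 'words_count': 3, 'longest_word': 4,
         'links_count': 4, 'sizeof_node': 40, 'total_size': 232}

cache = SizeLimitedCache(max_bytes=1024)
# With a real automaton A this would be:
#   cache.put(key, A, A.get_stats()["total_size"])
cache.put("he-her-hers", "placeholder", stats["total_size"])
```

Since total_size is described as a lower bound on real memory use, it is probably wise to leave some headroom in max_bytes.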

/hth