nathan-russell/hashmap

Equivalent of object.size() for hashmap?

coloneltriq opened this issue · 2 comments

Is there a way to determine how much RAM a hashmap object is taking up in memory, similar to R's object.size() function? The size() method gives the number of key-value pairs, but not the memory footprint, and object.size() doesn't seem to capture all the memory allocated (perhaps missing C++ structures?).

And semi-relatedly, if you delete all the key-values from the hashmap using the clear() method, does the memory previously needed get released back to the system?

Thanks!

Is there a way to determine how much RAM a hashmap object is taking up in memory, similar to R's object.size() function?

Hash tables are typically implemented as fairly sophisticated data structures which contain not only the keys and values inserted by the client, but additional pointers, intermediate structures, etc. For example, see this discussion of the subject. Much of the internals exist as private data members, and since boost::unordered_map does not expose an interface for obtaining the necessary information, I don't see any way that this can be determined precisely. However, I believe there are ways of obtaining decent approximations, so I will look into this when I have a chance.
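To give a sense of what such an approximation might look like, here is a minimal sketch. It uses std::unordered_map as a stand-in rather than the boost::unordered_map that hashmap wraps, and the helper name approx_bytes is hypothetical, not part of either library. It assumes a node-based layout with one heap node per element (the key-value pair plus one link pointer) and one pointer per bucket. Real implementations may also cache the hash in each node, and the allocator adds its own per-block overhead, so the result is only a rough estimate.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: rough estimate of the memory used by an
// unordered_map-like container. Assumed layout (not guaranteed by the
// standard): one node per element holding the pair plus one link pointer,
// and one pointer per bucket. Memory owned indirectly by keys or values
// (for example the character buffer of a std::string) is not counted.
template <typename Map>
std::size_t approx_bytes(const Map& m) {
    using value_type = typename Map::value_type;
    const std::size_t per_node   = sizeof(value_type) + sizeof(void*);
    const std::size_t per_bucket = sizeof(void*);
    return sizeof(m)                          // the container object itself
         + m.size() * per_node                // one node per stored pair
         + m.bucket_count() * per_bucket;     // the bucket array
}
```

For a std::unordered_map<double, double> holding 10,000 pairs on a typical 64-bit platform, this counts at least 24 bytes per element before buckets, which already dwarfs the fixed size that object.size() reports for the R-level wrapper.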

and object.size() doesn't seem to capture all the memory allocated (perhaps missing C++ structures?)

Yes, object.size() is implemented in C for specific SEXP types. The hashmap itself is an S4 object (S4SXP) storing various components. Most notably, the data you insert is held in a C++ structure, and the R-level hashmap holds only a pointer to it (the .pointer member, an EXTPTRSXP). object.size() has no way of determining the size of whatever is stored at that memory address. For example,

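library(hashmap)
# Two maps with different contents
x <- hashmap("a", 1)
y <- hashmap(1e4, 1e4)

# object.size() measures only the R-level wrapper, not the C++ data
# behind the external pointer, so both report the same size
object.size(x)
# 648 bytes

object.size(y)
# 648 bytes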

if you delete all the key-values from the hashmap using the clear() method, does the memory previously needed get released back to the system?

It's hard to say because AFAIK this isn't specified in the C++ standard, so it will likely depend on the memory allocator that ships with your specific compiler's runtime, and probably on the memory allocator used by the OS itself. TBH I would not worry about it though. It's safe to assume that whoever programmed your OS knows much more about memory allocation than you or I, so most likely whatever happens is the correct behavior.
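For anyone curious about what can be observed from inside C++, here is a small sketch, again using std::unordered_map as a stand-in for the boost::unordered_map behind hashmap. The names ClearReport, clear_and_report and release_storage are hypothetical helpers for illustration only. The sketch records the element and bucket counts around clear(), then swaps the map with an empty temporary so that the old nodes and bucket array are destroyed. Whether the freed memory actually goes back to the OS is still up to the allocator, and none of this is exposed by the R-level hashmap object.

```cpp
#include <cstddef>
#include <unordered_map>

// Element and bucket counts observed before and after clear().
struct ClearReport {
    std::size_t size_before;
    std::size_t buckets_before;
    std::size_t size_after;
    std::size_t buckets_after;
};

// Hypothetical helper: call clear() and record what changed. The standard
// guarantees the map is empty afterwards; it does not say whether the
// bucket array shrinks, so buckets_after varies between implementations.
template <typename Map>
ClearReport clear_and_report(Map& m) {
    ClearReport r{};
    r.size_before    = m.size();
    r.buckets_before = m.bucket_count();
    m.clear();
    r.size_after     = m.size();
    r.buckets_after  = m.bucket_count();
    return r;
}

// Hypothetical helper: swap with an empty temporary. The temporary takes
// over the old storage and frees it through the allocator when it is
// destroyed at the end of the statement.
template <typename Map>
void release_storage(Map& m) {
    Map().swap(m);
}
```

Comparing buckets_before with buckets_after on your own toolchain shows whether clear() kept the bucket array around. Either way, the memory handed back by the container may stay cached inside the allocator rather than being returned to the system.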

OK. Thanks for the explanation!