64-bit values for 32-bit RAM_DOMAIN?
ruricolist opened this issue · 4 comments
Currently a Souffle program handling 64-bit values has to use a RAM_DOMAIN_SIZE=64, which means using 64 bits for everything. If the majority of values are 32 bit, and 32 bit suffices for indexing, a 64-bit domain wastes memory.
We have been discussing internally whether it would be workable to use RAM_DOMAIN_SIZE=32 but add explicit 64-bit integer types to Souffle. Is this something you would be interested in incorporating?
Would that effort support arbitrary value sizes? I'm thinking about opaque values from functors (C/C++ structs), or inlined records / tuples for instance.
We were thinking more along the lines of adding new types that could be used in Datalog (like a number64
or unsigned64
type) where 64 bits are specifically required. For a 64-bit domain they would be equivalent to number/unsigned, but for a 32-bit domain they would get special handling.
I can foresee at least one difficulty with the interpreter engine that is currently not design to support more data types efficiently. The topic is discussed in the vicinity of this comment: #2311 (comment)
Yes - indeed. The interpreter is the main stumble block. The idea here is to de-specialize data structures as much as possible. For tuples with arbitrary element lengths, we would need specialized/interpreted comparators. This would be too slow in practice.
An alternative design for the interpreter is to have several sub-indices representing one index. The sub-indices contain elements of equal length in the interpreter (e.g. 32 bit elements, 64 bit elements etc). But there might be other ideas to avoid specialized/interpreted comparators.
In addition, there are other hurdles, such as re-implementing a new typesystem (which would need to be refactored anyway).