ostafen/clover

Gob encoding and internal data types

ostafen opened this issue · 0 comments

Discussed in #41

Originally posted by ostafen May 1, 2022
Hi everyone, I created this discussion to collect opinions and suggestions, since this is a very sensitive topic.
Currently, CloverDB serializes documents to JSON before storing them on disk. This is a holdover from early versions of the library, which stored data directly in ".json" files.
But Clover has evolved since then (it now uses the Badger kv-store), and this solution is no longer acceptable for the following reasons:

  • Instances of the time.Time struct cannot be correctly recovered: they are converted to strings when serialized and, as a consequence, json.Unmarshal() deserializes them as plain strings. This affects queries involving dates or times (unless you decide to store them as timestamps during document insertion).
  • All numbers are silently converted to float64.

To fix these issues, I was thinking of switching to the gob encoding, which preserves the correct type for each document field.
This opens a new question about internal data types:

Which numeric types should be supported by Clover? Should we preserve all of the types (int, uint8/int8, uint16/int16, and so on), or should we restrict them (using int64 for integers and float64 for floating-point numbers, for example)?

What do you think about this?