facebook/rocksdb

Question about CompactionFilter in the C API

979357361 opened this issue · 3 comments

Hi everyone,

I use RocksDB to store the metadata of virtual disks. Thousands of vdisks share one column family, because running thousands of CFs is not a good idea. Each vdisk takes checkpoints (CPs) at its own pace, so if a power failure happens I need to roll back every vdisk to its last CP point. Since the vdisks finish their CPs at different times, rolling back the entire DB to a single point in time is not appropriate.

The only way I have found to do this is to use a CompactionFilter to examine every KV. The CP number of a KV is encoded in its value, so the CompactionFilter can read the value, extract the CP number, and decide whether to keep or drop the entry.
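To make the scheme concrete, here is a minimal sketch of the decoding step, assuming a hypothetical layout in which the CP number is stored as a fixed-width integer at the front of the value (the real encoding is application-specific):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical value layout: the first 8 bytes hold the CP number (host
// byte order), followed by the actual metadata payload.
bool DecodeCheckpointNumber(const char* value, size_t value_len,
                            uint64_t* cp_number) {
  if (value_len < sizeof(uint64_t)) {
    return false;  // too short to contain a CP number
  }
  std::memcpy(cp_number, value, sizeof(uint64_t));
  return true;
}
```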

However, the CompactionFilter notes say "The table file creation process invokes this method before adding a kv to the table file.", and the parameters of Filter only contain the key and value slices. So here is my question: are other KV types, such as deletions, also passed to the Filter? And if yes, what does the value (slice) of a deleted KV look like? The signature I see is:
```cpp
virtual bool Filter(int /*level*/, const Slice& /*key*/,
                    const Slice& /*existing_value*/,
                    std::string* /*new_value*/, bool* /*value_changed*/)
```
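For context, this is roughly how I picture the C++ side if the plain Filter override is used for the rollback. It is only a sketch: the class name, the rollback target, and the DecodeCheckpointNumber helper (declared from the sketch above) are placeholders of mine.

```cpp
#include <rocksdb/compaction_filter.h>
#include <cstdint>
#include <string>

// Declaration of the decode helper sketched above (placeholder).
bool DecodeCheckpointNumber(const char* value, size_t value_len,
                            uint64_t* cp_number);

// Sketch: drop every KV whose CP number is newer than the rollback target.
class RollbackFilter : public rocksdb::CompactionFilter {
 public:
  explicit RollbackFilter(uint64_t rollback_target)
      : rollback_target_(rollback_target) {}

  bool Filter(int /*level*/, const rocksdb::Slice& /*key*/,
              const rocksdb::Slice& existing_value,
              std::string* /*new_value*/,
              bool* /*value_changed*/) const override {
    uint64_t cp = 0;
    if (!DecodeCheckpointNumber(existing_value.data(), existing_value.size(),
                                &cp)) {
      return false;  // malformed value: keep it rather than silently drop it
    }
    return cp > rollback_target_;  // true asks the compaction to drop the KV
  }

  const char* Name() const override { return "RollbackFilter"; }

 private:
  uint64_t rollback_target_;
};
```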

By the way, I use RocksDB via the C API, and it only exposes the plain Filter callback; FilterV2/V3 are not easy to migrate to the C interface. So my other question is: what do the value and value size of a deletion KV look like in the Filter C API? For reference, this is the wrapper the C API goes through:
```cpp
bool Filter(int level, const Slice& key, const Slice& existing_value,
            std::string* new_value, bool* value_changed) const override {
  char* c_new_value = nullptr;
  size_t new_value_length = 0;
  unsigned char c_value_changed = 0;
  unsigned char result = (*filter_)(
      state_, level, key.data(), key.size(), existing_value.data(),
      existing_value.size(), &c_new_value, &new_value_length,
      &c_value_changed);
  if (c_value_changed) {
    new_value->assign(c_new_value, new_value_length);
    *value_changed = true;
  }
  return result;
}
```
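And for completeness, this is roughly how I register my filter through the C API. It is a sketch with hypothetical names and the same placeholder CP decoding; the relevant calls are rocksdb_compactionfilter_create and rocksdb_options_set_compaction_filter.

```cpp
#include <rocksdb/c.h>
#include <cstdint>
#include <cstring>

// Hypothetical rollback target; in the real system it comes from the
// per-vdisk checkpoint metadata.
static uint64_t g_rollback_target = 0;

// Returning non-zero asks the compaction to drop this key; zero keeps it.
static unsigned char RollbackFilterCallback(
    void* /*state*/, int /*level*/, const char* /*key*/, size_t /*key_length*/,
    const char* existing_value, size_t value_length, char** /*new_value*/,
    size_t* /*new_value_length*/, unsigned char* /*value_changed*/) {
  uint64_t cp = 0;
  if (value_length < sizeof(cp)) {
    return 0;  // malformed value: keep it
  }
  std::memcpy(&cp, existing_value, sizeof(cp));
  return cp > g_rollback_target ? 1 : 0;
}

static const char* RollbackFilterName(void* /*state*/) {
  return "RollbackFilter";
}

static void RollbackFilterDestroy(void* /*state*/) {}

// Install the filter on the options before opening the DB. The returned
// handle should stay alive as long as the DB uses it and be freed later
// with rocksdb_compactionfilter_destroy().
rocksdb_compactionfilter_t* InstallRollbackFilter(rocksdb_options_t* options) {
  rocksdb_compactionfilter_t* filter = rocksdb_compactionfilter_create(
      /*state=*/nullptr, RollbackFilterDestroy, RollbackFilterCallback,
      RollbackFilterName);
  rocksdb_options_set_compaction_filter(options, filter);
  return filter;
}
```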

@979357361 Looking at the wiki here - https://github.com/facebook/rocksdb/wiki/Compaction-Filter

> If there are multiple versions of the same key from the input of the compaction, compaction filter will only be invoked once for the newest version. If the newest version is a deletion marker, compaction filter will not be invoked. However, it is possible the compaction filter is invoked on a deleted key, if the deletion marker isn't included in the input of the compaction.

@adamretter Thank you for your time.
So if I run a manual compaction over the entire DB, all data will be compacted down to the lowest level, only the newest version of each key will be kept and passed through the filter, and old versions and deleted keys will be wiped out without being passed to the filter?
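For reference, the manual compaction I have in mind is a full-range compaction through the C API, something like this sketch (null boundaries mean the whole key range):

```cpp
#include <rocksdb/c.h>
#include <cstddef>

// Manually compact the whole key range: passing NULL for both boundaries
// means "from the first key to the last key", so every key range of the DB
// is run through the installed compaction filter as it is rewritten.
void CompactWholeDb(rocksdb_t* db) {
  rocksdb_compact_range(db, /*start_key=*/nullptr, 0,
                        /*limit_key=*/nullptr, 0);
}
```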

@979357361 I haven't tested this, but the Wiki seems to suggest that is the case. I suggest you give it a test, and let us know if we need to update the Wiki.