tikv/agatedb

Feature Request: separate filter&index blocks and data blocks into two dedicated files, and SST supports range filter

zhangjinpeng87 opened this issue · 0 comments

When scan a range[start, end), LSM-tree based engine will create a iterator for each level and seek to the proper position for the corresponding SST file(whose range[smallest, biggest) cover [start, end) ). This will open the SST file and read some content of this SST even there is no key covered by range[start, end), for example the SST contains keys [a, e, f], and the scan range is [b, c).

This will open and read is not necessary if we can know this SST doesn't contain keys in the scan range. https://www.cs.cmu.edu/~pavlo/papers/mod601-zhangA-hm.pdf described a type named succinct range filter, which can achieve such goal. This can benefit even more when these SST files are stored in remote storage like cloud disk or distributed file system on the cloud.

Let's using the AWS EFS as an example, EFS has active and inactive tiers, it will move these inactive files to the inactive tier automatically to reduce the total storage cost. When the files in inactive tier are requested, EFS will move them to the active tier with some throughput fee. If we want to leverage such fundamental cloud storage, we need to consider how to make good use of its inactive tier to reduce the total storage cost, at the same time, we need to consider how to reduce the data transfer between its inactive tier and active tier.

If we separate the fitler&index block and the data blocks into 2 dedicated files, The index&filter files maybe accessed frequently, and they probably stay in the active tier of EFS, these long-time un-touched data files will stay in the inactive tier of EFS. This works well when the workload is pure point get, because bloom-filter will reduce these unnecessary touching of data files, but when the workload contains range queries, range filter make sense rather than bloom-filter.