tigerbeetle/tigerbeetle

lsm unit/fuzz testing

jamii commented
  • binary_search
  • bloom_filter
    • no false negatives
    • test the false positive rate (requires calculating the expected rate, which is hard without a Poisson distribution in the stdlib; see the sketch after this list)
  • k_way_merge
  • segmented_array
  • set_associative_cache
  • node_pool
  • superblock (done in #224)
  • superblock manifest encode/decode
  • superblock free set (done)
  • superblock client table encode/decode
    • (hold off on testing this for now; client table is going to be restructured as part of #212 - @sentientwaffle)
  • grid (depends on superblock, fifo, iops, set_associative_cache)
  • manifest (depends on grid, manifest_log, manifest_level, node_pool)
  • manifest level (depends on segmented_array, node_pool)
  • manifest log (done in #224)
  • table (depends on grid (BlockType), manifest (TableInfoType))
  • table_immutable (depends on binary_search, (table?))
  • table_mutable (depends on set_associative_cache, (table?))
  • tree (depends on binary_search, bloom_filter, node_pool, ring_buffer, eytzinger, grid, superblock)
    • get, put, remove, compact (see the model-based fuzz loop sketched after this list)
    • checkpoint
    • range queries
    • restarts
    • storage faults
    • concurrent operations
  • multiple trees, checking that storage is deterministic across differences in IO latency and across crash/recover-from-checkpoint (maybe use a StorageChecker, as the VOPR does)
  • groove (depends on table, tree, grid, composite_key, node_pool)
    • Run two grooves in parallel (separate Storage, etc.) and run the same operations against both. Since the storage latencies differ, IO is reordered, but at each checkpoint (and possibly after each compaction beat) the two "disks" should be byte-for-byte identical. (A toy version of this check is sketched after this list.)
    • Also check that the superblock manifest & free set are identical.
    • This same "parallel" test should be applied to the Forest as well.
  • posted_groove (depends on table, tree, grid, node_pool)
  • level_iterator (depends on table_iterator, ring_buffer, manifest, grid)
  • table_iterator (depends on ring_buffer, manifest, grid)
  • compaction (depends on grid, manifest, k_way_merge, table_iterator, level_iterator)
  • forest (depends on grid, node_pool, compaction)
    • get, put, compact
    • remove
    • checkpoint
    • secondary index get
    • range queries
    • restarts
    • storage faults
    • concurrent operations
    • verify storage determinism (see "groove" for explanation)
  • calculate an upper bound on storage size for a given workload, and test that we don't exceed it
  • state machine (and indirectly, the forest). Use the VOPR's Workload+auditor, but attach it directly to the state machine instead of to a VSR client.
  • journal (this isn't LSM, but it's still important, especially around wrapping). @sentientwaffle
  • add a flag to the global allocator that prevents allocation after init (one possible shape is sketched below)
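
A few sketches for the items above follow, all in Zig 0.11 syntax. First, the bloom_filter checks. The `Filter` stand-in, its Wyhash-based hashing, and the sizing constants below are illustrative assumptions, not the real module's API. The expected-rate check sidesteps the missing Poisson distribution in the stdlib: over N independent probes of never-inserted keys, the false-positive count is approximately Poisson with mean N*p, and a Poisson's variance equals its mean, so it is enough to compute p = (1 - e^(-kn/m))^k and allow a band of a few standard deviations around N*p.

```zig
const std = @import("std");

// Toy stand-in filter: m = 2^16 bits, k = 4 Wyhash-seeded hash functions.
const Filter = struct {
    const bits = 1 << 16;
    const k = 4;

    words: [bits / 64]u64 = [_]u64{0} ** (bits / 64),

    fn indexes(key: u64) [k]u32 {
        var out: [k]u32 = undefined;
        for (&out, 0..) |*index, seed| {
            const hash = std.hash.Wyhash.hash(seed, std.mem.asBytes(&key));
            index.* = @intCast(hash % bits);
        }
        return out;
    }

    fn insert(filter: *Filter, key: u64) void {
        for (indexes(key)) |i| {
            filter.words[i / 64] |= @as(u64, 1) << @intCast(i % 64);
        }
    }

    fn contains(filter: *const Filter, key: u64) bool {
        for (indexes(key)) |i| {
            if (filter.words[i / 64] & (@as(u64, 1) << @intCast(i % 64)) == 0) return false;
        }
        return true;
    }
};

/// Expected false-positive rate: p = (1 - e^(-k*n/m))^k.
fn fp_rate_expected(m: f64, k: f64, n: f64) f64 {
    return std.math.pow(f64, 1.0 - @exp(-k * n / m), k);
}

test "bloom filter: no false negatives, false positive rate near expected" {
    var prng = std.rand.DefaultPrng.init(92);
    const random = prng.random();

    var filter = Filter{};
    const n = 4096;
    var i: u64 = 0;
    while (i < n) : (i += 1) filter.insert(i);

    // No false negatives: every inserted key must be found.
    i = 0;
    while (i < n) : (i += 1) try std.testing.expect(filter.contains(i));

    // Keys >= n were never inserted, so every hit below is a false positive.
    // The count over `probes` queries is ~Poisson(probes * p): mean = variance.
    const probes = 100_000;
    var false_positives: f64 = 0;
    var probe: u64 = 0;
    while (probe < probes) : (probe += 1) {
        if (filter.contains(@as(u64, n) + random.int(u32))) false_positives += 1;
    }
    const mean = @as(f64, probes) * fp_rate_expected(Filter.bits, Filter.k, n);
    try std.testing.expect(@fabs(false_positives - mean) < 6.0 * @sqrt(mean));
}
```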
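
Next, the shape shared by the operation fuzzers (the tree's and forest's get/put/remove bullets): a seed-reproducible loop that applies random operations both to the system under test and to a trivially correct model, asserting agreement as it goes. This sketch shows only the harness shape; a second hash map stands in for the tree:

```zig
const std = @import("std");

test "model-based fuzz loop (shape only)" {
    const gpa = std.testing.allocator;
    // A fixed seed makes the run reproducible; a real fuzzer would take the
    // seed from the command line and print it on failure.
    var prng = std.rand.DefaultPrng.init(0xdeadbeef);
    const random = prng.random();

    var system = std.AutoHashMap(u32, u32).init(gpa); // stand-in for the tree
    defer system.deinit();
    var model = std.AutoHashMap(u32, u32).init(gpa); // trivially correct model
    defer model.deinit();

    var op: u32 = 0;
    while (op < 10_000) : (op += 1) {
        // A small key space forces puts, removes, and gets to interact.
        const key = random.uintLessThan(u32, 64);
        switch (random.uintLessThan(u8, 3)) {
            0 => {
                const value = random.int(u32);
                try system.put(key, value);
                try model.put(key, value);
            },
            1 => try std.testing.expectEqual(model.remove(key), system.remove(key)),
            else => try std.testing.expectEqual(model.get(key), system.get(key)),
        }
    }
}
```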
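
A toy version of the groove/forest determinism check. The real test runs two grooves over simulated Storage with different latency seeds; here, write reordering stands in for IO reordering, and the final comparison stands in for the checkpoint barrier:

```zig
const std = @import("std");

// A logical write to a fixed disk offset (in the real test, a grid block write).
const Write = struct { offset: usize, data: u8 };

fn apply(disk: []u8, writes: []const Write, order: []const usize) void {
    for (order) |i| disk[writes[i].offset] = writes[i].data;
}

test "reordered IO still yields identical disks at the checkpoint" {
    var disk_a = [_]u8{0} ** 32;
    var disk_b = [_]u8{0} ** 32;

    const writes = [_]Write{
        .{ .offset = 3, .data = 0xAB },
        .{ .offset = 17, .data = 0xCD },
        .{ .offset = 9, .data = 0xEF },
    };

    // Disk A completes writes in submission order; disk B's "latencies"
    // deliver the same writes in a different order.
    apply(&disk_a, &writes, &[_]usize{ 0, 1, 2 });
    apply(&disk_b, &writes, &[_]usize{ 2, 0, 1 });

    // The checkpoint barrier: once both disks have drained, they must be
    // byte-for-byte identical.
    try std.testing.expectEqualSlices(u8, &disk_a, &disk_b);
}
```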
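
Finally, one possible shape for the no-allocation-after-init flag: a wrapper that forwards to a parent allocator but panics on alloc/resize once the flag is set. The StaticAllocator name and this vtable-wrapper design are assumptions, not existing code:

```zig
const std = @import("std");

const StaticAllocator = struct {
    parent: std.mem.Allocator,
    init_done: bool = false,

    fn allocator(self: *StaticAllocator) std.mem.Allocator {
        return .{
            .ptr = self,
            .vtable = &.{ .alloc = alloc, .resize = resize, .free = free },
        };
    }

    fn alloc(ctx: *anyopaque, len: usize, ptr_align: u8, ret_addr: usize) ?[*]u8 {
        const self: *StaticAllocator = @ptrCast(@alignCast(ctx));
        if (self.init_done) @panic("allocation after init");
        return self.parent.rawAlloc(len, ptr_align, ret_addr);
    }

    fn resize(ctx: *anyopaque, buf: []u8, buf_align: u8, new_len: usize, ret_addr: usize) bool {
        const self: *StaticAllocator = @ptrCast(@alignCast(ctx));
        if (self.init_done) @panic("resize after init");
        return self.parent.rawResize(buf, buf_align, new_len, ret_addr);
    }

    // Frees stay permitted after init so that deinit keeps working.
    fn free(ctx: *anyopaque, buf: []u8, buf_align: u8, ret_addr: usize) void {
        const self: *StaticAllocator = @ptrCast(@alignCast(ctx));
        self.parent.rawFree(buf, buf_align, ret_addr);
    }
};

test "allocation is allowed during init; frees stay legal afterwards" {
    var static = StaticAllocator{ .parent = std.testing.allocator };
    const gpa = static.allocator();

    const buffer = try gpa.alloc(u8, 64); // fine: init is not finished yet
    static.init_done = true;
    gpa.free(buffer); // still fine; only alloc/resize panic from here on
}
```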

Once we have good fuzz coverage and have fixed all the quickly-found bugs, we'll probably want some VOPR-like infrastructure to fuzz in the background and post issues too.