[docdb][ysql] Reuse iterators during txn conflict resolution

bmatican opened this issue 5 years ago · 0 comments

I ran some recent tests with @d-uspenskiy's upcoming YSQL-side batching optimization, using:

java -jar yb-sample-apps-dl-batch.jar --workload SqlDataLoad --nodes $NODES --num_threads_write 27 --num_value_columns 128 --num_foreign_keys 1 --num_indexes 2 --batch_size 100 --num_unique_keys 100000000

This workload batches 100 inserts at a time at the YSQL layer, against a schema with 128 value columns.

During a perf run, I observed the following: [screenshot of the perf call tree]

The perf tree is expanded quite a bit, but I assume the underlying problem is that we spend so much time creating RocksDB iterators in the first place.

Doing some code digging, @ndeodhar pointed out that we have this aptly described TODO in src/yb/docdb/conflict_resolution.cc:

      // TODO(dtxn) reuse iterator
      auto value_iter = CreateRocksDBIterator(
          resolver->doc_db().regular,
          resolver->doc_db().key_bounds,
          BloomFilterMode::USE_BLOOM_FILTER,
          key_slice,
          rocksdb::kDefaultQueryId);

This appears to run for every row in the batch and for every column in each row. If so, a batch of 100 rows with 128 value columns would create on the order of 100 × 128 = 12,800 iterators.
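
To illustrate what the TODO is asking for, here is a minimal sketch that contrasts creating an iterator per key with creating one iterator and re-seeking it. Everything in it is hypothetical: `FakeIterator` is a `std::map`-backed stand-in rather than the real RocksDB iterator, and the two `CountConflicts...` functions are illustrative names, not code from `conflict_resolution.cc`. It also does not model the fact that the call above passes `key_slice` together with `BloomFilterMode::USE_BLOOM_FILTER`, which an actual reuse scheme would presumably have to account for.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a RocksDB iterator over an ordered key-value store.
// It counts constructions so the two strategies below can be compared.
class FakeIterator {
 public:
  explicit FakeIterator(const std::map<std::string, std::string>& db)
      : db_(db), pos_(db.end()) {
    ++created;
  }

  // Position at the first entry whose key is >= the target key.
  void Seek(const std::string& key) { pos_ = db_.lower_bound(key); }
  bool Valid() const { return pos_ != db_.end(); }
  const std::string& key() const { return pos_->first; }

  static std::size_t created;  // number of iterators constructed so far

 private:
  const std::map<std::string, std::string>& db_;
  std::map<std::string, std::string>::const_iterator pos_;
};

std::size_t FakeIterator::created = 0;

// Pattern resembling the snippet above: a fresh iterator for every key checked.
std::size_t CountConflictsFreshIterator(
    const std::map<std::string, std::string>& db,
    const std::vector<std::string>& keys) {
  std::size_t conflicts = 0;
  for (const auto& k : keys) {
    FakeIterator it(db);  // one construction per key
    it.Seek(k);
    if (it.Valid() && it.key() == k) ++conflicts;
  }
  return conflicts;
}

// Reuse pattern: construct once, then Seek for each key.
std::size_t CountConflictsReusedIterator(
    const std::map<std::string, std::string>& db,
    const std::vector<std::string>& keys) {
  std::size_t conflicts = 0;
  FakeIterator it(db);  // single construction for the whole batch
  for (const auto& k : keys) {
    it.Seek(k);
    if (it.Valid() && it.key() == k) ++conflicts;
  }
  return conflicts;
}
```

Both functions return the same result for the same inputs; the only difference is that the second one constructs a single iterator regardless of how many keys are checked.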

cc @spolitov @mbautin