KeyMayExist dumps core when value pointer is null
mdcallag opened this issue · 3 comments
Note: Please use Issues only for bug reports. For questions, discussions, feature requests, etc. post to dev group: https://groups.google.com/forum/#!forum/rocksdb or https://www.facebook.com/groups/rocksdb.dev
Expected behavior
I expect no core dump when the value pointer is NULL.
I assume it should be safe to pass NULL for the value pointer in cases where I don't want to get the value. Otherwise, this method has no purpose because if I must fetch the value then I can use Get(). Also, the comments imply that I can pass NULL (see here) and the comment includes this text.
// If the key definitely does not exist in the database, then this method
// returns false, else true. If the caller wants to obtain value when the key
// is found in memory, a bool for 'value_found' must be passed. 'value_found'
Actual behavior
I get a segfault when the value pointer is null with this command line: ./db_bench --benchmarks=fillseq,readrandomfast
After applying this diff:
diff --git a/tools/db_bench_tool.cc b/tools/db_bench_tool.cc
index 713aaaa41..4c00a1e04 100644
--- a/tools/db_bench_tool.cc
+++ b/tools/db_bench_tool.cc
@@ -6034,6 +6034,11 @@ class Benchmark {
ts_ptr = &ts_ret;
}
auto status = db->Get(options, key, &value, ts_ptr);
+ {
+ bool b;
+ // auto status2 = db->KeyMayExist(options, key, &value, ts_ptr, &b);
+ auto status2 = db->KeyMayExist(options, key, NULL, ts_ptr, &b);
+ }
if (status.ok()) {
++found;
} else if (!status.IsNotFound()) {
And the stack trace
rocksdb::DBImpl::KeyMayExist (this=this@entry=0x7ffff735d000, read_options=..., column_family=0x7ffff723b760, key=..., value=value@entry=0x0, timestamp=timestamp@entry=0x0, value_found=0x7fffefff4e9f) at db/db_impl/db_impl.cc:3792
3792 value->assign(pinnable_val.data(), pinnable_val.size());
(gdb) where
#0 rocksdb::DBImpl::KeyMayExist (this=this@entry=0x7ffff735d000, read_options=..., column_family=0x7ffff723b760, key=..., value=value@entry=0x0, timestamp=timestamp@entry=0x0, value_found=0x7fffefff4e9f) at db/db_impl/db_impl.cc:3792
#1 0x00005555556792f0 in rocksdb::DB::KeyMayExist (value_found=0x7fffefff4e9f, timestamp=0x0, value=0x0, key=..., options=..., this=<optimized out>) at ./include/rocksdb/db.h:987
#2 rocksdb::Benchmark::ReadRandomFast (this=0x7fffffffd600, thread=0x7ffff738f800) at tools/db_bench_tool.cc:6040
#3 0x0000555555667a4a in rocksdb::Benchmark::ThreadBody (v=0x7ffff7346780) at tools/db_bench_tool.cc:3973
#4 0x00005555558bf8f9 in rocksdb::(anonymous namespace)::StartThreadWrapper (arg=0x7ffff7287130) at env/env_posix.cc:469
#5 0x00007ffff76c3ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#6 0x00007ffff7755850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Steps to reproduce the behavior
hi @mdcallag ,
The issue arises from the sample code accessing the incorrect overload of the KeyMayExist method. There are two overloaded functions: one that includes a timestamp and one that does not.
// If the key definitely does not exist in the database, then this method
// returns false, else true. If the caller wants to obtain value when the key
// is found in memory, a bool for 'value_found' must be passed. 'value_found'
// will be true on return if value has been set properly.
// This check is potentially lighter-weight than invoking DB::Get(). One way
// to make this lighter weight is to avoid doing any IOs.
// Default implementation here returns true and sets 'value_found' to false
-
withtimestamp:
virtual bool KeyMayExist(const ReadOptions& /options/,
ColumnFamilyHandle* /column_family/,
const Slice& /key/, std::string* /value/,
std::string* /timestamp/,
bool* value_found = nullptr) {
if (value_found != nullptr) {
*value_found = false;
}
return true;
} -
Without Timestamp:
virtual bool KeyMayExist(const ReadOptions& options,
ColumnFamilyHandle* column_family, const Slice& key,
std::string* value, bool* value_found = nullptr) {
return KeyMayExist(options, column_family, key, value,
/timestamp=/nullptr, value_found);
}
the second overload, which omits the timestamp, works as expected because it handles without requiring a valid pointer for the value. Therefore, to achieve the desired functionality, the method without the timestamp should be called.
Still get a core dump after I changed my testcase to use the method without a timestamp
Thread 37 "db_bench.dbg" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff03f8640 (LWP 2422119)]
rocksdb::DBImpl::KeyMayExist (this=0x7ffff6f73000, read_options=..., column_family=0x7ffff6e3aac0, key=..., value=0x0, timestamp=0x0, value_found=0x7ffff03f5e9f) at db/db_impl/db_impl.cc:3792
3792 value->assign(pinnable_val.data(), pinnable_val.size());
(gdb) where
#0 rocksdb::DBImpl::KeyMayExist (this=0x7ffff6f73000, read_options=..., column_family=0x7ffff6e3aac0, key=..., value=0x0, timestamp=0x0, value_found=0x7ffff03f5e9f) at db/db_impl/db_impl.cc:3792
#1 0x000055555570846a in rocksdb::DB::KeyMayExist (this=this@entry=0x7ffff6f73000, options=..., column_family=<optimized out>, key=..., value=value@entry=0x0, value_found=value_found@entry=0x7ffff03f5e9f) at ./include/rocksdb/db.h:975
#2 0x000055555567ac9b in rocksdb::DB::KeyMayExist (value_found=0x7ffff03f5e9f, value=0x0, key=..., options=..., this=<optimized out>) at ./include/rocksdb/db.h:981
#3 rocksdb::Benchmark::ReadRandomFast (this=0x7fffffffd5b0, thread=0x7ffff6f5a800) at tools/db_bench_tool.cc:6042
#4 0x00005555556693da in rocksdb::Benchmark::ThreadBody (v=0x7ffff6f39780) at tools/db_bench_tool.cc:3973
#5 0x00005555558c2de9 in rocksdb::(anonymous namespace)::StartThreadWrapper (arg=0x7ffff6e78100) at env/env_posix.cc:469
#6 0x00007ffff7294ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#7 0x00007ffff7326850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
New diff is:
diff --git a/tools/db_bench_tool.cc b/tools/db_bench_tool.cc
index 713aaaa41..7998a1b2f 100644
--- a/tools/db_bench_tool.cc
+++ b/tools/db_bench_tool.cc
@@ -6034,6 +6034,13 @@ class Benchmark {
ts_ptr = &ts_ret;
}
auto status = db->Get(options, key, &value, ts_ptr);
+ {
+ bool b;
+ // auto status2 = db->KeyMayExist(options, key, &value, ts_ptr, &b);
+ // auto status2 = db->KeyMayExist(options, key, NULL, ts_ptr, &b);
+ std::string* vptr = nullptr;
+ auto status2 = db->KeyMayExist(options, key, vptr, &b);
+ }
if (status.ok()) {
++found;
} else if (!status.IsNotFound()) {
In the context of RocksDB, I see the DB class serves as the public interface that users interact with, while the DBImpl class contains the actual implementation details. The assertion assert(value != nullptr); is a safety check designed to ensure that the pointer value is not nullptr before proceeding with further operations that assume it points to a valid memory location.