Level/community

Proposal: Add `db.has(key)` and `db.hasMany(keys)`

Opened this issue ยท 12 comments

I love working with the Level stack. However, I miss one feature: a way to check any db whether it has a value for a key, without actually retrieving the value. Right now, as a workaround, I extend LevelUp instances with has and hasMany keys like so (TypeScript):

	db.has = async function (
		key: K,
		options?: AbstractOptions,
	): Promise<boolean> {
		try {
			await this.get(key, options);
			return true;
		} catch (e) {
			if (e.notFound) {
				return false;
			} else {
				throw e; // rethrow other exceptions
			}
		}
	};

// similar for hasMany using getMany mapped with `x => !(x === undefined)`

However, this is not as efficient as it could be, since the underlying storage can probably implement has much more efficient than first actually retrieving the value, just for it to be abandoned in this workaround.

This could even be implemented in levelup first usind above code, if levelup itself cannot find functions has or hasMany on the abstract-leveldown passed to it. This way, leveldown implementations can one by one add has/hasMany while levelup can already provide the functionality in a backwards-compatible fashion.

I think currently the most performant way is this:

let exists = false
const it = db.iterator({
  values: false,
  gte: key,
  lte: key,
  limit: 1,
})
for await (const _ of it) {
  exists = true
}

It's not a nice way though.

@vweevers might be able to tell you more about any ongoing work on this

Until more folks express the need for this feature (give it a thumbs up if you do) this can be implemented in userland. Similar to level-exists but optimized to use an iterator instead of a stream, like @juliangruber's example above.

It might also help to set the keyAsBuffer option to false (or keyEncoding to utf8) depending on which modules you're using - leveldown for example is typically faster at fetching strings than buffers.

If your values are small (and thus fast to copy from the underlying store) I reckon that db.get(key) is faster than an iterator though, because it's a single call, while an iterator needs 3 calls (creating the iterator, then nexting, then ending).

For hasMany(keys) you could perhaps use iterator.seek() but that'd require some tinkering with options to get the performance right. That API would benefit more from being a builtin feature, because the main cost is in crossing the JS/C++ boundary for every key (rather than just once for all keys).

I think currently the most performant way is this: ...

Thanks for this alternative idea. I'd think, though, that in most cases this would also fetch the value from the underlying store, because the (const _ of it) loop would still call it.next() a single time, which would populate { done: true/false, value: ... }. I think it's equivalent to

const it = db.iterator({
  values: false,
  gte: key,
  lte: key,
  limit: 1,
})
return !it.next().done;

so even though we don't read it.next().value, it probably would still be fetched by most, if not all, iterator implementation from the underlying store.

With values: false it should only read keys. You are right that the iteration mechanics can be changed

Ah yes, sorry, I oversaw the values: false setting ๐Ÿ™ˆ

Re: keys vs values, see also Level/abstract-leveldown#380 which proposes key-only and value-only iterators.

Just looked at https://github.com/juliangruber/level-exists and realize that it really is my db.has implemented in userland (but using streams). Three questions for @juliangruber:

  1. Why do you use createKeyStream if iterators are more performant?
  2. This package only offers the callback API, if I understand the code correctly. Do you plan to extend it to also include the Promise API?
  3. Do you plan to extend it to include an existsMany(keys)?

I can answer question 1: in the past, iterators weren't directly exposed (and async iterators weren't a thing) - level was all about streams

@vweevers I understand that level-exists basically implements my proposal in userland. Wouldn't you agree, though, that for some leveldown implementations, having them directly implement has and hasMany could result in better performance? It seems to me that going the detour over a db.iterator({ values: false, gte: key, lte: key, limit: 1, }) would still incur some costs as opposed to a very simple db.has implementation. And, as you said yourself, there isn't really a well performing implementation of hasMany possible in userland.

Anyway, looking forward to other community members' feedback. Maybe I'm not alone in need of this feature :)

Yes, it would undoubtedly perform better. Needs upvotes because new features come with a maintenance cost. Other than that I have no objections; your proposal aligns well with existing features (I like the symmetry with get() and getMany()).

  • This package only offers the callback API, if I understand the code correctly. Do you plan to extend it to also include the Promise API?
  • Do you plan to extend it to include an existsMany(keys)?

Not at this time but pull requests welcome ๐Ÿ‘ Feel free to open an issue there