flaxsearch/luwak

Monitor does not add queries with null metadata

aalbahem opened this issue · 6 comments

I was working in a project where I needed to wrap luwak as a service. I was adding some metadata. Some of these metadata happen to be null. However, Luwak did not add them. This test case demos the bug:


try (Monitor monitor = new Monitor(new LuceneQueryParser("field"), new MatchAllPresearcher())) {
            HashMap<String, String> metadataMap = new HashMap<>();
            metadataMap.put("key", null);

            monitor.update(new MonitorQuery(Integer.toString(1), "+test " + Integer.toString(1), metadataMap));

            Assertions.assertThat(monitor.getQueryCount()).isEqualTo(1);
        }

The test failed.

However, when I investigated it further, I have found that a NPE was thrown and caught in the update method that get added to an error list. However, this behavior is confusing and hard to catch. See the code snippet below.

for (MonitorQuery query : queries) {
            try {
                for (QueryCacheEntry queryCacheEntry : decomposeQuery(query)) {
                    updates.add(new Indexable(query.getId(), queryCacheEntry, buildIndexableQuery(query.getId(), query, queryCacheEntry)));
                }
            } catch (Exception e) {
                **errors.add(new QueryError(query.getId(), query.getQuery(), e.getMessage()));**
            }
            if (updates.size() > commitBatchSize) {
                commit(updates);
                updates.clear();
            }
        }

In addition, the Readme file shows an example of using the library where the update method is called without reference to the idea that it could return a list of errors, which we need to check them. I spent hours trying to understand what went wrong. I hope there could be a better way to make the code reports errors by itself. For instance, throw an exception if there is any error.

Regards
Ameer

Hi, thanks for opening the issue. We certainly ought to be able to deal with null values in metadata; I'll push a fix.

thanks Alan, please see my update to the bug. I hope that helps.

Having played with this for a bit I do think we should probably not support null values in metadata, as there are a bunch of places where it's assumed that it will be a materialized String.

The reason the update() method doesn't throw an Exception is because it can take a List of queries, which can be partially successful. But the API is a bit trappy here, you're right. Maybe instead it should throw an UpdateException, with suppressed exceptions for all the errors caught?

I will add that it will be very helpful to point this problem in the Readme file.

So there are three issues here, really:

  • The API is trappy and should be improved, preferably by throwing a checked exception if an error occurs during an update
  • Allowing MonitorQuery to have null metadata values causes problems which are not easily tracked down
  • The documentation needs to be improved

I'll deal with these three in separate commits.

Thanks for your feedback @AmeerTawfik !