vitrivr/cineast

Make Selector and PersistencyWriter more consistent with DB resource handling

ppanopticon opened this issue · 2 comments

As a a more general discussion (and maybe @lucaro can weigh in): I guess one source for confusion in PR #224 was the mapping behaviour of Selector and PersistencyWriter close() to that of the underlying database resources, which is somehow inconsistent, especially when also taking higher abstractions into account.

Selector and PersistencyWriter are Closeable and exhibit a 'state' in that they are bound to an entity. Hence, there is an argument for creating them separately for each entity / feature and closing them after using them. This is how it is currently handled by all AbstractFeature implementations in Cineast.

However, the active entity can also be changed by calling open() and the same instance be re-used, which is done in some of the unit tests. This is not optimal for two reasons: First, open() implies that it somehow negates the effect of close(), which it often doesn't! Second, it adds a lot of verbosity to the interface, which is ultimately superfluous and confusing.

I'd suggest to make following changes, to avoid this sort of confusion and steer the usage of these classes more clearly:

  1. Selector and PersistencyWriter should be permanently bound to an entity upon construction and cannot be used any longer once closed. Thus open() should be removed from their interfaces completely. Instead the name of the entity is provided when requesting an instance from the supplier.
  2. Selector and PersistencyWriter should no longer take a wrapper as an argument but connection details instead, i.e., the wrapper is constructed internally and not exposed to the outside. Hence we make it impossible to re-use the same wrapper with different Selector and PersistencyWriter instances.
  3. It is then entirely up to the wrapper to abstract away the internal handling of resources according to the requirements of the underlying database.

Any thoughts on this?

Maybe as an aside: Resource handling is a bit different for Polypheny DB and Cottontail DB:

  • For Cottontail DB, there is a n:1 relationship between the CottontailWrapper and the underlying ManagedChannel, since the Managed Channel should be re-used and is thread-safe. Hence, calling close() on the CottontailWrapper has no effect on actual resources - it's a no-op method.
  • For Polypheny DB, there is a 1:1 relationship between PolyphenyWrapper and a JDBC Connection, i.e., every PolyphenyWrapper opens its own JDBC Connection and uses it. This is necessary, because a JDBC Connection should not be shared between threads (as far as I know).

Since calls to close() on Selector and PersistencyWriter are effectively propagated to the wrapper, we cannot re-use the same wrapper for different instances of Selector and PersistencyWriter after closing them in the case of Polypheny DB. This is what caused one of the tests to fail.

I generally agree with this. open() was never intended to be called multiple times and is not supposed to be the opposite of close(), these Selectors are supposed to have a linear lifecycle. If you need a new one, make a new one, they are generally lightweight. I'm not sure what the reason was for the explicit open() call, rather than passing all of this as parameters to the constructor directly. It might very well be that, whatever the reason was, is no longer valid. So I'd be in favor of this change.

If we explicitly make the wrapper take care of shared resources, I also think it is fine if it cannot be re-used either. It might in some cases still need to be accessible though, in case some features need to access specific functionality only supported by some backends. (Ideally, this would eventually tend to zero, but we all know how that works in practice.)

TL;DR: 👍

I'd say we can make the wrapper instance public but hide it in the interface. As soon we start accessing wrapper instances directly, we're kinda violating the contract of the interface anyway.