Fix resource issues with embeddings indexing components backed by databases
Closed this issue · 0 comments
davidmezzetti commented
Currently, there are scenarios where embeddings index components backed by a database (e.g. pgvector) have issues with upserts, which end up deleting all existing data.
The following issues have been identified.
- Passing the SQLAlchemy engine to table DDL statements. This wraps the operation with another layered transaction.
- Passing the SQLAlchemy engine to the database session. This causes locking within the same database component.
- For ANNs backed by databases, the `close` method must be run before recreating a new ANN. Logic should be added to ensure this.
This work will address these issues and ensure that database-connected indexing components have all their actions run through a single transaction until `save` is called. This ensures consistency with file-based components.
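
A minimal sketch of the intended behavior, for illustration only. It uses the standard library `sqlite3` module as a stand-in for SQLAlchemy and pgvector, and the `DatabaseANN` class, `recreate` helper and `vectors` table are hypothetical names, not the actual implementation. The idea is that DDL and upserts share one connection and one open transaction, nothing is persisted until `save`, and `close` runs before a new instance is created.

```python
import os
import sqlite3
import tempfile


class DatabaseANN:
    """Hypothetical database-backed index component (illustrative only)."""

    def __init__(self, path):
        # isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit
        self.connection = sqlite3.connect(path, isolation_level=None)
        self.connection.execute("BEGIN")

        # DDL runs on the same connection, inside the already open transaction
        self.connection.execute("CREATE TABLE IF NOT EXISTS vectors (id INTEGER PRIMARY KEY, data TEXT)")

    def upsert(self, rows):
        # Insert new ids and replace existing ones, still within the open transaction
        self.connection.executemany("INSERT OR REPLACE INTO vectors (id, data) VALUES (?, ?)", rows)

    def count(self):
        return self.connection.execute("SELECT COUNT(*) FROM vectors").fetchone()[0]

    def save(self):
        # Persist everything since the last save, then start the next transaction
        self.connection.execute("COMMIT")
        self.connection.execute("BEGIN")

    def close(self):
        # Discard unsaved work and release the connection
        if self.connection:
            if self.connection.in_transaction:
                self.connection.execute("ROLLBACK")
            self.connection.close()
            self.connection = None


def recreate(current, path):
    # Always close the existing instance before creating a new one on the same database
    if current:
        current.close()
    return DatabaseANN(path)


path = os.path.join(tempfile.mkdtemp(), "index.db")

ann = DatabaseANN(path)
ann.upsert([(1, "a"), (2, "b")])
ann.save()

# These changes are never saved, so they are rolled back when the instance is closed
ann.upsert([(2, "b2"), (3, "c")])
ann = recreate(ann, path)

saved = ann.count()
print(saved)  # 2
ann.close()
```

Unsaved changes are discarded on `close`, which mirrors how a file-based component only changes on disk when it is saved.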