Improve performance when reloading a corpus
Closed · 1 comment
cmil commented
Currently, when a corpus is loaded from its Git repository, the TEI documents that already exist in that corpus are deleted one by one as cleanup before the files from the repo are loaded. For large corpora this can take a long time and can also slow down the database. To speed things up, the code below should be changed to remove the entire data collection at once.
Lines 81 to 87 in 88c5d29
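
To illustrate the idea only (this is not the project's code): here is a rough Python sketch that uses a plain directory as a stand-in for the corpus data collection. The function names `clear_one_by_one`, `clear_at_once` and `make_collection` are hypothetical, and the filesystem is just an assumption standing in for the database. It contrasts one delete call per document with a single call that drops the whole collection and recreates it empty.

```python
import shutil
import tempfile
from pathlib import Path


def clear_one_by_one(collection: Path) -> int:
    """Current approach: one delete operation per TEI document."""
    ops = 0
    # Materialize the listing first so we do not delete while iterating.
    for doc in list(collection.glob("*.xml")):
        doc.unlink()
        ops += 1
    return ops


def clear_at_once(collection: Path) -> int:
    """Proposed approach: remove the whole collection in a single operation."""
    shutil.rmtree(collection)
    collection.mkdir()  # recreate it empty, ready for the fresh files
    return 1


def make_collection(n: int) -> Path:
    """Hypothetical helper: build a throwaway collection with n documents."""
    coll = Path(tempfile.mkdtemp()) / "data"
    coll.mkdir()
    for i in range(n):
        (coll / f"doc{i}.xml").write_text("<TEI/>")
    return coll


if __name__ == "__main__":
    print(clear_one_by_one(make_collection(1000)), "delete operations")
    print(clear_at_once(make_collection(1000)), "delete operation")
```

In a real database the gain would presumably come from avoiding per-document overhead (index updates, locking, logging) rather than from the raw call count, so the actual speed-up would need to be measured there.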