Using BadgerDB: DropCollection returns "Txn is too big to fit into one request"
willie68 opened this issue · 3 comments
I just ran a simple test: I imported 1,000,000 simple documents and then called DropCollection.
It returns the error "Txn is too big to fit into one request".
I think the error comes from the Clover DropCollection implementation, which simply tries to delete all documents in one BadgerDB transaction. However, BadgerDB transactions are limited in size. In my case, the limit is reached at around 35,000 documents.
Specifically, the transaction size limit is 15% of the table size.
For more information: dgraph-io/badger#1325
Hey, @willie68, the quickest option I see is to delete documents in batches inside DropCollection().
However, such an approach would tie clover rather tightly to a specific storage engine, in this case badgerdb, so I'm not sure about the benefit of implementing it, as in general different storage engines may have very different characteristics and limitations.
Since you can achieve the same by just calling clover's Delete() method and selecting documents in batches (using offset and limit), I would suggest you do so.
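A minimal sketch of the batched approach, written against a hypothetical `store` interface rather than clover's actual API: `deleteBatch` stands in for a Delete call on a query restricted by a limit, and the in-memory `fakeStore` with its `maxTxn` cap only imitates a backend that rejects oversized transactions. Since each pass removes the first remaining documents, this variant needs a limit but no offset.

```go
package main

import (
	"errors"
	"fmt"
)

// errTxnTooBig imitates the backend error from this issue (hypothetical name).
var errTxnTooBig = errors.New("txn is too big to fit into one request")

// store is a hypothetical stand-in for a document store; it is not clover's API.
type store interface {
	// deleteBatch deletes up to limit documents in one transaction
	// and reports how many were deleted.
	deleteBatch(limit int) (int, error)
}

// fakeStore holds a number of documents and refuses any single
// transaction that would touch more than maxTxn of them.
type fakeStore struct {
	docs   int
	maxTxn int
}

func (s *fakeStore) deleteBatch(limit int) (int, error) {
	n := limit
	if n > s.docs {
		n = s.docs
	}
	if n > s.maxTxn {
		return 0, errTxnTooBig
	}
	s.docs -= n
	return n, nil
}

// dropInBatches empties the store by repeatedly deleting at most
// batchSize documents until nothing is left.
func dropInBatches(s store, batchSize int) (int, error) {
	if batchSize <= 0 {
		return 0, errors.New("batchSize must be positive")
	}
	total := 0
	for {
		n, err := s.deleteBatch(batchSize)
		if err != nil {
			return total, err
		}
		if n == 0 {
			return total, nil
		}
		total += n
	}
}

func main() {
	s := &fakeStore{docs: 1000000, maxTxn: 35000}

	// Deleting everything in one transaction fails, as DropCollection did.
	if _, err := s.deleteBatch(s.docs); err != nil {
		fmt.Println("single transaction:", err)
	}

	// Deleting in batches below the cap succeeds.
	deleted, err := dropInBatches(s, 10000)
	fmt.Println("batched:", deleted, err)
}
```

The batch size of 10,000 is an arbitrary value chosen to stay below the roughly 35,000 documents reported above; a real limit depends on document size and engine settings.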
Does this help you?
Thank you for the fast feedback.
First I tried bbolt, but it already failed at querying. Just a simple search with
results, err := db.FindAll(q.NewQuery(dbTable).Where(q.Field("datatime").Lt(queryTime)))
failed. (There was an index on datatime.) In addition, the import performance was not sufficient for 1,000,000 data records. That's why I am trying BadgerDB. (I have already used it successfully in other projects.)
#DropCollection Of course I could do this manually. Since I only have one collection at the moment, it's easier for unit tests to simply delete the database files from the file system. In the main application DropCollection will never be executed.
However, I think it is important that the functions offered work with all storage options.
Maybe you could extend the store interface and put the special implementation into store/badger/badger.go.
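One way this suggestion could look is an optional interface that only some engines implement, with a generic fallback for the rest. The sketch below is an assumption, not clover's actual store interface: `Store`, `PrefixDropper`, `dropCollection`, `memStore`, and `batchingStore` are all hypothetical names, and the in-memory maps only stand in for real engines.

```go
package main

import (
	"fmt"
	"strings"
)

// Store is a hypothetical minimal storage-engine interface.
type Store interface {
	Delete(key string) error
	Keys(prefix string) []string
}

// PrefixDropper is a hypothetical optional extension. An engine with
// its own limits (such as a transaction size cap) can implement it to
// remove a whole collection in an engine-specific way.
type PrefixDropper interface {
	DropPrefix(prefix string) error
}

// dropCollection prefers the engine-specific path when the engine
// offers one and falls back to deleting key by key otherwise.
func dropCollection(s Store, prefix string) error {
	if d, ok := s.(PrefixDropper); ok {
		return d.DropPrefix(prefix)
	}
	for _, k := range s.Keys(prefix) {
		if err := s.Delete(k); err != nil {
			return err
		}
	}
	return nil
}

// memStore is a generic in-memory engine without a special drop path.
type memStore struct{ data map[string]string }

func (m *memStore) Delete(key string) error {
	delete(m.data, key)
	return nil
}

func (m *memStore) Keys(prefix string) []string {
	var keys []string
	for k := range m.data {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	return keys
}

// batchingStore imitates an engine that supplies its own drop logic.
type batchingStore struct {
	memStore
	usedFastPath bool
}

func (b *batchingStore) DropPrefix(prefix string) error {
	b.usedFastPath = true
	// A real engine could commit in several smaller transactions here.
	for _, k := range b.Keys(prefix) {
		delete(b.data, k)
	}
	return nil
}

func main() {
	generic := &memStore{data: map[string]string{"c1/a": "1", "c1/b": "2", "c2/a": "3"}}
	special := &batchingStore{memStore: memStore{data: map[string]string{"c1/a": "1", "c2/a": "3"}}}

	fmt.Println(dropCollection(generic, "c1/"), len(generic.data))
	fmt.Println(dropCollection(special, "c1/"), len(special.data), special.usedFastPath)
}
```

With this shape the generic code path stays engine-agnostic, which addresses the coupling concern raised earlier in the thread, while each engine package remains free to work around its own limits.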
PS: I just found the cause of the time-query problem in bbolt, but the performance issue remains.
Inserting 100,000 simple records:
bbolt: 4m22.2910021s
badger: 2.9658109s
I have used BadgerDB, bblotdb, pepperDB, and leveldb to handle datasets on the order of 5 million records; apart from leveldb, all of them had various problems.