Batches in >=2.0.0 are significantly slower
BinarSkugga opened this issue · 9 comments
I believe there is an API change or a performance regression when using batches with 2.1.0.
Environment details
- OS type and version: Windows 10
- Python version: 3.8.7
- pip version: 20.3.3
- google-cloud-datastore version: 2.1.0
- google-api-core version: 1.24.1
Steps to reproduce
- Use DatastoreClient::put_multi with 500 entities inside a batch
Code example
with client.batch():
    client.put_multi(chunk)
Performance
When using 1.15.0, my code can handle 9 batches of 500 entities in about 1-2 seconds, while 2.1.0 takes literally minutes.
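For context, the batch counts above follow from splitting the key list into chunks of at most 500 entities, Datastore's per-commit limit. A minimal, self-contained sketch of that chunking (the `chunks` name and signature here are assumptions, modeled on the code shown later in this thread):

```python
from typing import Iterator, List, Sequence

MAX_BATCH_ENTITY = 500  # Datastore's per-commit entity limit


def chunks(items: Sequence, size: int = MAX_BATCH_ENTITY) -> Iterator[List]:
    """Yield successive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# 4130 entities split into 9 batches, matching the logs below
batches = list(chunks(list(range(4130))))
print(len(batches))     # 9
print(len(batches[0]))  # 500
print(len(batches[-1])) # 130 (the remainder)
```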
Disclaimer: I use the datastore emulator
EDIT: It can be reproduced in 2.0.0
Are you seeing this in both emulator and live endpoints?
I haven't tested on a live endpoint. I'm on vacation for the week; I'll definitely test this when I come back.
Thanks @BinarSkugga . I know we overhauled, rather substantially, the transport layer in the v1 -> v2. It would be very helpful to understand if you are seeing a slowdown on a live endpoint.
I am back from my vacation. I do not experience this behavior with a live endpoint. Should I move the issue to the emulator's repository?
EDIT: I can even measure an improvement in performance with the new >=2.0.
Additional potentially impactful information:
- The emulator doesn't log any errors.
- I run those batches in parallel using Python's ThreadPoolExecutor from the concurrent.futures package.
- Deleting entities also seems slightly slower, but it's negligible compared to writes and reads.
- It might be worth noting that I had to remove the 'indices' key from my Entity object, as the emulator doesn't seem to understand it. It tries to parse it as a column and fails because it contains a set().
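For context on the parallelism, here is a minimal sketch of the submit-one-task-per-batch pattern described above. The `thread_each` helper name is taken from the code later in this thread, and `fetch_chunk` is a pure-Python stand-in for the real Datastore `get_multi` call:

```python
from concurrent.futures import ThreadPoolExecutor


def thread_each(items, fn, *args):
    """Submit one task per item to a thread pool and collect results in order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, item, *args) for item in items]
        return [f.result() for f in futures]


def fetch_chunk(chunk, kind):
    # Stand-in for client.get_multi(chunk): echoes the keys back.
    return [(kind, key) for key in chunk]


chunks_of_keys = [[1, 2], [3, 4], [5]]
results = thread_each(chunks_of_keys, fetch_chunk, "LayoutEntity")
flat = [item for sublist in results for item in sublist]
print(len(flat))  # 5
```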
I may have spoken too soon. It also happens on live endpoints.
Logs from 1.15.x:
[SOLLUM][DATA][INFO][10:42:35]: Fetching 4130 'LayoutEntity' entities...
[SOLLUM][DATA][VERBOSE][10:42:35]: Using 9 'LayoutEntity' batches...
[SOLLUM][DATA][VERBOSE][10:42:36]: Fetched 'LayoutEntity' batch #1/9
[SOLLUM][DATA][VERBOSE][10:42:36]: Fetched 'LayoutEntity' batch #2/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #4/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #3/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #5/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #6/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #7/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #8/9
[SOLLUM][DATA][VERBOSE][10:42:37]: Fetched 'LayoutEntity' batch #9/9
Logs from 2.1.x:
[SOLLUM][DATA][INFO][14:35:42]: Fetching 4130 'LayoutEntity' entities...
[SOLLUM][DATA][VERBOSE][14:35:42]: Using 9 'LayoutEntity' batches...
[SOLLUM][DATA][VERBOSE][14:36:48]: Fetched 'LayoutEntity' batch #1/9
[SOLLUM][DATA][VERBOSE][14:39:31]: Fetched 'LayoutEntity' batch #2/9
[SOLLUM][DATA][VERBOSE][14:39:32]: Fetched 'LayoutEntity' batch #3/9
Here's the code that handles this:
def _do_fetch_chunk(chunk, client, kind, count, total):
    with client.batch():
        ents = client.get_multi(chunk)
    Loggers.DATA.verbose('Fetched \'{}\' batch #{}/{}'.format(kind, count.next(), total))
    return ents

count = AtomicNumber()
chunked = list(chunks(keys, config.MAX_BATCH_ENTITY))
Loggers.DATA.verbose('Using {} \'{}\' batches...'.format(len(chunked), kind))
entities = ConcurrentService.i().thread_each(chunked, _do_fetch_chunk, self.client, kind, count, len(chunked))
return [item for sublist in entities for item in sublist]
The concurrent service simply submits a new task per batch to the thread pool.
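The `AtomicNumber` used for the batch counter isn't shown; a plausible sketch (a guess at the interface, based only on the `count.next()` call above) is a lock-protected counter that is safe to share across the pool's threads:

```python
import threading


class AtomicNumber:
    """Thread-safe incrementing counter; next() returns 1, 2, 3, ..."""

    def __init__(self, start: int = 0):
        self._value = start
        self._lock = threading.Lock()

    def next(self) -> int:
        # The lock makes increment-and-read atomic across threads.
        with self._lock:
            self._value += 1
            return self._value


counter = AtomicNumber()
print(counter.next())  # 1
print(counter.next())  # 2
```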
Any insight on this ?
I have received another customer report. While not quite the difference stated here, 2-3x slower is about what the other instance was observing. It would be good to benchmark a few operations and see if there are any obvious gains that could be made.
Reference PR from PubSub that was able to get back some time: googleapis/python-pubsub@c29d7f8
I've been attempting to recreate this, but so far I'm seeing more or less similar outcomes for batch operations between versions 2.1.0 and 1.15.3. Querying may very well be slower, however.
First test run:
$ python datastore-test.py
[python-datastore@1.15.3] finished creation loop in 5.0 ms
[python-datastore@1.15.3] finished batch write in 1926.0 ms
[python-datastore@1.15.3] finished batch get in 391.0 ms
[python-datastore@1.15.3] finished batch update in 1706.0 ms
[python-datastore@1.15.3] finished batch query in 2216.0 ms
> loaded 3500 objects
Switch to version 2.1.0 and run again:
$ python datastore-test.py
[python-datastore@2.1.0] finished creation loop in 4.0 ms
[python-datastore@2.1.0] finished batch write in 2109.0 ms
[python-datastore@2.1.0] finished batch get in 588.0 ms
[python-datastore@2.1.0] finished batch update in 1701.0 ms
[python-datastore@2.1.0] finished batch query in 3702.0 ms
> loaded 4000 objects
I will continue to investigate, but for now, here's the testing script I'm using.
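The script itself isn't reproduced in this thread, but the output format suggests a timing wrapper along these lines (a sketch only; the label, version string, and workload are stand-ins, not the actual benchmark code):

```python
import time


def timed(label: str, version: str, fn, *args, **kwargs):
    """Run fn and print its wall-clock time in the format seen above."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"[python-datastore@{version}] finished {label} in {elapsed_ms:.1f} ms")
    return result


# Stand-in workload instead of a real Datastore call:
entities = timed("creation loop", "2.1.0", lambda: [{"id": i} for i in range(3500)])
print(len(entities))  # 3500
```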