Firestore emulator: resource starvation for concurrent operations on a single document
merlinnot opened this issue · 9 comments
I was instructed by @samtstern to post Firestore Emulator-related issues in this repository, so here we go:
Getting and setting the same reference (two concurrent operations) results in a lock. There are three possible outcomes:
- operations succeed after a long period of time (~30+ seconds)
- an UNKNOWN error is returned (code 2)
- a GOAWAY message is returned
The issue is highly reproducible:
$ time npx ts-node ./test.ts
2 UNKNOWN:
real 0m32.538s
user 0m3.862s
sys 0m0.311s
Emulator output:
API endpoint: http://0.0.0.0:8080
If you are using a library that supports the FIRESTORE_EMULATOR_HOST environment variable, run:
export FIRESTORE_EMULATOR_HOST=0.0.0.0:8080
Dev App Server is now running.
Mar 11, 2019 6:18:41 PM io.gapi.emulators.grpc.GrpcServer$3 operationComplete
INFO: Adding handler(s) to newly registered Channel.
Mar 11, 2019 6:18:41 PM io.gapi.emulators.netty.HttpVersionRoutingHandler channelRead
INFO: Detected HTTP/2 connection.
Mar 11, 2019 6:19:11 PM com.google.cloud.datastore.emulator.impl.util.WrappedStreamObserver onError
INFO: operation failed: null
Mar 11, 2019 6:19:11 PM com.google.cloud.datastore.emulator.impl.util.WrappedStreamObserver onError
INFO: operation failed: null
Repro:
import { Firestore } from '@google-cloud/firestore';
import { credentials } from 'grpc';

const firestore = new Firestore({
  'grpc.initial_reconnect_backoff_ms': 500,
  'grpc.max_reconnect_backoff_ms': 1000,
  port: 8080,
  projectId: 'test',
  servicePath: 'localhost',
  sslCreds: credentials.createInsecure(),
});

const ref = firestore.collection('collection').doc('doc');

const run = async () => {
  await Promise.all([ref.get(), ref.set({})]);
};

run().catch(x => console.error(x.message));
I can reproduce this, and I think I understand why this is happening. I have a fix prepared, and I'll ask some of the other backend engineers to take a look over it.
Fix submitted for the read + write contention.
There is still an issue with multiple writes introducing deadlock. For instance,
await Promise.all([ref.set({}), ref.set({})]);
will still trigger a deadlock that lasts until the 30-second timeout, even after the current fix. We have a more complete fix in mind, but it will ship in a later release.
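While the multi-write case remains open, test code can sidestep the deadlock by not issuing overlapping operations against one document. The sketch below is only an illustration of that idea and an assumption on my part, not something taken from the emulator or the client library: `serialized`, its `key` argument, and the `tails` map are hypothetical names.

```typescript
// Hypothetical workaround helper (not part of @google-cloud/firestore):
// queues operations per key so only one runs at a time for a given document.
const tails = new Map<string, Promise<unknown>>();

function serialized<T>(key: string, operation: () => Promise<T>): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve();
  // Start the next operation once the previous one has settled,
  // whether it succeeded or failed.
  const next = previous.then(operation, operation);
  // Keep a tail that never rejects, so one failure does not break the queue.
  tails.set(key, next.catch(() => undefined));
  return next;
}

// Usage with the `ref` from the repro (assumed to be in scope):
//   await Promise.all([
//     serialized(ref.path, () => ref.set({})),
//     serialized(ref.path, () => ref.set({})),
//   ]);
```

This only hides the contention from a single test process, and the map is never pruned, so it is suited to short-lived test runs rather than production code.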
Thanks for the update, appreciate it.
The behaviour is improved with v1.4.2, but it's not fixed. If you add more calls to Promise.all it still fails, although the emulator at least shows some error message: INFO: operation failed: transaction timeout.
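To compare runs with different numbers of calls, a small harness along these lines can record how each operation ended and how long the batch took. It is a rough sketch: `runConcurrently`, `Operation`, and `Summary` are hypothetical names introduced here, and nothing in it is specific to the emulator.

```typescript
// Hypothetical test harness: runs operations concurrently and summarizes
// how many succeeded, which errors were raised, and the total elapsed time.
type Operation = () => Promise<unknown>;

interface Summary {
  elapsedMs: number;
  fulfilled: number;
  errors: string[];
}

async function runConcurrently(operations: Operation[]): Promise<Summary> {
  const startedAt = Date.now();
  // Convert every outcome to a value so one rejection does not hide the others.
  const outcomes = await Promise.all(
    operations.map(op =>
      Promise.resolve()
        .then(op)
        .then(
          () => null,
          (e: unknown) => (e instanceof Error ? e.message : String(e)),
        ),
    ),
  );
  const errors = outcomes.filter((o): o is string => o !== null);
  return {
    elapsedMs: Date.now() - startedAt,
    fulfilled: outcomes.length - errors.length,
    errors,
  };
}

// Usage with the `ref` from the repro (assumed to be in scope):
//   const summary = await runConcurrently([
//     () => ref.get(),
//     () => ref.set({}),
//     () => ref.set({}),
//   ]);
//   console.log(summary);
```

Unlike a bare Promise.all, this waits for every operation to settle, so a slow success after a 30-second stall and an outright failure show up as different summaries.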
To clarify, do the failures you're seeing involve multiple writes?
Yes, here's a full minimal repro:
import { Firestore } from '@google-cloud/firestore';
import { credentials } from 'grpc';

const firestore = new Firestore({
  'grpc.initial_reconnect_backoff_ms': 500,
  'grpc.max_reconnect_backoff_ms': 1000,
  port: 8080,
  projectId: 'test',
  servicePath: 'localhost',
  sslCreds: credentials.createInsecure(),
});

const ref = firestore.collection('collection').doc('doc');

const run = async () => {
  await Promise.all([ref.set({}), ref.set({})]);
};

run().catch(x => console.error(x.message));
Okay, the multi-write deadlock case is a known issue, we haven't submitted the fix for it yet. I will update this issue when we do.
The fix for this has been submitted. I'll close this issue when the next release goes out.
Release went out a couple weeks ago, closing this out.