googleapis/python-bigquery

Deadlock due to race condition in download thread when using BigQuery Storage API

kien-truong opened this issue · 1 comments

There is a race condition between the download thread and the main thread when using the BigQuery Storage API to fetch data. The download thread runs this loop:

for page in rowstream.pages:
    if download_state.done:
        return
    item = page_to_item(page)
    worker_queue.put(item)

finally:
    # No need for a lock because reading/replacing a variable is
    # defined to be an atomic operation in the Python language
    # definition (enforced by the global interpreter lock).
    download_state.done = True
    # Shutdown all background threads, now that they should know to
    # exit early.
    pool.shutdown(wait=True)

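If the main thread exits while the download thread is blocked on worker_queue.put(item), the pool is shut down, but the download thread stays stuck: put() has no timeout, so the thread never gets back to the download_state.done check, and pool.shutdown(wait=True) keeps waiting for it. This prevents the program from exiting.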
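
Here is a minimal standalone sketch of one way the hang could be avoided: call put() with a timeout and re-check the done flag between attempts. It uses only the standard library. DownloadState, put_unless_done, download_stream, run_demo and poll_interval are hypothetical names for illustration, not the library's actual code.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class DownloadState:
    """Hypothetical stand-in for the shared state object in the snippets above."""

    def __init__(self):
        self.done = False


def put_unless_done(worker_queue, item, download_state, poll_interval=0.05):
    """Enqueue item, re-checking download_state.done between attempts.

    Returns True if the item was enqueued, False if the consumer signalled
    that it is done before space became available.
    """
    while not download_state.done:
        try:
            # A bounded wait, so the flag is re-checked at least every
            # poll_interval seconds instead of blocking forever.
            worker_queue.put(item, timeout=poll_interval)
            return True
        except queue.Full:
            continue
    return False


def download_stream(pages, worker_queue, download_state):
    """Simplified download loop: each page is enqueued as-is."""
    for page in pages:
        if not put_unless_done(worker_queue, page, download_state):
            return "stopped early"
    return "finished"


def run_demo():
    """Consume one item, then stop, mimicking a main thread that exits early."""
    download_state = DownloadState()
    worker_queue = queue.Queue(maxsize=1)  # small bound so put() fills up quickly
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(download_stream, range(10), worker_queue, download_state)
        first = worker_queue.get(timeout=5)
    finally:
        download_state.done = True
        # Returns promptly because the worker notices the flag on its next retry.
        pool.shutdown(wait=True)
    return first, future.result()


if __name__ == "__main__":
    print(run_demo())
```

With a plain worker_queue.put(item) in place of put_unless_done, the same demo would be expected to hang in pool.shutdown(wait=True), which matches the behavior described above.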

@chalmerlowe can you take a look at this issue and my corresponding PR #2034? thanks