2D Array causes `uproot.dask` to try to pickle a thread lock
gordonwatts opened this issue · 5 comments
The following crashes with the dump below, failing to pickle a thread lock. If one comments out the indicated lines that create a client, the code runs just fine. Inspecting the file, the culprit is a 2D branch (`jet_EnergyPerSampling`); the 1D branches do not exhibit this behavior. The data file is on CERNBox (thanks to @alexander-held).
import uproot
import awkward as ak
import multiprocessing
from dask.distributed import Client, LocalCluster
# Comment starting here and the script will work
n_workers = multiprocessing.cpu_count()
print(n_workers)
n_workers = 2
cluster = LocalCluster(
    n_workers=n_workers, processes=False, threads_per_worker=1
)
client = Client(cluster)
# End comment out block
data = uproot.dask({
    "/data/gwatts/test.root": "atlas_xaod_tree"
})
total = ak.count(data.jet_pt)
print(total.compute())
The crash:
(venv) [bash][gwatts]:idap-200gbps-atlas > python servicex/test.py
96
2024-04-05 23:43:58,648 - distributed.protocol.pickle - ERROR - Failed to serialize <ToPickle: HighLevelGraph with 2 layers.
<dask.highlevelgraph.HighLevelGraph object at 0x7f8890386e20>
0. <dask-awkward.lib.core.ArgsKwargsPackedFunction ob-37764e1a1a7ed10022ea822573c2855b
1. count-finalize-4f7cc442fca58343712aedce024e98e7
>.
Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
TypeError: cannot pickle '_thread.lock' object

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
TypeError: cannot pickle '_thread.lock' object

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/venv/lib/python3.9/site-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/venv/lib/python3.9/site-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
TypeError: cannot pickle '_thread.lock' object

Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
TypeError: cannot pickle '_thread.lock' object

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
TypeError: cannot pickle '_thread.lock' object

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 353, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/venv/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 76, in pickle_dumps
    frames[0] = pickle.dumps(
  File "/venv/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/venv/lib/python3.9/site-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/venv/lib/python3.9/site-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
TypeError: cannot pickle '_thread.lock' object

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/test.py", line 21, in <module>
    print(total.compute())
  File "/venv/lib/python3.9/site-packages/dask/base.py", line 375, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/venv/lib/python3.9/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/venv/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 379, in serialize
    raise TypeError(msg, str_x) from exc
TypeError: ('Could not serialize object of type HighLevelGraph', '<ToPickle: HighLevelGraph with 2 layers.\n<dask.highlevelgraph.HighLevelGraph object at 0x7f8890386e20>\n 0. <dask-awkward.lib.core.ArgsKwargsPackedFunction ob-37764e1a1a7ed10022ea822573c2855b\n 1. count-finalize-4f7cc442fca58343712aedce024e98e7\n>')
One additional piece of information: I'm assuming that there's something very specific to this file / how this branch is written out, as I can read the information in `jet_EnergyPerSampling` just fine out of the original PHYSLITE with the same `uproot.dask` approach. The file linked in this issue instead comes out of a ServiceX transform.
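One quick way to see whether the branch itself is the difference is to ask Uproot how it chooses to interpret it in this file. This is only a diagnostic sketch, reusing the file path and tree name from the reproducer above; if the 2D branch prints an `AsObjects(...)` interpretation, it is going through the AwkwardForth machinery discussed later in this thread:

```python
import uproot

# File and tree taken from the reproducer above.
with uproot.open("/data/gwatts/test.root") as f:
    tree = f["atlas_xaod_tree"]
    # A 1D branch: expected to get a plain numerical/jagged interpretation.
    print(tree["jet_pt"].interpretation)
    # The 2D branch: an AsObjects(...) result here means it is deserialized
    # through the object/AwkwardForth code path rather than a simple dtype.
    print(tree["jet_EnergyPerSampling"].interpretation)
```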
Another piece of information, thanks to a debugging suggestion from @lgray: when using `uproot.dask(..., open_files=False)`, the crash disappears and the output looks sensible again.
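Concretely, that means changing only the `uproot.dask` call in the reproducer, e.g. (a sketch; nothing else in the script changes):

```python
data = uproot.dask(
    {"/data/gwatts/test.root": "atlas_xaod_tree"},
    # Defer opening the files; per the observation above, this avoids the
    # pickling crash when a distributed Client is in use.
    open_files=False,
)

total = ak.count(data.jet_pt)
print(total.compute())
```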
I ran into this independently, and because the issue came up when cloudpickling `dask.base.unpack_collections.<locals>.repack`, I thought it was a Dask-core issue. But if setting `open_files=False` in Uproot fixes it, that makes it much more likely to be an Uproot issue.
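For anyone trying to pin this down without spinning up a cluster, one option is to cloudpickle the collection's graph directly; a sketch, assuming the lock is captured somewhere in the graph as the traceback suggests:

```python
import cloudpickle

# `data` built as in the reproducer above, with open_files left at its default.
graph = data.jet_pt.__dask_graph__()
cloudpickle.dumps(graph)  # expected to raise TypeError: cannot pickle '_thread.lock' object
```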
I think that #1200 solves this problem. It also explains why we hadn't seen it until now: the `AsObjects` Interpretation applies to `ElementLinks`; the lock is there to make sure that AwkwardForth code generators running on different threads don't compete in generating code; the code should be generated once by one thread and then used by all. The protection of this lock in serialization exactly follows the procedure for protecting another lock in the `Model` superclass.
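For readers unfamiliar with that pattern: the usual way to keep an unpicklable lock out of an object's serialized state is to drop it in `__getstate__` and recreate it in `__setstate__`. The class and attribute names below are purely illustrative and do not match Uproot's actual `Model`/`AsObjects` code:

```python
import threading


class ForthCodeHolder:
    """Illustrative only: the lock-exclusion pattern described above."""

    def __init__(self):
        self._generator_lock = threading.Lock()  # guards one-time code generation
        self._generated_code = None

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_generator_lock"]  # a _thread.lock cannot be pickled
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._generator_lock = threading.Lock()  # make a fresh lock on the receiving side
```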
Awesome - thanks for getting this fixed!