scikit-hep/uproot5

`uproot` cannot write root files over xrootd.

ikrommyd opened this issue · 7 comments

To reproduce:

import uproot

file = uproot.recreate("root://cmseos.fnal.gov//store/user/ikrommyd/test/dummy.root")
file["tree"] = {"branch": [1,2,3]}
file.close()

fails with:

OSError                                   Traceback (most recent call last)
Cell In[2], line 1
----> 1 uproot.recreate("root://cmseos.fnal.gov//store/user/ikrommyd/test/dummy.root")

File ~/miniforge3/envs/work/lib/python3.11/site-packages/uproot/writing/writable.py:110, in recreate(file_path, **options)
    106 file_path = uproot._util.regularize_path(file_path)
    107 storage_options = {
    108     key: value for key, value in options.items() if key not in recreate.defaults
    109 }
--> 110 sink = uproot.sink.file.FileSink(file_path, **storage_options)
    111 compression = options.pop("compression", create.defaults["compression"])
    113 initial_directory_bytes = options.pop(
    114     "initial_directory_bytes", create.defaults["initial_directory_bytes"]
    115 )

File ~/miniforge3/envs/work/lib/python3.11/site-packages/uproot/sink/file.py:53, in FileSink.__init__(self, urlpath_or_file_like, **storage_options)
     51 else:
     52     if not self._file_exists(urlpath_or_file_like, **storage_options):
---> 53         self._truncate_file(urlpath_or_file_like, **storage_options)
     55     self._open_file = fsspec.open(
     56         urlpath_or_file_like, mode="r+b", **storage_options
     57     )

File ~/miniforge3/envs/work/lib/python3.11/site-packages/uproot/sink/file.py:81, in FileSink._truncate_file(cls, urlpath, **storage_options)
     79 parent_directory = fs.sep.join(local_path.split(fs.sep)[:-1])
     80 fs.mkdirs(parent_directory, exist_ok=True)
---> 81 fs.touch(local_path, truncate=True)

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec/asyn.py:118, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    115 @functools.wraps(func)
    116 def wrapper(*args, **kwargs):
    117     self = obj or args[0]
--> 118     return sync(self.loop, func, *args, **kwargs)

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec/asyn.py:103, in sync(loop, func, timeout, *args, **kwargs)
    101     raise FSTimeoutError from return_result
    102 elif isinstance(return_result, BaseException):
--> 103     raise return_result
    104 else:
    105     return return_result

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec/asyn.py:56, in _runner(event, coro, result, timeout)
     54     coro = asyncio.wait_for(coro, timeout=timeout)
     55 try:
---> 56     result[0] = await coro
     57 except Exception as ex:
     58     result[0] = ex

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec_xrootd/xrootd.py:380, in XRootDFileSystem._touch(self, path, truncate, **kwargs)
    376     status, _ = await _async_wrap(self._myclient.truncate)(
    377         path, size=0, timeout=self.timeout
    378     )
    379     if not status.ok:
--> 380         raise OSError(f"File not touched properly: {status.message}")
    381 else:
    382     len = await self._info(path)

OSError: File not touched properly: [ERROR] Server responded with an error: [3013] Unable to truncate (null); Operation not supported

The below reproducers work fine locally but will fail if "dummy" becomes a remote xrootd path.

import awkward as ak
import dask_awkward as dak
import uproot

# Eager
array = ak.Array({"x":[1,2,3], "y":[4,5,6]})
file = uproot.recreate("dummy.root")
file["tree"] = array

# Lazy
x = dak.from_awkward(array, 3)
uproot.dask_write(x, "dummy", prefix="whatever", compute=True)

@jpivarski when you find some time, please look into this as well.
Also mentioning @martindurant cause I don't know if fsspec is relevant to this.

The exception seems to be coming from xrootd, not fsspec. I can't say why the file didn't truncate.

@amadio gave us an alternative in xrootd/xrootd#2304 (comment), which can replace

@classmethod
def _truncate_file(cls, urlpath: str, **storage_options) -> None:
"""
Args:
urlpath (str): A filesystem URL that specifies the file to truncate by fsspec.
Truncates the file to zero bytes. Creates parent directories if necessary.
"""
fs, local_path = fsspec.core.url_to_fs(urlpath, **storage_options)
parent_directory = fs.sep.join(local_path.split(fs.sep)[:-1])
fs.mkdirs(parent_directory, exist_ok=True)
fs.touch(local_path, truncate=True)

(which calls truncate in fsspec-xrootd). It is only ever called from

if not self._file_exists(urlpath_or_file_like, **storage_options):
self._truncate_file(urlpath_or_file_like, **storage_options)

@amadio's alternative involves opening the file with a different set of flags.

@maxgalli, do you think that can be implemented in Uproot, or does that fix need to go into fsspec-xrootd? I was at first thinking that we could do it, but after looking at the code references, it doesn't look like we have direct access to the XRootD.client.File object (anymore, now that we've adopted fsspec).

@amadio gave us an alternative in xrootd/xrootd#2304 (comment), which can replace

@classmethod
def _truncate_file(cls, urlpath: str, **storage_options) -> None:
"""
Args:
urlpath (str): A filesystem URL that specifies the file to truncate by fsspec.
Truncates the file to zero bytes. Creates parent directories if necessary.
"""
fs, local_path = fsspec.core.url_to_fs(urlpath, **storage_options)
parent_directory = fs.sep.join(local_path.split(fs.sep)[:-1])
fs.mkdirs(parent_directory, exist_ok=True)
fs.touch(local_path, truncate=True)

(which calls truncate in fsspec-xrootd). It is only ever called from

if not self._file_exists(urlpath_or_file_like, **storage_options):
self._truncate_file(urlpath_or_file_like, **storage_options)

@amadio's alternative involves opening the file with a different set of flags.

@maxgalli, do you think that can be implemented in Uproot, or does that fix need to go into fsspec-xrootd? I was at first thinking that we could do it, but after looking at the code references, it doesn't look like we have direct access to the XRootD.client.File object (anymore, now that we've adopted fsspec).

Hi @jpivarski, sorry for the late reply. It seems to me that implementing Guillerme's suggestion into fsspec-xrootd might be the way to go, but I'm not sure as I'm not familiar with the code yet. I will look into this in more detail in a few weeks from now, after thesis submission (I would thus keep the issue open for now if it is ok).

I'm confused, why do we need to have truncate=True in the touch command if it is only called when the file doesn't exist?

Since you later want to open a file with truncate on open, I would suggest to use that interface in fsspec-xrootd. Open the file with mode w once to touch/create/truncate it, then open it for r+b later.

Furthermore, I suspect interacting in read-write mode with a remote filesystem is not a great idea. We should encourage users to add the simplecache:: (see fsspec docs) prefix to the destination so a local temporary file can be opened, edited and finalized, before it is written once to the remote server on close.