scikit-hep/uproot5

`xxhash` is currently a strict dependency for reading RNTuples

ariostas opened this issue · 3 comments

xxhash is currently a strict dependency for reading RNTuples even though it's not listed in `pyproject.toml`. The dependent code is here:

    xxhash = uproot.extras.xxhash()
    computed_checksum = xxhash.xxh64(data).intdigest()
    if computed_checksum != expected_checksum:
        raise ValueError(
            f"""computed checksum {computed_checksum} didn't match expected checksum {expected_checksum}
    in file {chunk.source.file_path}"""
        )

which could simply be deleted to drop the dependency for reading.
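
A middle ground between keeping the hard dependency and deleting the check would be to verify the checksum only when the hashing package can be imported. This is a rough sketch of that idea, not Uproot's code: `verify_checksum` and its `module_name` parameter are hypothetical names, and the only library call it assumes is `xxh64(data).intdigest()`, the same one used in the snippet above.

```python
import importlib


def verify_checksum(data, expected_checksum, module_name="xxhash"):
    """Check `data` against `expected_checksum` if the hash module is available.

    Returns True when the checksum was verified, and None when the check
    was skipped because `module_name` could not be imported.
    (Hypothetical helper, for illustration only.)
    """
    try:
        hasher = importlib.import_module(module_name)
    except ImportError:
        # Optional dependency is missing: skip verification instead of failing.
        return None

    computed_checksum = hasher.xxh64(data).intdigest()
    if computed_checksum != expected_checksum:
        raise ValueError(
            f"computed checksum {computed_checksum} didn't match "
            f"expected checksum {expected_checksum}"
        )
    return True
```

The trade-off is that corrupted data would go unnoticed on installations without the package, so the caller may want to treat a `None` return differently from `True`.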

For writing, there will be a strict dependency and there is no way to get around it, but maybe we can make uproot.extras.xxhash pick between xxhash and ppxxh (which is pure Python, so it works in WASM).
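
To make the "pick between two packages" idea concrete, here is a minimal sketch of a selector that returns the first importable candidate. `pick_hash_module` is a hypothetical name, not the current `uproot.extras.xxhash`, and the sketch only imports the modules: it assumes, without checking, that both packages expose a compatible hashing interface.

```python
import importlib


def pick_hash_module(candidates=("xxhash", "ppxxh")):
    """Return (name, module) for the first candidate that can be imported.

    Hypothetical helper: the candidate order encodes the preference
    (compiled package first, pure-Python fallback second).
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ModuleNotFoundError(
        "none of these hashing packages could be imported: "
        + ", ".join(candidates)
    )
```

Returning the name alongside the module means the caller can log or report which backend was selected, so the choice is at least visible rather than silent.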

Is the xxhash algorithm available in cramjam, which is one of Uproot's strict requirements?

Darn, it's not: milesgranger/cramjam#147

But since xxhash is in Pyodide now (ifduyue/python-xxhash#65), we can make xxhash a strict requirement for Uproot without causing installation issues for any use-cases that I can think of.

When that goes in (when xxhash becomes a strict requirement for Uproot), we'll need a new minor version number, 5.4.0.

(I'd rather not switch between libraries like xxhash and ppxxh implicitly because if there's a bug in one of them, it would be hard to reproduce. Two people could install Uproot the same way and get different libraries, one with the bug and one without. Understanding why that's happening—if we don't remember that uproot.extras.xxhash performs that switch—would be hard.)

Oh interesting, I didn't realize that xxhash was now on Pyodide. That's pretty convenient.