Wrong interaction between VariationsFor and no variations in distributed RDataFrame
Closed this issue · 0 comments
vepadulano commented
Check duplicate issues.
- Checked for duplicates
Description
If a branch of the computation graph has no variations and the user calls VariationsFor on an action, the resulting map should contain only the nominal value.
In distributed mode this case is handled incorrectly: the mapper task fails while retrieving the mergeable values of the booked actions, leading to segfaults and errors such as:
  File "/home/vpadulan/programs/rootproject/rootbuild/distrdf-debug/lib/DistRDF/Backends/Base.py", line 99, in distrdf_mapper
    mergeables = get_mergeable_values(rdf_plus.rdf, current_range.id, computation_graph_callable,
    ^^^^^^^^^^^
  File "/home/vpadulan/programs/rootproject/rootbuild/distrdf-debug/lib/DistRDF/Backends/Base.py", line 59, in get_mergeable_values
    mergeables = [Utils.get_mergeablevalue(action) for action in actions]
    ^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/functools.py", line 929, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
    ^^^^^^^^^^^^^^^
  File "/home/vpadulan/programs/rootproject/rootbuild/distrdf-debug/lib/DistRDF/Backends/Utils.py", line 164, in get_mergeablevalue
    return ROOT.Detail.RDF.GetMergeableValue(resultptr)
    ^^^^^^^^^^^^^^^^^
TypeError: Template method resolution failed:
  unique_ptr<ROOT::Detail::RDF::RMergeableValue<TH1D>,default_delete<ROOT::Detail::RDF::RMergeableValue<TH1D> > > ROOT::Detail::RDF::GetMergeableValue(ROOT::RDF::RResultPtr<TH1D>& rptr) =>
    TypeError: could not convert argument 1
  unique_ptr<ROOT::Detail::RDF::RMergeableVariations<TH1D>,default_delete<ROOT::Detail::RDF::RMergeableVariations<TH1D> > > ROOT::Detail::RDF::GetMergeableValue(ROOT::RDF::Experimental::RResultMap<TH1D>& rmap) =>
    SegmentationViolation: segfault in C++; program state was reset
  Failed to instantiate "GetMergeableValue(ROOT::RDF::Experimental::RResultMap<TH1D>*)"
  Failed to instantiate "GetMergeableValue(ROOT::RDF::Experimental::RResultMap<TH1D>)"
FYI @gpetruc
Reproducer
import ROOT
from dask.distributed import Client, LocalCluster

RunGraphs = ROOT.RDF.Experimental.Distributed.RunGraphs
VariationsFor = ROOT.RDF.Experimental.Distributed.VariationsFor
RDataFrame = ROOT.RDF.Experimental.Distributed.Dask.RDataFrame


def create_connection() -> Client:
    # Single-process, single-thread local Dask cluster
    cluster = LocalCluster(n_workers=1, threads_per_worker=1, processes=True)
    client = Client(cluster)
    return client


def run(conn):
    # No Vary call is booked anywhere in this graph
    h = RDataFrame(1, daskclient=conn).Define("x", "42").Histo1D(("h","h",1,0,10),"x")
    vars = VariationsFor(h)
    h.GetValue()
    print(f"{h.GetEntries()=},{vars.GetKeys()=}")


if __name__ == "__main__":
    with create_connection() as conn:
        run(conn)
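For comparison, here is a minimal sketch of the same graph in local (non-distributed) RDataFrame, where, per the description above, the map is expected to expose only the nominal key. The helper name local_variation_keys is hypothetical and used only for illustration, and the guarded import is an assumption added so that the snippet can still be loaded on a machine without ROOT (it then returns None).

```python
# Sketch: local counterpart of the distributed reproducer (no Dask involved).
try:
    import ROOT  # needs a ROOT installation with PyROOT enabled
except ImportError:
    ROOT = None  # assumption: tolerate a missing ROOT so the snippet still loads


def local_variation_keys():
    """Return the keys of VariationsFor(h) for a graph with no Vary call.

    Hypothetical helper for illustration; returns None if ROOT is unavailable.
    """
    if ROOT is None:
        return None
    # Same computation graph as the reproducer, built with the local RDataFrame
    h = ROOT.RDataFrame(1).Define("x", "42").Histo1D(("h", "h", 1, 0, 10), "x")
    variations = ROOT.RDF.Experimental.VariationsFor(h)
    # Convert the std::string keys to plain Python strings
    return [str(key) for key in variations.GetKeys()]


if __name__ == "__main__":
    print(local_variation_keys())
```

If the local run lists only the nominal key while the distributed run crashes, the problem is confined to the distributed code path shown in the traceback.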
ROOT version
Any
Installation method
Any
Operating system
Any
Additional context
No response