Suggestion: add dims to signal descriptor

Question

Suggestion: add dims to signal descriptor

Opened this issue 5 months ago · 5 comments

Summary

While many signals are scalar-valued, signals that are array-like may have a notion of axis values. This could be the spectrum from an MCA, that has energy values associated with each bin, or an imaging detector that can report pixel position, or a spectrometer that reports angle, etc.

Currently, the best way to put axis information in would be as a configuration signal, so that the axis values are read once, and then stored. However, there is no [standard] way to link the axis values to the actual data signal.

I propose modifying Signal to take a dims argument at instantiation that would be automatically put into the descriptor document in order to link the axis values to the data values.

Detailed Rationale

Why Modify Signal?

event-model already has a field for dims in the event descriptor document. So given a detector

class MyDetector(Device):
    data_signal=Cpt(Signal, ..., )
    axis_signal = Cpt(Signal, ..., kind='configuration')

it would be possible to override describe in the following way

def describe(self):
    desc = super().describe()
    desc['data_signal']['dims'] = [self.axis_signal.name]
    return desc

However, this would lead to a lot of boilerplate that also needs to be updated if any of the signals change. Some devices may have different modes that grab different axis values. (i.e, a signal that could come in as a 2-d image, or a 1-d summed histogram, depending on the device mode).

A cleaner implementation

I believe it would be easier for ophyd object-writers if Signal took a dims argument, so that the same class could look like:

class MyDetector(Device):
    data_signal=Cpt(Signal, ..., dims=['axis_signal'])
    axis_signal = Cpt(Signal, ..., kind='configuration')

Then, the describe code in Signal itself could be as follows:

def describe(self):
    ...
    desc['dims'] = [getattr(self.parent, dim_name).name for dim_name in self.dims]
   return desc

More code would be required to check if Signal even has a parent, deal with missing attributes, etc. Sane default behavior is up for discussion. If no dims is passed into Signal, the best behavior would be to not add any dims key to the descriptor.

For mode-switching detectors, the data_signal.dims attribute could be modified as necessary at any time before describe() is called.

How this would be useful

Once dims is in the descriptor, for a device mydet = MyDetector(...), the key run.primary.descriptors[0]['data_keys']['mydet_data_signal']['dims'] would have the value mydet_axis_signal.

This can then be automatically followed to run.primary.descriptors[0]['configuration']['mydet']['mydet_axis_signal'] to get the axis values, without any "hard-coding" of the relationship. The axis values for any signal following this convention could be automatically grabbed, whether for data export or plotting.

Intent

I would love to get feedback on this idea, and if desired, submit a PR to add this functionality to Signal

@tacaswell @awalter-bnl

Answer 1 · 2024-07-13T12:35:30.000Z

From my perspective I like this approach. It generalizes well to 2D and even 3D detectors by just changing the length of the list returned by axis_signal.dims. I have a particular detector that we will implement soon (next few years) that, depending on the mode, can return either 1D, 2D or 3D data and where the 'axes dimensions' for each of these can also vary depending on the mode giving 2 possible 1D, 4 2D or 2 3D modes (for a total of 8 different options for 'dims') from the same detector. I can see this approach will generalize well by connecting a change in the 'mode' Signal of the detector ophyd object to an update of the axis_signal.dims attribute as described by @cjtitus.

Answer 2 · 2024-07-15T05:49:13.000Z

I think this is a pretty cool idea.

I'm interested what @coretl thinks and whether it's something that's already offered or planned in ophyd-async

Answer 3 · 2024-07-15T10:45:29.000Z

Thanks for the write-up, and for bringing this to my attention, I'd missed that you can label dims in event-model!

One question on the above is why you separate the value of the axis_signal (returned in read_configuration) from the dims link to that signal (returned in describe)? My understanding is that describe_configuration, read_configuration and describe are all called at the same time, so the only benefit is if this axis signal is used as a label for many channels. I will assume that this is the case for the rest of my answer.

The current situation in ophyd-async is that you can put what you like in Device.describe, but there is nothing to explicitly help you define dims. To provide better support we can split into 2 categories:

File-writing detectors based on StandardDetector have some logic to create the DataKey, and already take the idea of a NameProvider and ShapeProvider to feed into this. We could extend ShapeProvider to also provide labels for this shape which could be used for dims
PV reading detectors based on StandardReadable take helpers such as HintedSignal that add metadata to an existing signal read. We could make a DimsSignal that added this information for one or many signals, with caching added as appropriate

Answer 4 · 2024-07-23T12:44:14.000Z

I may be wrong, but I think this question is also related to #1193 and #1178 - only talking about Ophyd v1 (this is what I work with), I propose to give some more defined description to Signal using a NumericValueInfo with shape, data type and default value ; it can be extended with dims too or whatever information would be needed for a Signal. This information would be returned by describe()

Answer 5 · 2024-11-12T17:34:14.000Z

Thanks to #1194 , it is now possible to specify the shape of data (hence, dimensions if we consider ndims = len(shape) ) as a keyword argument to Signal constructor:

>>> from ophyd.signal import Signal
>>> s=Signal(name="mysignal", value=[[0,1],[2,3]], dtype=float, shape=(2, 2))
>>> s.describe()
{'mysignal': {'source': 'SIM:mysignal', 'dtype': 'array', 'shape': [2, 2], 'dtype_numpy': 'float64'}}