Suggestion: add dims to signal descriptor
Opened this issue · 5 comments
Summary
While many signals are scalar-valued, signals that are array-like may have a notion of axis values. This could be the spectrum from an MCA, that has energy values associated with each bin, or an imaging detector that can report pixel position, or a spectrometer that reports angle, etc.
Currently, the best way to put axis information in would be as a configuration signal, so that the axis values are read once, and then stored. However, there is no [standard] way to link the axis values to the actual data signal.
I propose modifying Signal
to take a dims
argument at instantiation that would be automatically put into the descriptor document in order to link the axis values to the data values.
Detailed Rationale
Why Modify Signal?
event-model
already has a field for dims
in the event descriptor document. So given a detector
class MyDetector(Device):
data_signal=Cpt(Signal, ..., )
axis_signal = Cpt(Signal, ..., kind='configuration')
it would be possible to override describe
in the following way
def describe(self):
desc = super().describe()
desc['data_signal']['dims'] = [self.axis_signal.name]
return desc
However, this would lead to a lot of boilerplate that also needs to be updated if any of the signals change. Some devices may have different modes that grab different axis values. (i.e, a signal that could come in as a 2-d image, or a 1-d summed histogram, depending on the device mode).
A cleaner implementation
I believe it would be easier for ophyd object-writers if Signal
took a dims
argument, so that the same class could look like:
class MyDetector(Device):
data_signal=Cpt(Signal, ..., dims=['axis_signal'])
axis_signal = Cpt(Signal, ..., kind='configuration')
Then, the describe code in Signal
itself could be as follows:
def describe(self):
...
desc['dims'] = [getattr(self.parent, dim_name).name for dim_name in self.dims]
return desc
More code would be required to check if Signal
even has a parent, deal with missing attributes, etc. Sane default behavior is up for discussion. If no dims
is passed into Signal
, the best behavior would be to not add any dims
key to the descriptor.
For mode-switching detectors, the data_signal.dims
attribute could be modified as necessary at any time before describe()
is called.
How this would be useful
Once dims
is in the descriptor, for a device mydet = MyDetector(...)
, the key run.primary.descriptors[0]['data_keys']['mydet_data_signal']['dims']
would have the value mydet_axis_signal
.
This can then be automatically followed to run.primary.descriptors[0]['configuration']['mydet']['mydet_axis_signal']
to get the axis values, without any "hard-coding" of the relationship. The axis values for any signal following this convention could be automatically grabbed, whether for data export or plotting.
Intent
I would love to get feedback on this idea, and if desired, submit a PR to add this functionality to Signal
From my perspective I like this approach. It generalizes well to 2D and even 3D detectors by just changing the length of the list returned by axis_signal.dims
. I have a particular detector that we will implement soon (next few years) that, depending on the mode, can return either 1D, 2D or 3D data and where the 'axes dimensions' for each of these can also vary depending on the mode giving 2 possible 1D, 4 2D or 2 3D modes (for a total of 8 different options for 'dims') from the same detector. I can see this approach will generalize well by connecting a change in the 'mode' Signal
of the detector ophyd object to an update of the axis_signal.dims
attribute as described by @cjtitus.
I think this is a pretty cool idea.
I'm interested what @coretl thinks and whether it's something that's already offered or planned in ophyd-async
Thanks for the write-up, and for bringing this to my attention, I'd missed that you can label dims in event-model!
One question on the above is why you separate the value of the axis_signal (returned in read_configuration
) from the dims link to that signal (returned in describe
)? My understanding is that describe_configuration
, read_configuration
and describe
are all called at the same time, so the only benefit is if this axis signal is used as a label for many channels. I will assume that this is the case for the rest of my answer.
The current situation in ophyd-async
is that you can put what you like in Device.describe
, but there is nothing to explicitly help you define dims
. To provide better support we can split into 2 categories:
- File-writing detectors based on
StandardDetector
have some logic to create the DataKey, and already take the idea of aNameProvider
andShapeProvider
to feed into this. We could extendShapeProvider
to also provide labels for this shape which could be used fordims
- PV reading detectors based on
StandardReadable
take helpers such asHintedSignal
that add metadata to an existing signal read. We could make aDimsSignal
that added this information for one or many signals, with caching added as appropriate
I may be wrong, but I think this question is also related to #1193 and #1178 - only talking about Ophyd v1 (this is what I work with), I propose to give some more defined description to Signal using a NumericValueInfo
with shape, data type and default value ; it can be extended with dims too or whatever information would be needed for a Signal. This information would be returned by describe()
Thanks to #1194 , it is now possible to specify the shape of data (hence, dimensions if we consider ndims = len(shape)
) as a keyword argument to Signal constructor:
>>> from ophyd.signal import Signal
>>> s=Signal(name="mysignal", value=[[0,1],[2,3]], dtype=float, shape=(2, 2))
>>> s.describe()
{'mysignal': {'source': 'SIM:mysignal', 'dtype': 'array', 'shape': [2, 2], 'dtype_numpy': 'float64'}}