catalystneuro/neuroconv

[Bug]: OpenEphysRecordingInterface doesnt have get_stream_names

Closed this issue · 5 comments

What happened?

OpenEphysRecordingInterface doesn't have a get_stream_names() method, even though it requires stream_names.

Steps to Reproduce

stream_names = OpenEphysRecordingInterface.get_stream_names(ephys_folder_path)

Traceback

Traceback (most recent call last):
  File "/Users/pauladkisson/Documents/CatalystNeuro/SchneiderConv/schneider-lab-to-nwb/src/schneider_lab_to_nwb/schneider_2024/schneider_2024_convert_session.py", line 78, in <module>
    main()
  File "/Users/pauladkisson/Documents/CatalystNeuro/SchneiderConv/schneider-lab-to-nwb/src/schneider_lab_to_nwb/schneider_2024/schneider_2024_convert_session.py", line 71, in main
    session_to_nwb(
  File "/Users/pauladkisson/Documents/CatalystNeuro/SchneiderConv/schneider-lab-to-nwb/src/schneider_lab_to_nwb/schneider_2024/schneider_2024_convert_session.py", line 30, in session_to_nwb
    stream_names = OpenEphysRecordingInterface.get_stream_names(ephys_folder_path)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'OpenEphysRecordingInterface' has no attribute 'get_stream_names'

Operating System

macOS

Python Executable

Conda

Python Version

3.12

Package Versions

No response

Code of Conduct

Why do we need this? Spikeinterface interfaces should not require stream exposure in most cases. Can you explain me where this is coming from and some context?

Related:
#1023

For the Schneider Lab Conversion I am converting some multi-stream OpenEphys data, and I naively used the generic router OpenEphysRecordingInterface without specifying a stream_name, which threw an error bc I needed to specify one of the two streams.

But then to actually access the stream names I had to use the specific OpenEphysLegacyRecordingInterface bc OpenEphysRecordingInterface doesn't have a get_streams method defined.

I see. That make sense.

And what streams are available on your data? Do both streams correspond to data that should be written with a RecordingInterface (acquisition or processed things that should be an ElectricalSeries)?

The reason that I am asking IS that in a lot of cases there is only one or two streams that should be written as ElectricalSeries. Exposing the stream as an argument in neuroconv opens the door for people to write data with the incorrect data type in nwb which we don't want to enable. See this:

#994

And I was discussing with Sam this morning and the stream of OpenEphys is probably not narrow enough in most cases so you should be guarded against this case: #1023.

That is all good context to be aware of, thanks.

The streams on my data are "Signals CH" (neural data) and "Signals AUX" (some kind of input signal), so probably only Signals CH should be written as an ElectricalSeries. But I think it's reasonable to expect a user (at least an educated one) to be able to select the streams appropriately.

I think that long term most formats should fix the stream as here:

class IntanRecordingInterface(BaseRecordingExtractorInterface):
"""
Primary data interface class for converting Intan data using the
:py:class:`~spikeinterface.extractors.IntanRecordingExtractor`.
"""
display_name = "Intan Recording"
associated_suffixes = (".rhd", ".rhs")
info = "Interface for Intan recording data."
stream_id = "0" # This are the amplifier channels, corresponding to the stream_name 'RHD2000 amplifier channel'
@classmethod
def get_source_schema(cls) -> dict:
source_schema = super().get_source_schema()
source_schema["properties"]["file_path"]["description"] = "Path to either a .rhd or a .rhs file"
return source_schema
def __init__(
self,
file_path: FilePath,
stream_id: Optional[str] = None,
verbose: bool = True,
es_key: str = "ElectricalSeries",
ignore_integrity_checks: bool = False,
):
"""
Load and prepare raw data and corresponding metadata from the Intan format (.rhd or .rhs files).
Parameters
----------
file_path : FilePathType
Path to either a rhd or a rhs file
stream_id : str, optional
The stream of the data for spikeinterface, "0" by default.
verbose : bool, default: True
Verbose
es_key : str, default: "ElectricalSeries"
ignore_integrity_checks, bool, default: False.
If True, data that violates integrity assumptions will be loaded. At the moment the only integrity
check performed is that timestamps are continuous. If False, an error will be raised if the check fails.
"""
if stream_id is not None:
warnings.warn(
"Use of the 'stream_id' parameter is deprecated and it will be removed after September 2024.",
DeprecationWarning,
)
self.stream_id = stream_id
else:
self.stream_id = "0" # These are the amplifier channels or to the stream_name 'RHD2000 amplifier channel'
init_kwargs = dict(
file_path=file_path,
stream_id=self.stream_id,
verbose=verbose,
es_key=es_key,
all_annotations=True,
)
neo_version = get_package_version(name="neo")
spikeinterface_version = get_package_version(name="spikeinterface")
if neo_version < Version("0.13.1") or spikeinterface_version < Version("0.100.10"):
if ignore_integrity_checks:
warnings.warn(
"The 'ignore_integrity_checks' parameter is not supported for neo versions < 0.13.1. "
"or spikeinterface versions < 0.101.0.",
UserWarning,
)
else:
init_kwargs["ignore_integrity_checks"] = ignore_integrity_checks
super().__init__(**init_kwargs)

As I know that format well I am highly confident that that's the only stream that you should be written as an ElectricalSeries. With OpenEphys I am less certain if that is the case. I will be discussing with Alessio this Friday who knows the format well and inquire.

I really don't want to enable the possiblity of writting wrong data types with our library I think that stream should not be accessible in most cases (there are exceptions).

That said, right now a lot of interfaces have a get_streams method, right? So your PR just makes things uniform.

I just wanted to make you aware, I will probably make a separate issue where I will re-write in a more organized form the considerations above so everyone can provide feedback because long-term I think streams should not be a concern for users of neuroconv.