running extract_data_set_features on some PatchSeq datasets fails
danielsf opened this issue · 1 comments
Describe the bug
There are some nwb files in the PatchSeq dataset which cause extract_data_set_features to crash.
The error seems to originate in SweepSet.align_to_start_of_epoch in ipfx/sweep.py. Some sweeps in the offending files have 'experiment': None, 'stim': None in their epochs, which causes a warning at line 89,
start_idx, end_idx = sweep.epochs[epoch_name]
when epoch_name='experiment', which crashes out of the entire method.
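The unpacking failure can be reproduced without ipfx. The sketch below uses a plain dict as a stand-in for the `epochs` attribute of an offending sweep; `epoch_bounds` is a hypothetical helper written for illustration, not part of ipfx, and the guard it shows is only one possible way to handle a missing epoch.

```python
# Stand-in for sweep.epochs on an offending sweep (illustrative, not ipfx code).
epochs = {"experiment": None, "stim": None}


def epoch_bounds(epochs, epoch_name):
    """Return (start_idx, end_idx), or None when the epoch is missing or None."""
    bounds = epochs.get(epoch_name)
    if bounds is None:
        return None
    start_idx, end_idx = bounds
    return start_idx, end_idx


# Unguarded unpacking, as in the line quoted above, raises TypeError on None.
try:
    start_idx, end_idx = epochs["experiment"]
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
print(epoch_bounds(epochs, "experiment"))
print(epoch_bounds({"experiment": (100, 2000)}, "experiment"))
```

Whether a sweep with a None epoch should be skipped, as the guard does here, or rejected earlier as bad data is the open question of this issue.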
To Reproduce
This code will download one of the offending files (four offending files are listed, if you want to try all of them) and run the feature extraction. I am open to the possibility that this is a problem with the data, rather than the code. I'm not sure if we should expect a sweep in an nwb file to have 'experiment': None.
from ipfx.dataset.create import create_ephys_data_set
from ipfx.data_set_features import extract_data_set_features
import urllib.request
import os
url0 = 'https://girder.dandiarchive.org/api/v1/item/5edb2cb42dace54b6f9b35f6/download'
url1 = 'https://girder.dandiarchive.org/api/v1/item/5ee84eed17a31a38dab096f2/download'
url2 = 'https://girder.dandiarchive.org/api/v1/item/5edb2e902dace54b6f9b3aa1/download'
url3 = 'https://girder.dandiarchive.org/api/v1/item/5ee84cdb17a31a38dab09229/download'
url_to_tmp = []
url_to_tmp.append((url0, 'nwb_tmp0.nwb'))
url_to_tmp.append((url1, 'nwb_tmp1.nwb'))
url_to_tmp.append((url2, 'nwb_tmp2.nwb'))
url_to_tmp.append((url3, 'nwb_tmp3.nwb'))
url, tmp = url_to_tmp[0]
if not os.path.exists(tmp):
    urllib.request.urlretrieve(url, tmp)
dataset = create_ephys_data_set(tmp)
features = extract_data_set_features(dataset)
Expected behavior
I'm not sure, but I had hoped to just be able to run extract_data_set_features on the PatchSeq datasets to get features for our web-app.
Actual Behavior
Here is the traceback I get
WARNING:root:cannot unpack non-iterable NoneType object
Traceback (most recent call last):
  File "test_ephys_failure.py", line 24, in <module>
    features = extract_data_set_features(dataset)
  File "/Users/scott.daniel/AllenInstitute/miniconda3/envs/allen_sdk/lib/python3.7/site-packages/IPFX-1.0.1-py3.7.egg/ipfx/data_set_features.py", line 378, in extract_data_set_features
    sweep_features[s['sweep_number']]['peak_deflect'] = s['peak_deflect']
KeyError: 4
sweep_features is empty because the error in SweepSet.align_to_start_of_epoch causes extract_sweep_features to exit early (I think).
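The suspected chain of events can be sketched in isolation. This is a guess at the mechanism, not a copy of the ipfx code: `extract_sweep_features_sketch` and the dict-based sweeps are hypothetical stand-ins, and the only assumption is that the TypeError is caught and logged rather than re-raised.

```python
import logging


def extract_sweep_features_sketch(sweeps):
    """Hypothetical stand-in: log the error and return whatever was collected."""
    sweep_features = {}
    try:
        for sweep in sweeps:
            # Fails with TypeError when the epoch is None.
            start_idx, end_idx = sweep["epochs"]["experiment"]
            sweep_features[sweep["sweep_number"]] = {
                "start": start_idx,
                "end": end_idx,
            }
    except TypeError as err:
        # The error is swallowed, so the caller only sees a warning.
        logging.warning(err)
    return sweep_features


sweeps = [{"sweep_number": 4, "epochs": {"experiment": None, "stim": None}}]
sweep_features = extract_sweep_features_sketch(sweeps)

# The caller then assumes every sweep number has an entry.
try:
    sweep_features[4]["peak_deflect"] = 0.0
    key_error = None
except KeyError as err:
    key_error = err

print(sweep_features, repr(key_error))
```

Under that assumption the warning and the later KeyError are the same problem seen at two different points.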
Environment (please complete the following information):
- OS & version: OSX 10.15.5
- Python version 3.7.9
- AllenSDK version 2.2.0
Do you want to work on this issue?
I am willing to work on this, but I am not sure what the code should do in this case (i.e. is the data ill-formed, or should this code run).
Sorry. Running the dataset through ipfx.utilities.drop_failed_sweeps() resolves this issue.
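For reference, a minimal sketch of the reproduction script with that step added. The wrapper name `extract_features_after_qc` is made up for this example, and calling `drop_failed_sweeps` with only the dataset (so that it uses its default QC settings) is an assumption about how it was invoked.

```python
def extract_features_after_qc(nwb_path):
    """Run feature extraction on an NWB file after dropping QC-failed sweeps."""
    # Imports are local so this sketch can be loaded without ipfx installed.
    from ipfx.dataset.create import create_ephys_data_set
    from ipfx.data_set_features import extract_data_set_features
    from ipfx.utilities import drop_failed_sweeps

    dataset = create_ephys_data_set(nwb_path)
    # Assumption: called with its defaults, before feature extraction.
    drop_failed_sweeps(dataset)
    return extract_data_set_features(dataset)
```

With the file from the reproduction above, this would be called as `features = extract_features_after_qc('nwb_tmp0.nwb')`.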