tables.exceptions.HDF5ExtError: Problems creating the Array
2533245542 opened this issue · 2 comments
Hi,
I installed MIMIC_Extract and was able to do a test run with population=100 (by calling mimic_direct_extract.py). However, when I tried to run it on the complete population, I got an error.
Here is a traceback of the error. Any idea how to fix this? I installed the MIMIC database with Docker, so MIMIC_Extract was running inside a Docker container, but storage space (>100GB left on the device) and available memory (>50GB) are not an issue here.
No known ranges for Basophils
No known ranges for pH urine
Glucose had 528 / 863595 rows cleaned:
8 rows were strict outliers, set to np.nan
520 rows were low valid outliers, set to 33.00
0 rows were high valid outliers, set to 2000.00
No known ranges for Systemic Vascular Resistance
Height had 12 / 15182 rows cleaned:
8 rows were strict outliers, set to np.nan
0 rows were low valid outliers, set to 0.00
4 rows were high valid outliers, set to 240.00
Sodium had 22 / 425997 rows cleaned:
0 rows were strict outliers, set to np.nan
20 rows were low valid outliers, set to 50.00
2 rows were high valid outliers, set to 225.00
No known ranges for Lymphocytes ascites
Anion gap had 130 / 208219 rows cleaned:
9 rows were strict outliers, set to np.nan
108 rows were low valid outliers, set to 5.00
13 rows were high valid outliers, set to 50.00
Shape of X : (2200954, 312)
mimic_direct_extract.py:303: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
np.save(os.path.join(outPath, subjects_filename), data['subject_id'].as_matrix())
mimic_direct_extract.py:305: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
np.save(os.path.join(outPath, times_filename), data['max_hours'].as_matrix())
mimic_direct_extract.py:324: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
if dynamic_filename is not None: np.save(os.path.join(outPath, dynamic_filename), X.as_matrix())
/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/tables/attributeset.py:475: NaturalNameWarning: object name is not a valid Python identifier: 'axis0_nameAggregation Function'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though
check_attribute_name(name)
Traceback (most recent call last):
File "mimic_direct_extract.py", line 922, in <module>
min_percent=args['min_percent']
File "mimic_direct_extract.py", line 325, in save_numerics
if dynamic_hd5_filename is not None: X.to_hdf(os.path.join(outPath, dynamic_hd5_filename), 'X')
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/pandas/core/generic.py", line 2377, in to_hdf
return pytables.to_hdf(path_or_buf, key, self, **kwargs)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/pandas/io/pytables.py", line 274, in to_hdf
f(store)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/pandas/io/pytables.py", line 268, in <lambda>
f = lambda store: store.put(key, value, **kwargs)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/pandas/io/pytables.py", line 889, in put
self._write_to_group(key, value, append=append, **kwargs)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/pandas/io/pytables.py", line 1415, in _write_to_group
s.write(obj=value, append=append, complib=complib, **kwargs)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/pandas/io/pytables.py", line 3022, in write
blk.values, items=blk_items)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/pandas/io/pytables.py", line 2812, in write_array
self._handle.create_array(self.group, key, value)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/tables/file.py", line 1168, in create_array
track_times=track_times)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/tables/array.py", line 197, in __init__
byteorder, _log, track_times)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/tables/leaf.py", line 290, in __init__
super(Leaf, self).__init__(parentnode, name, _log)
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/tables/node.py", line 266, in __init__
self._v_objectid = self._g_create()
File "/root/miniconda3/envs/mimic_data_extraction/lib/python3.6/site-packages/tables/array.py", line 229, in _g_create
nparr, self._v_new_title, self.atom)
File "tables/hdf5extension.pyx", line 1297, in tables.hdf5extension.Array._create_array
tables.exceptions.HDF5ExtError: Problems creating the Array.
Job 'python mimic_direct_extract.py ...' terminated by signal SIGSEGV (Address boundary error)
I think I've had some trouble with pyarrow/pandas versions before, as well as with Python 2 vs 3. Is your environment the same as the requirements?
Yes, the environment is the same as the requirements. I think it is a Docker-specific problem: it worked when I followed the exact same steps outside of the Docker container.
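For anyone who wants to double-check that the environment inside the container really matches the requirements, here is a minimal sketch using only the standard library (Python 3.8+ for importlib.metadata). It assumes the pins are available as pip-style `name==version` lines; the project's actual environment file may use a different format, and the helper names (`parse_pins`, `installed_version`, `find_mismatches`) and the example pins are hypothetical, not part of MIMIC_Extract.

```python
from importlib import metadata


def parse_pins(lines):
    """Return {name: version} for lines of the form 'name==version'.

    Blank lines, comments, and unpinned entries are skipped.
    """
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Hypothetical pins for illustration only; substitute the real requirements.
    example_pins = parse_pins(["pandas==0.24.2", "tables==3.5.1"])
    for name, (wanted, found) in find_mismatches(example_pins).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Running the same script inside and outside the Docker container and comparing the output would show whether the two environments differ in any pinned package. It only compares Python package versions, so it would not reveal differences in the underlying HDF5 library or in container resource limits.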