usgs/slab2

KeyError: "['D1', 'mlon', 'time', 'R1', 'D2', 'mlat', 'S1', 'S2', 'mdep', 'mag', 'R2'] not in index" in slab2.py main line 413 resulting from incomplete input file

mlbombardier opened this issue · 2 comments

This error occurs while creating a slab model for Cascadia. The following is the process to recreate the issue.

Copy all CSVs whose names begin with 'cas' from the /slab2_master/MasterDB directories to a new /slab2_master/0521database directory

cd to /slab2_master/slab2code
Replace occurrences of vincenty with geodesic from geopy.distance

conda activate slab2env

python s2d.py -p cas -d ../0521database -f cas_05-21_input.csv

mv cas_05-21_input.csv Input/

python slab2.py -p library/parameterfiles/casinput.par

See full output below:

Start Section 1 of 7: Setup
Loading inputs...
.DS_Store
exp_04-18_input.csv
cas_05-21_input.csv
.gitignore
using input file: Input/cas_05-21_input.csv
mkdir: Output/cas_slab2_05.08.21: File exists
/usr/local/anaconda3/envs/slab2env/lib/python3.7/site-packages/numpy/core/_asarray.py:102: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
return array(a, dtype, copy=False, order=order)
slab guide for this model: <GMTGrid Object:
ny: 281
nx: 241
xmin: -130.0000
xmax: -118.0000
ymin: 37.0000
ymax: 51.0000
dx: 0.0500
dy: 0.0500
zmin: -417.520630
zmax: -2.823972>
Traceback (most recent call last):
  File "slab2.py", line 1833, in <module>
    main(pargs)
  File "slab2.py", line 413, in main
    eventlist = eventlistALL[kagancols]
  File "/usr/local/anaconda3/envs/slab2env/lib/python3.7/site-packages/pandas/core/frame.py", line 3030, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/usr/local/anaconda3/envs/slab2env/lib/python3.7/site-packages/pandas/core/indexing.py", line 1266, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/usr/local/anaconda3/envs/slab2env/lib/python3.7/site-packages/pandas/core/indexing.py", line 1316, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['D1', 'mlon', 'time', 'R1', 'D2', 'mlat', 'S1', 'S2', 'mdep', 'mag', 'R2'] not in index"

Explanation:
This error was replicated for the 'alu' region.
Input files made with s2d.py do not contain the columns listed in the KeyError above, so the column selection at slab2.py line 413 fails. The error does not occur when using the 'cas' input file made by makeallinputs.py (I have not checked other regions).
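A quick way to confirm this before running slab2.py is to check the header of the generated input file. The following is a minimal sketch, not part of slab2: the helper name `missing_columns` is hypothetical, the column list is copied from the KeyError above, and it assumes the input file is comma-delimited with a single header row.

```python
import csv

# Column names copied from the KeyError raised at slab2.py line 413.
REQUIRED_COLUMNS = ["D1", "mlon", "time", "R1", "D2", "mlat",
                    "S1", "S2", "mdep", "mag", "R2"]


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header.

    Hypothetical helper; assumes a comma-delimited file whose first row
    holds the column names.
    """
    with open(csv_path, newline="") as handle:
        # Read only the header row; an empty file yields an empty header.
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

Calling `missing_columns("Input/cas_05-21_input.csv")` on the file from the steps above should return a non-empty list if the file would trigger this KeyError, and an empty list otherwise.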

Hi @mlbombardier! Thank you for submitting an issue. It sounds like you are trying to create a new input file for the cas slab region, correct?

I am happy you replaced occurrences of vincenty with geodesic from geopy.distance! I pushed up that change last week, so you can get the most up-to-date version of the code.

I was able to reproduce your issue! You copied over all the input files from MasterDB starting with "cas", but you did not copy over any earthquake hypocenter data (from GCMT or ComCat) from either the gcmt_pde_assc or the PublishedSlab2DB directory. In those directories, you will find ALL_EQ_111120.csv or ALL_EQ_122717.csv, respectively. One of those CSV files is required. ALL_EQ_111120.csv contains hypocenters from the GCMT and ComCat catalogs until November 2020, whereas ALL_EQ_122717.csv contains hypocenters until December 2017. The 2017 CSV file was used to make the published Slab2 models, so that database is the best one to use when re-creating those models.
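As a rough sanity check, a new database directory can be scanned for one of these hypocenter files before running s2d.py. This is only a sketch: `find_hypocenter_files` is a hypothetical helper rather than part of slab2, and the `ALL_EQ_*.csv` pattern is an assumption inferred from the two file names mentioned above.

```python
from pathlib import Path


def find_hypocenter_files(database_dir, pattern="ALL_EQ_*.csv"):
    """Return the sorted paths in database_dir that match the pattern.

    Hypothetical helper; the default pattern is inferred from the names
    ALL_EQ_111120.csv and ALL_EQ_122717.csv and searches only the top
    level of the directory.
    """
    return sorted(Path(database_dir).glob(pattern))
```

For example, `find_hypocenter_files("../0521database")` returning an empty list would suggest that neither hypocenter file has been copied into the directory yet.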

Please note that if you are not using any new seismic data, you can simply create an input file from one of the existing databases: 0819database (published), 1219database, or 1120database.

Can you please confirm that adding either ALL_EQ_111120.csv or ALL_EQ_122717.csv to your 0521database directory, re-making the input file, and re-running slab2.py fixes your issue?

In the meantime, I will update the documentation to better explain what files must be included in any new database directory.

Thank you!

@khaynie-usgs That does resolve the issue, thanks!