UCBerkeleySETI/turbo_seti

find_event.py crash at line 417

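This is a data-dependent issue: the crash appears to be triggered when none of the OFF observations in the cadence contain any hits (see the "Loaded 0 hits ... (OFF)" lines in the console output).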

To Reproduce
From http://blpd14.ssl.berkeley.edu/voyager_2020/single_coarse_channel/, download only the *.0000.h5 files.
Run the Python program below.

Source Code

from shutil import rmtree
from os import mkdir
import logging
from turbo_seti.find_doppler.find_doppler import FindDoppler
from turbo_seti.find_event.find_event_pipeline import find_event_pipeline

H5DIR = '/seti_data/voyager_2020/'
OUTDIR = H5DIR + 'outdir/'
PATH_DAT_LIST_FILE = OUTDIR + 'new_dat_files.lst'
PATH_CSVF = OUTDIR + 'found_event_table.csv'

voyager_list = ['single_coarse_guppi_59046_80036_DIAG_VOYAGER-1_0011.rawspec.0000.h5',
                'single_coarse_guppi_59046_80354_DIAG_VOYAGER-1_0012.rawspec.0000.h5',
                'single_coarse_guppi_59046_80672_DIAG_VOYAGER-1_0013.rawspec.0000.h5',
                'single_coarse_guppi_59046_80989_DIAG_VOYAGER-1_0014.rawspec.0000.h5',
                'single_coarse_guppi_59046_81310_DIAG_VOYAGER-1_0015.rawspec.0000.h5',
                'single_coarse_guppi_59046_81628_DIAG_VOYAGER-1_0016.rawspec.0000.h5']

def make_dat_files():
    ii = 0
    with open(PATH_DAT_LIST_FILE, 'w') as file_handle:
        for filename in voyager_list:
            path_h5 = H5DIR + filename
            doppler = FindDoppler(path_h5,
                              max_drift = 4,
                              snr = 10,
                              log_level_int=logging.WARNING,
                              out_dir = H5DIR
                              )
            doppler.search()
            ii += 1
            path_dat = H5DIR + filename.replace('.h5', '.dat')
            file_handle.write('{}\n'.format(path_dat))
            print("make_dat_files: {} - finished making DAT file for {}".format(ii, path_h5))


# Initialize output directory
rmtree(OUTDIR, ignore_errors=True)
mkdir(OUTDIR)

# Make the DAT files
make_dat_files()

# Generate CSV file from find_event_pipeline()
num_in_cadence = len(voyager_list)
find_event_pipeline(PATH_DAT_LIST_FILE,
                    filter_threshold = 3,
                    number_in_cadence = num_in_cadence,
                    user_validation=False,
                    saving=True,
                    csv_name=PATH_CSVF)
print("Produced {}".format(PATH_CSVF))

Console Output

make_dat_files: 1 - finished making DAT file for /seti_data/voyager_2020/single_coarse_guppi_59046_80036_DIAG_VOYAGER-1_0011.rawspec.0000.h5
make_dat_files: 2 - finished making DAT file for /seti_data/voyager_2020/single_coarse_guppi_59046_80354_DIAG_VOYAGER-1_0012.rawspec.0000.h5
make_dat_files: 3 - finished making DAT file for /seti_data/voyager_2020/single_coarse_guppi_59046_80672_DIAG_VOYAGER-1_0013.rawspec.0000.h5
make_dat_files: 4 - finished making DAT file for /seti_data/voyager_2020/single_coarse_guppi_59046_80989_DIAG_VOYAGER-1_0014.rawspec.0000.h5
make_dat_files: 5 - finished making DAT file for /seti_data/voyager_2020/single_coarse_guppi_59046_81310_DIAG_VOYAGER-1_0015.rawspec.0000.h5
make_dat_files: 6 - finished making DAT file for /seti_data/voyager_2020/single_coarse_guppi_59046_81628_DIAG_VOYAGER-1_0016.rawspec.0000.h5

************   BEGINNING FIND_EVENT PIPELINE   **************

Assuming the first observation is an ON
find_event_pipeline: source_name = VOYAGER-1
find_event_pipeline: source_name = VOYAGER-1
find_event_pipeline: source_name = VOYAGER-1
find_event_pipeline: source_name = VOYAGER-1
find_event_pipeline: source_name = VOYAGER-1
find_event_pipeline: source_name = VOYAGER-1
There are 6 total files in the filelist /seti_data/voyager_2020/outdir/new_dat_files.lst
therefore, looking for events in 1 on-off set(s)
with a minimum SNR of 10
Present in all A sources with RFI rejection from the off-sources
not including signals with zero drift
saving the output files

***       59046       ***

------   o   -------
Loading data...
Loaded 3 hits from /seti_data/voyager_2020/single_coarse_guppi_59046_80036_DIAG_VOYAGER-1_0011.rawspec.0000.dat (ON)
Loaded 0 hits from /seti_data/voyager_2020/single_coarse_guppi_59046_80354_DIAG_VOYAGER-1_0012.rawspec.0000.dat (OFF)
Loaded 3 hits from /seti_data/voyager_2020/single_coarse_guppi_59046_80672_DIAG_VOYAGER-1_0013.rawspec.0000.dat (ON)
Loaded 0 hits from /seti_data/voyager_2020/single_coarse_guppi_59046_80989_DIAG_VOYAGER-1_0014.rawspec.0000.dat (OFF)
Loaded 3 hits from /seti_data/voyager_2020/single_coarse_guppi_59046_81310_DIAG_VOYAGER-1_0015.rawspec.0000.dat (ON)
Loaded 0 hits from /seti_data/voyager_2020/single_coarse_guppi_59046_81628_DIAG_VOYAGER-1_0016.rawspec.0000.dat (OFF)
All data loaded!

Finding events in this cadence...
Found a total of 9 hits above the SNR cut in this cadence!
Traceback (most recent call last):

  File "/mnt/elkdata/linux-home-folder/seti_testing/turbo_seti_testing/issue_145.py", line 46, in <module>
    find_event_pipeline(PATH_DAT_LIST_FILE,

  File "/home/elkins/anaconda3/lib/python3.8/site-packages/turbo_seti-2.1.0-py3.8.egg/turbo_seti/find_event/find_event_pipeline.py", line 205, in find_event_pipeline
    cand = find_event.find_events(file_sublist,

  File "/home/elkins/anaconda3/lib/python3.8/site-packages/turbo_seti-2.1.0-py3.8.egg/turbo_seti/find_event/find_event.py", line 417, in find_events
    snr_adjusted_table['RFI_in_range'] = snr_adjusted_table.apply(lambda hit: len(off_table[((off_table['Freq'] > calc_freq_range(hit)[0]) & (off_table['Freq'] < calc_freq_range(hit)[1]))]),axis=1)

  File "/home/elkins/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 7765, in apply
    return op.get_result()

  File "/home/elkins/.local/lib/python3.8/site-packages/pandas/core/apply.py", line 185, in get_result
    return self.apply_standard()

  File "/home/elkins/.local/lib/python3.8/site-packages/pandas/core/apply.py", line 276, in apply_standard
    results, res_index = self.apply_series_generator()

  File "/home/elkins/.local/lib/python3.8/site-packages/pandas/core/apply.py", line 290, in apply_series_generator
    results[i] = self.f(v)

  File "/home/elkins/anaconda3/lib/python3.8/site-packages/turbo_seti-2.1.0-py3.8.egg/turbo_seti/find_event/find_event.py", line 417, in <lambda>
    snr_adjusted_table['RFI_in_range'] = snr_adjusted_table.apply(lambda hit: len(off_table[((off_table['Freq'] > calc_freq_range(hit)[0]) & (off_table['Freq'] < calc_freq_range(hit)[1]))]),axis=1)

TypeError: list indices must be integers or slices, not str

Starting at line 415 in find_event.py, I have successfully tested the following change, which skips the RFI lookup when off_table is empty:

    #Now find how much RFI is within a frequency range of the hit 
    #by comparing the ON to the OFF observations. Update RFI_in_range
    if len(off_table) == 0:
        print('Length of off_table = 0')
        snr_adjusted_table['RFI_in_range'] = 0
    else:
        snr_adjusted_table['RFI_in_range'] = snr_adjusted_table.apply(
            lambda hit: 
                len(off_table[((off_table['Freq'] > calc_freq_range(hit)[0]) 
                               & (off_table['Freq'] < calc_freq_range(hit)[1])
                               )]), axis=1)
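
For anyone who wants to see the failure mode without downloading the Voyager files, here is a minimal standalone sketch of the same guard. It assumes that off_table ends up as an empty Python list when every OFF file has 0 hits, which is what the TypeError suggests but is not confirmed against the find_event.py source here. The calc_freq_range() below is a hypothetical stand-in for the real helper, count_rfi_in_range() is an illustrative name that does not exist in turbo_seti, and the frequencies are made up.

```python
import pandas as pd


def calc_freq_range(hit, half_width_mhz=0.001):
    """Hypothetical stand-in for turbo_seti's calc_freq_range().

    Returns [low, high] around the hit frequency; the real helper
    computes its range differently.
    """
    return [hit['Freq'] - half_width_mhz, hit['Freq'] + half_width_mhz]


def count_rfi_in_range(on_table, off_table):
    """For each ON hit, count the OFF hits inside its frequency range."""
    if len(off_table) == 0:
        # No OFF hits at all, so nothing can fall inside any range.
        return pd.Series(0, index=on_table.index)

    def count_for_hit(hit):
        low, high = calc_freq_range(hit)  # computed once per hit
        in_range = (off_table['Freq'] > low) & (off_table['Freq'] < high)
        return int(in_range.sum())

    return on_table.apply(count_for_hit, axis=1)


# Made-up ON hits (MHz), for illustration only.
on_table = pd.DataFrame({'Freq': [8419.297, 8419.319, 8419.274]})

# Assumption: off_table is a plain empty list when no OFF hits were loaded.
empty_off = []

# Unguarded lookup: indexing a list with a string raises TypeError.
try:
    on_table.apply(
        lambda hit: len(empty_off[empty_off['Freq'] > calc_freq_range(hit)[0]]),
        axis=1)
except TypeError as exc:
    print('unguarded:', exc)

# Guarded lookup: every ON hit gets an RFI count of 0.
guarded = count_rfi_in_range(on_table, empty_off)
print('guarded:', guarded.tolist())
```

The guard checks len(off_table) rather than the type, so it behaves the same whether the empty case is a list or an empty DataFrame. Computing the range once per hit also avoids the two calc_freq_range() calls per row in the original lambda.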