Issue in the turbo_seti tutorial
Hi,
I encountered this issue when running the turbo_seti tutorial Jupyter notebook. When trying to run the following part of the code:
find_event_pipeline('dat_files.lst',
                    filter_threshold=3,
                    number_in_cadence=6)
I get the following error "list index out of range", which refers to these lines (more specifically line 76):
TopHitNum = list(zip(*all_hits))[0]
DriftRate = [float(df) for df in list(zip(*all_hits))[1]]
SNR = [float(ss) for ss in list(zip(*all_hits))[2]]
Freq = [float(ff) for ff in list(zip(*all_hits))[3]]
I'm running turbo_seti on Python 3.6.
Thanks!
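For readers following along: the indexing idiom in the quoted lines produces exactly this message whenever the list being unzipped is empty. The sketch below only illustrates that mechanism. It assumes `all_hits` is a list of per-hit tuples; the helper name `first_column` and the sample values are hypothetical and are not taken from turbo_seti.

```python
def first_column(all_hits):
    """Return the first field of every hit, using the same zip(*...) idiom."""
    return list(zip(*all_hits))[0]


# With at least one parsed hit, the idiom works as intended.
# (Sample values are made up for illustration.)
sample_hits = [("1", "0.5", "12.0", "1400.0"),
               ("2", "-0.3", "15.5", "1401.2")]
top_hit_numbers = first_column(sample_hits)

# With no parsed hits, zip(*[]) yields nothing, so index 0 does not exist.
try:
    first_column([])
    error_message = None
except IndexError as ex:
    error_message = str(ex)

print(top_hit_numbers)
print(error_message)
```

Whether an empty hit list is what actually happens inside the pipeline is not established at this point in the thread; the sketch only shows that the idiom has no guard against it.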
What versions of blimpy and turbo_seti are you using? Both have been updated in the last month.
Hi. I'm using freshly installed blimpy (2.0.2) and turbo_seti (2.0.0). I've installed them following the instructions in the tutorial.
Also, in case it is an issue with the package versions, here is the output of my conda list:
appnope 0.1.2 pypi_0 pypi
astropy 4.1 pypi_0 pypi
backcall 0.2.0 pypi_0 pypi
blimpy 2.0.2 pypi_0 pypi
ca-certificates 2020.12.8 hecd8cb5_0
cached-property 1.5.2 pypi_0 pypi
certifi 2020.12.5 py36hecd8cb5_0
cloudpickle 1.6.0 pypi_0 pypi
cycler 0.10.0 pypi_0 pypi
cython 0.29.21 pypi_0 pypi
dask 2.30.0 pypi_0 pypi
decorator 4.4.2 pypi_0 pypi
fsspec 0.8.4 pypi_0 pypi
h5py 3.1.0 pypi_0 pypi
hdf5plugin 2.3.1 pypi_0 pypi
ipython 7.16.1 pypi_0 pypi
ipython-genutils 0.2.0 pypi_0 pypi
jedi 0.17.2 pypi_0 pypi
kiwisolver 1.3.1 pypi_0 pypi
libcxx 10.0.0 1
libedit 3.1.20191231 h1de35cc_1
libffi 3.3 hb1e8313_2
llvmlite 0.35.0 pypi_0 pypi
locket 0.2.0 pypi_0 pypi
matplotlib 3.3.3 pypi_0 pypi
ncurses 6.2 h0a44026_1
numba 0.52.0 pypi_0 pypi
numpy 1.19.4 pypi_0 pypi
openssl 1.1.1i h9ed2024_0
pandas 1.1.5 pypi_0 pypi
parso 0.7.1 pypi_0 pypi
partd 1.1.0 pypi_0 pypi
pexpect 4.8.0 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 8.0.1 pypi_0 pypi
pip 20.3.1 py36hecd8cb5_0
prompt-toolkit 3.0.8 pypi_0 pypi
ptyprocess 0.6.0 pypi_0 pypi
pygments 2.7.3 pypi_0 pypi
pyparsing 2.4.7 pypi_0 pypi
python 3.6.12 h26836e1_2
python-dateutil 2.8.1 pypi_0 pypi
pytz 2020.4 pypi_0 pypi
pyyaml 5.3.1 pypi_0 pypi
readline 8.0 h1de35cc_0
scipy 1.5.4 pypi_0 pypi
setuptools 51.0.0 py36hecd8cb5_2
six 1.15.0 pypi_0 pypi
sqlite 3.33.0 hffcf06c_0
tk 8.6.10 hb0a8c7a_0
toolz 0.11.1 pypi_0 pypi
traitlets 4.3.3 pypi_0 pypi
turbo-seti 2.0.0 pypi_0 pypi
wcwidth 0.2.5 pypi_0 pypi
wheel 0.36.1 pyhd3eb1b0_0
xz 5.2.5 h1de35cc_0
zlib 1.2.11 h1de35cc_3
Looking at this now.
For future reference, here is a downloader for the H5 files. If needed, change DATADIR to point to the directory of your choice.
from time import time
import sys
from shutil import rmtree
from os import mkdir
from urllib.error import HTTPError
import wget

DATADIR = '/tmp/turbo_seti_data/'  # <--------- H5, DAT, LOG directory
URL_DIR = 'http://blpd0.ssl.berkeley.edu/parkes_testing/'
H5_FILE_LIST = ['diced_Parkes_57941_12846_HIP33499_S_fine.h5',
                'diced_Parkes_57941_13194_HIP33499_R_fine.h5',
                'diced_Parkes_57941_13542_HIP33499_S_fine.h5',
                'diced_Parkes_57941_13884_HIP33499_R_fine.h5',
                'diced_Parkes_57941_14233_HIP33499_S_fine.h5',
                'diced_Parkes_57941_14584_HIP33499_R_fine.h5']


def oops(arg_text):
    '''
    Log the bad news and exit to the O/S with a non-zero exit code.
    '''
    print('\n*** Oops, ' + arg_text)
    sys.exit(86)


def wgetter(arg_h5_name):
    '''
    wget an HDF5 file from the Internet repository.
    arg_h5_name: HDF5 file name
    '''
    url_h5 = URL_DIR + arg_h5_name
    path_h5 = DATADIR + arg_h5_name
    print('wgetter: Begin wget {} -> {} .....'.format(url_h5, path_h5))
    time_start = time()
    try:
        wget.download(url_h5, path_h5, bar=False)
    except HTTPError as ex:
        oops('wgetter: wget {}, failed: {}'.format(url_h5, repr(ex)))
    time_stop = time()
    print('wgetter: End wget ({}), et = {:.1f} seconds'
          .format(arg_h5_name, time_stop - time_start))


# Start from an empty data directory, then fetch every HDF5 file.
rmtree(DATADIR, ignore_errors=True)
mkdir(DATADIR)
for filename_h5 in H5_FILE_LIST:
    wgetter(filename_h5)
I have reproduced your experience, "list index out of range". This was working and it now seems to have an issue. I'll need more time to investigate.
Running pytest on the whole repo. This runs what should be the equivalent of the tutorial as part of the test suite.
Found it. The test/test_pipelines.py program avoids this issue because it cleans up the output directory before every FindDoppler.search() execution. Unfortunately, the tutorial failed to do so. Yes, I am surprised too.
Idiosyncrasy in turbo_seti data_handler.py: it always opens the output DAT file for APPEND (why??? (-:). As a consequence, in the tutorial the useful contents of the first DAT file are written TWICE to the same DAT file: once in step (4), and a second time in step (11) when it processes all of the input HDF5 files.
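The effect of append mode can be reproduced in isolation. This is a minimal sketch using plain Python file I/O, not the actual data_handler.py code; the function `write_hits`, the file name `example.dat`, and the "hit" lines are all hypothetical stand-ins.

```python
import os
import tempfile


def write_hits(path, lines):
    """Write lines to path in append mode, as an append-only writer would."""
    with open(path, "a") as handle:
        for line in lines:
            handle.write(line + "\n")


workdir = tempfile.mkdtemp()
dat_path = os.path.join(workdir, "example.dat")  # hypothetical output file
hits = ["hit 1", "hit 2"]

write_hits(dat_path, hits)  # first run, analogous to the single-file step
write_hits(dat_path, hits)  # second run over the same input, same output file

with open(dat_path) as handle:
    contents = handle.read().splitlines()

print(contents)
```

Because the second run appends rather than truncates, every line from the first run is still present, so any reader of the file sees each hit twice.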
Given this peculiarity of data_handler.py, there needs to be some code (just like in test_pipelines.py) to remove the temporary DAT file from step (4).
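A cleanup step of this kind might look like the sketch below. It is not the code from test_pipelines.py or from the fixed tutorial; the helper name `remove_stale_outputs`, the `*.dat` pattern, and the demo file names are assumptions made for illustration.

```python
import glob
import os
import tempfile


def remove_stale_outputs(out_dir, patterns=("*.dat",)):
    """Delete files in out_dir matching any pattern; return the removed names."""
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(out_dir, pattern)):
            os.remove(path)
            removed.append(os.path.basename(path))
    return sorted(removed)


# Demo on a throwaway directory with one stale output and one input file.
demo_dir = tempfile.mkdtemp()
for name in ("old.dat", "keep.h5"):
    open(os.path.join(demo_dir, name), "w").close()

removed_files = remove_stale_outputs(demo_dir)
remaining_files = sorted(os.listdir(demo_dir))

print(removed_files)
print(remaining_files)
```

Calling such a helper on the output directory before each search would leave the HDF5 inputs in place while ensuring that an append-mode writer starts from an empty DAT file.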
I will fix this now.
And while debugging the tutorial, I found a bug in turbo_seti/find_event/plot_event_pipeline.py !!
Fixing both.
@AndriusT
Please have a go at the new tutorial ipynb file.
Hi,
Can now confirm that it works. Thanks!