UCBerkeleySETI/turbo_seti

Drift Rate threshold in plotSETI

pranavp25 opened this issue · 5 comments

As it stands, plotSETI only has an SNR threshold.

Can we add a command line argument that could be used to set a drift rate threshold for plotSETI?
At the ATA, we have an automated post-processor that runs turboSETI directly on the data and produces a .h5 file and a .dat file.

The problem is that we end up with thousands of candidates with drift rates very close to 0, and hence produce thousands of PNGs. It would therefore be great to have a drift rate threshold when running plotSETI directly on the data, so that the turboSETI output can be filtered better.
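To make the request concrete, here is a minimal sketch of the kind of filter being asked for. It is not plotSETI code: the function name `filter_by_drift_rate`, the dictionary keys, and the sample values are all made up for illustration, and it assumes the threshold is applied to the absolute drift rate.

```python
def filter_by_drift_rate(hits, min_abs_drift):
    """Keep only hits whose absolute drift rate (Hz/s) is at least min_abs_drift.

    Hypothetical helper: `hits` is a list of dicts with a "drift_rate" key.
    """
    return [hit for hit in hits if abs(hit["drift_rate"]) >= min_abs_drift]


# Illustrative candidates only, not real observation data.
hits = [
    {"id": 1, "drift_rate": 0.0, "snr": 35.2},
    {"id": 2, "drift_rate": -0.004, "snr": 18.0},
    {"id": 3, "drift_rate": 0.31, "snr": 12.5},
    {"id": 4, "drift_rate": -1.2, "snr": 50.1},
]

# Only the candidates that survive the cut would be handed to the plotter.
to_plot = filter_by_drift_rate(hits, min_abs_drift=0.01)
print([hit["id"] for hit in to_plot])  # [3, 4]
```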

In turboSETI, you already have dedoppler filtering parameters min_drift, max_drift, and snr (threshold). Why are you not using this capability to exclude hits with near 0 drift rates?

Is this enhancement request for a second bite at the apple?

Note that if you supplied all 3 filtering parameters in turboSETI, it would tend to run faster and with smaller output .dat files as unqualified hits are thrown out early in the dedoppler logic.

@texadactyl agreed. Although we can run turboSETI with a different parameter set, it would be nice to have the ability for the plotter to also select the candidates it chooses to plot. We clearly won't be looking at near-zero drift rate candidates, but we need to inspect candidates indiscriminately before choosing the exact turboSETI input thresholds.

@wfarah @pranavp25

Someone else did that with the SNR threshold ("SNR_cut") before my time in find_event_pipeline and that is why an SNR threshold parameter is exposed in plotSETI. If I had my way, I would have removed that !@#$%^!! parameter but I restrained myself in the interest of not causing a surprise to anyone.

Two possible approaches to give you a second bite of the filtering apple:

  1. Hack up find_event_pipeline and plotSETI to accept all 3 dedoppler filtering parameters with defaults of None.
  2. Create a new utility (dat_filter) which uses 1-3 of the dedoppler filtering parameters to trim a given .dat file.

The advantage of #1 is that I could fix the SNR threshold parameter to default to None so as to accept all values in the .dat file as-is. Currently, one must either supply a value or accept the default of 10.0, which may or may not make sense given the values in the .dat file. I have only seen worse code when I was 20, writing Assembly Language.
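The "default of None" behaviour described for #1 could look roughly like the sketch below. The function `keep_hit` and its parameter names are hypothetical, not the actual find_event_pipeline or plotSETI signature, and it assumes the drift limits are compared against the absolute drift rate.

```python
def keep_hit(snr, drift_rate, min_snr=None, min_drift=None, max_drift=None):
    """Return True if a hit passes every threshold that was actually supplied.

    A threshold left at None is skipped, so with no arguments every hit
    in the .dat file is accepted as-is.
    """
    if min_snr is not None and snr < min_snr:
        return False
    if min_drift is not None and abs(drift_rate) < min_drift:
        return False
    if max_drift is not None and abs(drift_rate) > max_drift:
        return False
    return True


print(keep_hit(snr=3.0, drift_rate=0.0))                  # True
print(keep_hit(snr=3.0, drift_rate=0.0, min_snr=10.0))    # False
print(keep_hit(snr=30.0, drift_rate=0.0, min_drift=0.01)) # False
```

The point of the design is that no implicit cut such as 10.0 is applied behind the caller's back.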

The advantage of #2 is that it is considerably easier to test and it would be a fast execution to fix your existing .dat files.

What's your preference: #1 or #2? You probably would like both, with #2 first.

dat_filter -h

usage: dat_filter [-h] [-s MIN_SNR] [-m MIN_DRIFT_RATE] [-M MAX_DRIFT_RATE] dat_file

dat_filter - prune a .dat file.

positional arguments:
  dat_file              Path of the .dat file to prune

optional arguments:
  -h, --help            show this help message and exit
  -s MIN_SNR, --min_snr MIN_SNR
                        Filter parameter: The SNR below which top hits will be discarded.
  -m MIN_DRIFT_RATE, --min_drift_rate MIN_DRIFT_RATE
                        Filter parameter: The drift rate below which top hits will be discarded.
  -M MAX_DRIFT_RATE, --max_drift_rate MAX_DRIFT_RATE
                        Filter parameter: The drift rate above which top hits will be discarded.

Read a .dat file.  Use the following filtering parameters to prune it,
saving the original content in .dat.original:
    * min_drift_rate (Hz/s)
    * max_drift_rate (Hz/s)
    * min_snr

Exit status:
    0 : All went well, even if 0 top hits were read or retained.
    1 : Some sort of error was reported.
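For readers wondering what such a pruning step involves, below is a rough sketch, not the actual dat_filter source. It assumes that header lines in the .dat file start with `#`, that each hit is one whitespace-separated row, and that drift rate and SNR sit in columns 1 and 2 (zero-based); those column positions, the function name `prune_dat`, and the use of the absolute drift rate are all assumptions. The demo rows are made up.

```python
import os
import shutil
import tempfile

DRIFT_COL = 1  # assumed zero-based column of the drift rate (Hz/s)
SNR_COL = 2    # assumed zero-based column of the SNR


def prune_dat(path, min_snr=None, min_drift_rate=None, max_drift_rate=None):
    """Prune `path` in place, saving the original as `path + ".original"`.

    Returns (hits_read, hits_retained). Thresholds left at None are ignored.
    """
    backup = path + ".original"
    shutil.copyfile(path, backup)
    read = retained = 0
    kept = []
    with open(backup) as src:
        for line in src:
            # Header and blank lines are copied through unchanged.
            if line.startswith("#") or not line.strip():
                kept.append(line)
                continue
            read += 1
            fields = line.split()
            drift = abs(float(fields[DRIFT_COL]))
            snr = float(fields[SNR_COL])
            if min_snr is not None and snr < min_snr:
                continue
            if min_drift_rate is not None and drift < min_drift_rate:
                continue
            if max_drift_rate is not None and drift > max_drift_rate:
                continue
            kept.append(line)
            retained += 1
    with open(path, "w") as dst:
        dst.writelines(kept)
    return read, retained


# Demo on a throwaway file with made-up rows (id, drift rate, SNR, frequency).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.dat")
    with open(demo, "w") as f:
        f.write("# illustrative header\n")
        f.write("1\t0.000000\t40.0\t1000.0\n")
        f.write("2\t0.250000\t15.0\t1001.0\n")
        f.write("3\t-0.500000\t5.0\t1002.0\n")
    print(prune_dat(demo, min_snr=10.0, min_drift_rate=0.01))  # (3, 1)
```

Because the original is copied aside before anything is rewritten, a too-aggressive cut can be undone by restoring the `.dat.original` file.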