chime-experiment/dias

Feed positions analyser

Opened this issue · 2 comments

jrs65 commented

A brief summary of changes that would be good to have for the feed positions analyser:

Grafana:

  • the singlestats should show only the last 24 hours, but seem to vary depending on the time range selected.
  • it would be nice to have a good/bad stat that summarises whether the data is good or not.

Analyzer:

  • at the moment the good/bad thresholds are actually hardcoded in the analyzer. This is a policy decision that's probably better made at a higher layer. We should expose the relevant statistics to Prometheus and let those decisions be made higher up. The key question is which statistics to export, possibly num_feeds_above_threshold{threshold="1.0",freq_id="14"}, i.e. we pick a few (maybe ~10) distance thresholds and just return the number of feeds above each.
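A rough sketch of the export idea above, in plain Python. The function names, the threshold values and the sample offsets are all made up for illustration; this is not the analyzer's code, and it only builds the text form of the samples rather than calling any Prometheus client.

```python
from bisect import bisect_right


def count_feeds_above(offsets, thresholds):
    """Return {threshold: number of feeds whose offset exceeds it}."""
    ordered = sorted(offsets)
    n = len(ordered)
    # bisect_right gives the number of offsets <= t, so n minus that is the number > t.
    return {t: n - bisect_right(ordered, t) for t in thresholds}


def format_samples(counts, freq_id):
    """Render the counts as Prometheus-style sample lines (text form only)."""
    return [
        f'num_feeds_above_threshold{{threshold="{t}",freq_id="{freq_id}"}} {c}'
        for t, c in sorted(counts.items())
    ]


# Hypothetical per-feed position offsets for one frequency.
offsets = [0.2, 0.4, 1.5, 2.5, 0.1, 6.0]
counts = count_feeds_above(offsets, thresholds=[0.5, 1.0, 2.0, 5.0])
lines = format_samples(counts, freq_id=14)
```

With counts exported like this, the analyzer itself carries no good/bad policy; a dashboard or alert rule can pick whichever threshold label it cares about.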

Analysis:

There are a few questions we need to address from the analysis:

  • when is a feed good/bad?
  • when is the instrument good/bad? What statistic do we test, and what threshold should we use?
  • look back at historical data to see if/when/how this analyser is catching periods of bad data? There are certain failures it should be immune from (i.e. calibration/flagging), but it should catch rain event jumps, decorrelated cylinders, massive RFI issues etc.
jrs65 commented

@mondana anything else we want on here?

@jrs65 I replaced the percentage of bad feeds metric with a num_feeds_above_threshold metric, which has the labels frequency, source and threshold. I chose threshold levels 2.0, 3.0, 4.0, 5.0, 7.0, 10.0. The unit is a foot (one feed separation).

Feeds that are flagged by the flagging broker are excluded when calculating the number of bad feed positions above these thresholds. This should make the analyzer immune to flagged bad feeds.
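Roughly, the exclusion described above could look like the sketch below. The function name and the inputs are hypothetical (a per-feed list of position offsets in feet and a per-feed list of booleans from the flagging broker); only the threshold levels are the ones quoted above.

```python
# Threshold levels quoted above, in feet (one feed separation).
THRESHOLDS = (2.0, 3.0, 4.0, 5.0, 7.0, 10.0)


def count_unflagged_above(offsets, flagged, thresholds=THRESHOLDS):
    """Count feeds above each threshold, skipping feeds marked as flagged.

    offsets: per-feed position offsets (hypothetical input, non-negative distances)
    flagged: per-feed booleans, True where the feed was flagged as bad
    """
    if len(offsets) != len(flagged):
        raise ValueError("offsets and flagged must have one entry per feed")
    # Drop flagged feeds first so they can never contribute to any count.
    kept = [o for o, bad in zip(offsets, flagged) if not bad]
    return {t: sum(1 for o in kept if o > t) for t in thresholds}


# Hypothetical data: the third feed is far off but already flagged.
offsets = [0.5, 2.5, 12.0, 4.5, 8.0]
flagged = [False, False, True, False, False]
counts = count_unflagged_above(offsets, flagged)
```

In this made-up example the flagged feed at 12.0 would otherwise have counted against every threshold, including 10.0.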

If, let's say, the eigenvalue on versus off source is greater than some threshold, then the analyzer will flag the data for that specific frequency. This was intended to serve as a test for RFI issues in a frequency bin.

Since it's using the chimecal dataset, I think the only thing that needs to have worked is the eigendecomposition of the visibility matrix at source transit. If I understand that remark correctly, the feed position does not rely on derived gains or calibration.

As for rain jumps and decorrelated cylinders ... I have no idea at this point.