seeing-things/track

Investigate synchronous telemetry polling

Closed this issue · 4 comments

The existing telemetry system uses an asynchronous model where channels are polled from TelemSource objects from a separate thread. This mostly works but has a few downsides:

  • Requires use of mutexes in TelemSource classes and careful copying of data to ensure that, when telemetry is eventually polled, it constitutes a truly atomic snapshot of the object state. For example, it would be undesirable if the channels from a camera object were polled just after the X coordinate of a camera position was updated, but just before the corresponding Y coordinate was updated.
  • Telemetry timestamps are not very accurate since they are generated at the time the telemetry value is sampled from the object. There is no way for the object being polled to provide a more accurate timestamp for a given telemetry value. (Unlike the other items, this could be improved without changing to a synchronous model. This work is scoped in #212.)
  • There is not a 1:1 mapping between telemetry samples and control cycles because the two processes run asynchronously. A particular control cycle could be sampled once, sampled two or more times, or missed entirely. This makes telemetry more difficult to interpret and probably increases the jitter on the control cycle period.
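
To make the first downside concrete, here is a minimal sketch of the kind of locking a polled source needs in the asynchronous model. The CameraTelem class and its update_position() method are hypothetical names made up for illustration; only get_telem_channels() comes from the existing API.

```python
import threading


class CameraTelem:
    """Hypothetical telemetry source holding a 2-D camera position."""

    def __init__(self):
        self._lock = threading.Lock()
        self._x = 0.0
        self._y = 0.0

    def update_position(self, x, y):
        # Update both coordinates under the lock so that a poller running in
        # another thread never sees a new X paired with a stale Y.
        with self._lock:
            self._x = x
            self._y = y

    def get_telem_channels(self):
        # Copy the state under the same lock so the returned dict is an
        # atomic snapshot of the object.
        with self._lock:
            return {'x': self._x, 'y': self._y}
```

With synchronous polling from the control loop thread, neither the lock nor the defensive copy would be needed for correctness.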

For the main track executable it seems like there would be significant value in generating telemetry synchronously with the control loop. Investigate the feasibility of this.

Some thoughts on how this design could work.

Presently, all classes that inherit from TelemSource are expected to provide an implementation of get_telem_channels() which returns a dict where keys are channel names and values are of type int or float. After one of these objects is constructed, a reference to it (along with references to other TelemSource objects) is passed to the constructor of a TelemLogger object. After start() has been called on the logger, it will call get_telem_channels() on each of the TelemSource objects it has references to in a dedicated thread. Thus get_telem_channels() is called asynchronously with respect to activity in any other thread in the program.
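
As a rough illustration of the current model (not the actual code in this repo), the sketch below polls sources from a dedicated thread. The constructor arguments, the stop() method, and the in-memory samples list are assumptions standing in for the real logger and its database writes.

```python
import threading
import time


class TelemSource:
    """Base class: subclasses return a dict of channel name -> int or float."""

    def get_telem_channels(self):
        raise NotImplementedError


class TelemLogger:
    """Simplified stand-in that polls sources from its own thread."""

    def __init__(self, sources, period=1.0):
        self.sources = sources    # assumed shape: dict of name -> TelemSource
        self.period = period      # polling period in seconds
        self.samples = []         # in-memory stand-in for the database
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._worker, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _worker(self):
        while True:
            # The timestamp is taken here, at polling time, not when the
            # source actually measured or computed the value.
            now = time.time()
            for name, source in self.sources.items():
                self.samples.append((now, name, source.get_telem_channels()))
            # Sleep for one period, or exit early if stop() was called.
            if self._stop.wait(self.period):
                break
```

Nothing ties the worker thread's wake-ups to the control loop, which is where the missed or duplicated control cycles come from.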

A synchronous design might look as follows:

  1. TelemLogger object is constructed first. It has a public method post_points().
  2. A class that has telemetry to post takes a reference to a TelemLogger object as a constructor argument.
  3. At runtime, any objects with references to a TelemLogger object may call post_points() at any time.
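
The three steps above might be sketched as follows. The post_points() signature (a source name, a dict of channels, and an optional timestamp) is an assumption, as are the Camera class and its process_frame() method; the points list stands in for whatever storage the real logger writes to.

```python
import time


class TelemLogger:
    """Sketch of a logger that accepts points pushed by other objects."""

    def __init__(self):
        self.points = []  # in-memory stand-in for the database

    def post_points(self, source_name, channels, timestamp=None):
        # The caller may supply a more accurate timestamp; otherwise fall
        # back to the time at which the points were posted.
        if timestamp is None:
            timestamp = time.time()
        self.points.append((timestamp, source_name, dict(channels)))


class Camera:
    """Hypothetical class with telemetry to post."""

    def __init__(self, telem_logger):
        self._telem_logger = telem_logger

    def process_frame(self, x, y, frame_time):
        # Telemetry is posted from the same thread that produced the values,
        # so no mutex is needed to keep X and Y consistent.
        self._telem_logger.post_points(
            'camera', {'x': x, 'y': y}, timestamp=frame_time)
```

A side benefit of this shape is that the poster controls the timestamp, which would also address the timestamp accuracy concern.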

One open question in this synchronous design is whether the Tracker object should be the only class to hold a reference to the TelemLogger (and also the only object to be calling post_points()), or if multiple objects in the program (Targets, Cameras, TelescopeMounts, etc) should all have references and post their own telemetry channels. Some considerations:

  • If each class calls post_points() separately, I will need to either:
    • Provide a new public method on each class that informs it when to call post_points(), or
    • Pick some other existing public method of the class and add the call to post_points() to that method. This could have significant performance downsides if the only public methods on the class are called too frequently, or this could result in sparse telemetry if the only public methods are called too infrequently.
  • If the Tracker object is responsible for posting all of the channels then it will need to either:
    • Reach down into several other objects to extract the desired information. This is not great since it increases coupling between innards of multiple classes, or
    • Use the existing get_telem_channels() API on each object. This seems like it may be the cleanest option. This would also probably reduce the amount of refactoring required in classes other than Tracker.
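
A minimal sketch of that last option, where the Tracker gathers channels through the existing get_telem_channels() API and is the only caller of post_points(). The Mount and ListLogger classes, the run_one_cycle() method, and the constructor arguments are hypothetical stand-ins, not the real classes.

```python
class Mount:
    """Hypothetical source that keeps the existing polling API."""

    def __init__(self):
        self.azimuth = 0.0

    def get_telem_channels(self):
        return {'azimuth': self.azimuth}


class ListLogger:
    """Stand-in logger that records posted points in a list."""

    def __init__(self):
        self.points = []

    def post_points(self, source_name, channels):
        self.points.append((source_name, dict(channels)))


class Tracker:
    """Sketch of a Tracker that posts all telemetry once per control cycle."""

    def __init__(self, telem_logger, telem_sources):
        self._telem_logger = telem_logger
        self._telem_sources = telem_sources  # dict of name -> source object
        self.cycle_count = 0

    def run_one_cycle(self):
        # Control work for this cycle would happen here.
        self.cycle_count += 1
        # Gather channels via the public API instead of reaching into the
        # internals of each object.
        for name, source in self._telem_sources.items():
            self._telem_logger.post_points(name, source.get_telem_channels())
        self._telem_logger.post_points(
            'tracker', {'cycle_count': self.cycle_count})
```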

Another important question is whether this synchronous mode of operation will work for programs that don't use the Tracker class. The only program that does not use a Tracker is the gamepad_control program. For that program I think there are a few options for providing equivalent functionality:

  • Call a function in the while True loop of the program to post new telemetry if the elapsed time since the last call to that function exceeds the polling period.
  • Create a separate thread in the program that polls the Gamepad class at the desired interval. This approach may require that the Gamepad class retain an API similar to what is present now.
  • TelemLogger retains the ability to poll TelemSource objects asynchronously.
    • I'm a little wary of this option because it could make TelemLogger confusing.
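
For the first option, the elapsed-time check could look something like the sketch below. RateLimitedPoster and maybe_post() are hypothetical names; the clock argument exists only so the behavior can be exercised without real delays.

```python
import time


class RateLimitedPoster:
    """Calls post_fn at most once per period when polled from a main loop."""

    def __init__(self, post_fn, period_s, clock=time.monotonic):
        self._post_fn = post_fn
        self._period_s = period_s
        self._clock = clock
        self._last_post = None  # time of the most recent post, if any

    def maybe_post(self):
        """Post if at least one period has elapsed. Returns True if posted."""
        now = self._clock()
        if (self._last_post is not None
                and now - self._last_post < self._period_s):
            return False
        self._last_post = now
        self._post_fn()
        return True
```

In gamepad_control, the while True loop would call maybe_post() on every iteration, with post_fn gathering the gamepad channels and handing them to the logger. The effective telemetry rate is then bounded by how often the loop itself iterates.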

Another idea for implementing synchronous telemetry polling is to keep more of the infrastructure identical to what exists now. Classes still inherit from TelemSource and references to such objects still get registered with the TelemLogger object. The only difference is that the actual polling events are triggered by the Tracker object calling a method (gather()? or maybe poll_all_sources()?) on the TelemLogger object once per control cycle, rather than being triggered by a thread within that object. The TelemPoller class would only need a few minor changes and could still support asynchronous polling with a thread as an optional feature.
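
A rough sketch of this idea, assuming the polling method ends up being called poll_sources() (the name is undecided above). The constructor arguments, the stop() method, the samples list, and the Mount and run_control_loop() helpers are all illustrative assumptions rather than the real code.

```python
import threading
import time


class TelemLogger:
    """Sketch: sources stay registered, but polling can be caller-driven."""

    def __init__(self, sources, period=1.0):
        self.sources = sources  # assumed shape: dict of name -> TelemSource
        self.period = period
        self.samples = []       # in-memory stand-in for the database
        self._stop = threading.Event()
        self._thread = None

    def poll_sources(self):
        """Poll every registered source once, in the caller's thread."""
        now = time.time()
        for name, source in self.sources.items():
            self.samples.append((now, name, source.get_telem_channels()))

    def start(self):
        """Optional asynchronous mode: poll periodically from a thread."""
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def _worker(self):
        # Event.wait() returns False on timeout, so this polls once per
        # period until stop() sets the event.
        while not self._stop.wait(self.period):
            self.poll_sources()


class Mount:
    """Hypothetical source whose state changes every control cycle."""

    def __init__(self):
        self.position = 0.0

    def get_telem_channels(self):
        return {'position': self.position}


def run_control_loop(mount, logger, num_cycles):
    """Stand-in for the Tracker loop: one telemetry sample per cycle."""
    for _ in range(num_cycles):
        mount.position += 1.0   # placeholder for real control work
        logger.poll_sources()   # synchronous with the control cycle
```

Programs that never call start() get strictly synchronous telemetry, while a program without a Tracker could keep calling start() as it does today.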

This has been implemented and tested by running the align program. It appears to be working as expected. I chose to use the approach described in my most recent comment, which preserves the asynchronous polling functionality in the TelemLogger class and did in fact require only minimal changes to existing code. For now, in the Tracker object I opted to leave the mutexes and other code related to thread synchronization intact, but those could be removed later.