PimvanderLoos/AnimatedArchitecture

Track Problematic Shutdowns


The Problem

For debugging purposes, it is helpful to know the history of problematic shutdowns such as server crashes. A 'problematic shutdown' in this context refers to any situation where the plugin shuts down without completing the shutdown process.

One example of an issue that may arise after a problematic shutdown is orphaned blocks, as further described in issue #593.

The Implementation

For the implementation, I propose to let the DatabaseManager do most of the heavy lifting for the following reasons:

  • This is the only class with direct access to the database.
  • It is a registered IRestartable, meaning its initialization and shutdown methods will be called when the plugin is initialized/shut down. Specifically, these methods will be called whenever the plugin's onEnable or onDisable methods are called.
  • It is a very 'central' class: it has very few dependencies of its own (none of which are restartable), yet it is a dependency of 15 other classes, some of which are restartable. At the time of writing, it is the third restartable to be registered (preceded only by the config and the LocalizationManager).
    The RestartableHolder respects the registration order of the restartables for initialization and shutdown: it calls IRestartable#initialize() for all registered restartables in the order of registration and IRestartable#shutDown() in reverse order. As a result, any class that depends on the DatabaseManager will be initialized after the DatabaseManager and shut down before it. In other words, the DatabaseManager is one of the last restartables to be shut down.
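To make the ordering argument concrete, here is a minimal sketch of the behavior described above. The types `RestartableSketch`, `RestartableHolderSketch`, and `LoggingRestartable` are hypothetical stand-ins written for this illustration; they are not the plugin's actual IRestartable or RestartableHolder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for IRestartable; the real interface may look different.
interface RestartableSketch {
    void initialize();
    void shutDown();
}

// Hypothetical stand-in for RestartableHolder: initializes in registration
// order and shuts down in reverse registration order.
final class RestartableHolderSketch {
    private final List<RestartableSketch> restartables = new ArrayList<>();

    void register(RestartableSketch restartable) {
        restartables.add(restartable);
    }

    void initializeAll() {
        for (RestartableSketch restartable : restartables) {
            restartable.initialize();
        }
    }

    void shutDownAll() {
        for (int i = restartables.size() - 1; i >= 0; i--) {
            restartables.get(i).shutDown();
        }
    }
}

// Records the order of lifecycle calls so the ordering can be inspected.
final class LoggingRestartable implements RestartableSketch {
    private final String name;
    private final List<String> log;

    LoggingRestartable(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    public void initialize() {
        log.add("init:" + name);
    }

    @Override
    public void shutDown() {
        log.add("stop:" + name);
    }
}
```

With this ordering, anything registered after the DatabaseManager is stopped before it, so the database is still available when the initialization state is written during shutdown.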

The idea is to create three new tables in the database:

  • A table containing only a value to keep track of the current initialization state.
  • A table containing only a value to store the timestamp of the initial creation of the database.
  • A table to keep track of the history of problematic shutdowns.

The tables storing the initial creation timestamp and the current initialization state may be combined into a single 'metadata' table.

Whenever DatabaseManager#shutDown() is called, the DatabaseManager should tell the database to set the initialization state to false after shutting down the thread pool.

Whenever DatabaseManager#initialize() is called, the DatabaseManager should tell the database to set the initialization state to true and return the previous initialization state before creating a new thread pool. If the previous initialization state is not false, we can infer that the previous shutdown of the plugin was problematic.

After detecting a problematic shutdown, the DatabaseManager should instruct the database to store the timestamp at the time of detection. While this is not the time of the problematic shutdown itself, it should be close enough, as any time between the actual problematic shutdown and the subsequent startup is 'dead' time during which nothing can happen regardless.
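The detection flow above could look roughly like the following sketch. It uses an in-memory stand-in for the database so that it is self-contained; `InMemoryStateStore`, `DatabaseManagerSketch`, and all method names are hypothetical and only illustrate the get-and-set idea. It also assumes that a freshly created database starts with the state set to false, so that the very first startup is not flagged.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical in-memory stand-in for the persistent metadata and history tables.
final class InMemoryStateStore {
    // Assumption: a fresh database starts in the 'not initialized' state.
    private final AtomicBoolean initialized = new AtomicBoolean(false);
    private final List<Instant> problematicShutdowns = new ArrayList<>();

    // Stores the new state and returns the previous one.
    boolean getAndSetInitialized(boolean newState) {
        return initialized.getAndSet(newState);
    }

    synchronized void recordProblematicShutdown(Instant detectedAt) {
        problematicShutdowns.add(detectedAt);
    }

    synchronized List<Instant> history() {
        return List.copyOf(problematicShutdowns);
    }
}

// Hypothetical sketch of the relevant part of the DatabaseManager lifecycle.
final class DatabaseManagerSketch {
    private final InMemoryStateStore store;

    DatabaseManagerSketch(InMemoryStateStore store) {
        this.store = store;
    }

    // Returns true if the previous shutdown was problematic.
    boolean initialize(Instant now) {
        boolean previousState = store.getAndSetInitialized(true);
        if (previousState) {
            // The state was never reset, so shutDown() did not complete last time.
            store.recordProblematicShutdown(now);
        }
        // ... the thread pool would be created here ...
        return previousState;
    }

    void shutDown() {
        // ... the thread pool would be shut down here ...
        store.getAndSetInitialized(false);
    }
}
```

A crash can be simulated by creating a second `DatabaseManagerSketch` on the same store without calling `shutDown()` on the first; the second `initialize` call then returns true and records the detection timestamp.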

The database will need several new methods:

  • A method to get-and-set the initialization state. This should return the previous initialization state to allow for further processing.
  • A method to check if any problematic restarts occurred between two timestamps (and a default method for between a given timestamp and 'now'). This should return an enum to denote any of the following states:
    • No problematic restarts
    • At least one problematic restart
    • Before database creation (so we can track issues where a user deleted the database)
  • Methods to retrieve database creation date, problematic restart history, and the most recent problematic restart
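One possible shape for these methods is sketched below. The interface name, method names, and enum constants are hypothetical and not taken from the existing code base. The range check is written in plain Java over the history list to keep the example self-contained; a real implementation would more likely run a query against the history table. The sketch also assumes the range is inclusive on both ends.

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;

// Hypothetical result type for the range check described above.
enum ProblematicRestartStatus {
    NONE,
    AT_LEAST_ONE,
    BEFORE_DATABASE_CREATION
}

// Hypothetical additions to the database abstraction.
interface ShutdownHistoryStore {
    // Stores the new state and returns the previous one.
    boolean getAndSetInitializationState(boolean newState);

    Instant getDatabaseCreationTime();

    List<Instant> getProblematicRestartHistory();

    default Optional<Instant> getMostRecentProblematicRestart() {
        return getProblematicRestartHistory().stream().max(Instant::compareTo);
    }

    // Checks the inclusive range [from, to].
    default ProblematicRestartStatus checkProblematicRestarts(Instant from, Instant to) {
        if (from.isBefore(getDatabaseCreationTime())) {
            // Nothing is known about this period, e.g. because the database was deleted.
            return ProblematicRestartStatus.BEFORE_DATABASE_CREATION;
        }
        for (Instant detectedAt : getProblematicRestartHistory()) {
            if (!detectedAt.isBefore(from) && !detectedAt.isAfter(to)) {
                return ProblematicRestartStatus.AT_LEAST_ONE;
            }
        }
        return ProblematicRestartStatus.NONE;
    }

    // Convenience overload for the range between a given timestamp and 'now'.
    default ProblematicRestartStatus checkProblematicRestartsSince(Instant from) {
        return checkProblematicRestarts(from, Instant.now());
    }
}
```

Returning an enum rather than a boolean keeps the 'before database creation' case distinct, so callers cannot mistake missing history for a clean history.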

Special Considerations

  • Currently, there is no way to gather debug data asynchronously, which means we cannot get any data/statistics from the database for the debug report. Until that changes, the DatabaseManager should store:

    • Whether the very first initialization was preceded by a problematic shutdown
    • Whether any initialization during the current run was preceded by a problematic shutdown

    These values are likely going to be the same in most cases, but the second one could catch any issues with problematic shutdowns from plugin restarts. While these should never occur, it's nice to catch them regardless.
    More debug items may be added if any come up that could provide useful data.
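
As a rough illustration of the two in-memory values, the DatabaseManager could keep a small holder like the one below and read it synchronously when the debug report is built. The class `ShutdownDebugState` and its method names are hypothetical.

```java
// Hypothetical holder for the two debug values kept in memory by the DatabaseManager.
final class ShutdownDebugState {
    private boolean hasInitialized = false;
    private boolean firstInitializationProblematic = false;
    private boolean anyInitializationProblematic = false;

    // Called on every initialization with the result of the get-and-set check.
    synchronized void onInitialize(boolean previousShutdownWasProblematic) {
        if (!hasInitialized) {
            // Only the very first initialization of this run sets this value.
            firstInitializationProblematic = previousShutdownWasProblematic;
            hasInitialized = true;
        }
        anyInitializationProblematic |= previousShutdownWasProblematic;
    }

    synchronized boolean wasFirstInitializationProblematic() {
        return firstInitializationProblematic;
    }

    synchronized boolean wasAnyInitializationProblematic() {
        return anyInitializationProblematic;
    }

    // Synchronous, so it can be included in the debug report without database access.
    synchronized String toDebugString() {
        return "firstInitializationProblematic=" + firstInitializationProblematic
            + ", anyInitializationProblematic=" + anyInitializationProblematic;
    }
}
```

The two values diverge only when a plugin restart during the current run follows an incomplete shutdown, which is exactly the case the second value is meant to catch.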