caribpa/partgrator

Next Major Release Code Rework

caribpa opened this issue · 0 comments

To support main partgrator partitions of different sizes on the same device, partgrator must allow multiple main partgrator partitions as well as multiple partgrator partitions. For that, the main logic for processing partitions in partgrator-post needs to change.

The original design was based on a partly wrong understanding of when the pre-mount and post-mount scripts are called after the system detects a partition, and some unnecessary mechanisms were implemented as a result. The misunderstanding comes from the fact that one can trigger the system's auto-mount mechanism for all devices by running udevtrigger --subsystem-match=block from inside pre-mount and/or post-mount, which results in two instances of these scripts being executed simultaneously with different partitions.

This behavior doesn't happen when a device is detected normally. For example, plugging in a two-partition USB drive results in pre-mount being executed first for one of the partitions (it is not guaranteed that partition one is the first to run the scripts). In almost all cases post-mount is executed next, but the other partition may also run pre-mount right after the first partition finishes its pre-mount instance and before it runs post-mount. What cannot happen is both partitions running pre-mount or post-mount at the same time.

As a result, the functions recover_partitions() and migrate_partitions() carry mechanisms in their loops that try to process partitions being detected simultaneously (running pre-mount), even though the original idea behind these loops was to read $partitions_status into memory once and process its lines, minimizing reads of the file.

As partgrator grew, more abstractions were needed and now they can facilitate the changes below:

  • partgrator partitions no longer need to be added to the $partitions_status file (partgrator-pre); they exit partgrator-pre and partgrator-post immediately after checking that they don't contain a .partgrator file in their root (otherwise the recovery starts)
  • recover_partitions() and migrate_partitions() no longer need to loop through the content of $partitions_status
  • The main mechanism for detecting the state and processing a corrupted main partgrator partition in partgrator-post is:
    • Check that the root folder contains a .partgrator file and exit if not (in that case they are main non-partgrator partitions). This condition can be the same as the one described in the first point about partgrator partitions
    • Find the partition of the mountpoint passed as parameter (use get_mountpoint_partition())
    • Retrieve the error code and the label from $partitions_status; if they are not found, issue a warning and exit (according to the first step, these are main partgrator partitions)
    • Loop through every partition other than the current one, using get_partition_label() to look for either one with the same label (and start the duplicated-label recovery) or one with a _partgrator label (or whatever the variable $partgrator_label is set to) that is the same size as the current one (use blockdev --getsize64). Except for the size check and the duplicated labels, this is essentially what experimental_reprocess_partitions() does
    • Continue with the recover_partitions() and migrate_partitions() functions to process and migrate the partition of the mountpoint passed as parameter, if needed
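
The partition scan in the steps above could look roughly like the sketch below. This is only an illustration under assumptions: find_counterpart and the "partition size label" input format are hypothetical, and the real script would gather that data with get_partition_label() and blockdev --getsize64 instead of reading it from stdin.

```shell
#!/bin/sh
# Label that marks a partgrator partition (default taken from the issue text).
partgrator_label="${partgrator_label:-_partgrator}"

# find_counterpart CURRENT LABEL SIZE   (hypothetical helper)
# Reads "partition size label" lines from stdin. The label comes last so that
# labels containing spaces survive the read.
# Prints "duplicate <partition>" or "partgrator <partition>" for the first
# match and returns 0; returns 1 when nothing matches.
find_counterpart() {
    current=$1
    label=$2
    size=$3
    while read -r part part_size part_label; do
        # Skip the partition that is being processed.
        [ "$part" = "$current" ] && continue
        # Another partition with the same label: duplicated-label recovery.
        if [ "$part_label" = "$label" ]; then
            echo "duplicate $part"
            return 0
        fi
        # A partgrator partition only counts if its size matches.
        if [ "$part_label" = "$partgrator_label" ] && [ "$part_size" = "$size" ]; then
            echo "partgrator $part"
            return 0
        fi
    done
    return 1
}
```

The function returns on the first match, so the scan stops as soon as a same-size partgrator partition or a duplicated label is found.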

It still needs to be checked whether a migration is in progress (and exit if so), to prevent another partition from migrating simultaneously if reload_disks() is called from the first partgrator-post before it finishes.
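
One possible way to implement this check is a lock directory, since mkdir either creates the directory or fails, and only one caller can succeed. The sketch below is an assumption rather than how partgrator currently tracks a running migration: the names begin_migration, end_migration and $migration_lock are hypothetical.

```shell
#!/bin/sh
# Hypothetical lock location; any path on a writable filesystem works.
migration_lock="${migration_lock:-/tmp/partgrator-migration.lock}"

# Try to take the lock. Returns 0 only for the caller that created the
# directory; every other caller gets a non-zero status.
begin_migration() {
    mkdir "$migration_lock" 2>/dev/null
}

# Release the lock once the migration has finished.
end_migration() {
    rmdir "$migration_lock" 2>/dev/null
}

# Intended use inside partgrator-post (left as comments here):
#   begin_migration || exit 0    # another migration is in progress
#   trap end_migration EXIT      # release the lock when the script exits
```

A stale lock left behind by an interrupted migration would need separate handling, which this sketch does not cover.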

This procedure also removes the limitation on mounting corrupted partitions, which currently have to wait until a partgrator partition is listed in $partitions_status before any procedure can be performed.

Unfortunately, the procedure cannot finish early when the error code found in $partitions_status is clean, because there's a chance the migration was interrupted and there are duplicated labels. The loop can break early when it finds either a partgrator partition (of the same size as the one being processed) or a partition with a duplicated label (there is no need to keep checking for duplicates, as there cannot be other partitions with the same label).