RFE: Add volume based filesystem objects (btrfs, bcachefs)
UDisks supports Btrfs through a btrfs filesystem object. In Cockpit we noticed
that this isn't enough abstraction to easily work with and represent btrfs (and
maybe other volume-based filesystems as well, such as bcachefs).
In Cockpit we decided to represent btrfs similar to LVM from a UI perspective:
-> btrfs subvolume
-> btrfs volume (the filesystem)
-> btrfs device (backing storage)
-> block device
To represent the usage per device in a multi-device setup we currently parse the output of btrfs filesystem show $uuid, as nothing in UDisks exists to represent a "btrfs device".
To show the mount points per subvolume (outside of btrfs) we parse findmnt --btrfs --json to detect these. This might be a very Cockpit-specific issue.
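The mount-point lookup described above could be sketched roughly as follows. This is only an illustration, not Cockpit's actual code: the function name subvolume_mounts and the SAMPLE document are made up here, and the sketch assumes the usual findmnt JSON layout (a top-level "filesystems" list with "target", "fstype", "options" and nested "children").

```python
import json


def subvolume_mounts(findmnt_json: str) -> dict[str, list[str]]:
    """Map each btrfs subvolume path to the mount points where it is mounted."""
    mounts: dict[str, list[str]] = {}

    def visit(entry: dict) -> None:
        if entry.get("fstype") == "btrfs":
            # The mounted subvolume appears as a "subvol=<path>" mount option.
            for option in entry.get("options", "").split(","):
                if option.startswith("subvol="):
                    subvol = option[len("subvol="):]
                    mounts.setdefault(subvol, []).append(entry["target"])
        # findmnt nests mounts below their parent mount.
        for child in entry.get("children", []):
            visit(child)

    for entry in json.loads(findmnt_json).get("filesystems", []):
        visit(entry)
    return mounts


# Hand-written sample for illustration, not captured from a real system.
SAMPLE = """
{"filesystems": [
  {"target": "/", "source": "/dev/vda5[/root]", "fstype": "btrfs",
   "options": "rw,relatime,subvolid=256,subvol=/root",
   "children": [
     {"target": "/home", "source": "/dev/vda5[/home]", "fstype": "btrfs",
      "options": "rw,relatime,subvolid=257,subvol=/home"}
   ]}
]}
"""

print(subvolume_mounts(SAMPLE))
```

A subvolume that is not mounted anywhere simply never shows up in this mapping, which is part of why a MountPoints property on a subvolume object would be convenient.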
In general we need to keep a lot of global state to track things like the list of blocks per volume.
This seems like a good base for UDisks objects, hopefully re-usable with bcachefs:
design proposal
org.freedesktop.UDisks2.Filesystem.BTRFSSubvolume
Represents a btrfs subvolume as listed by btrfs subvolume list $mountpoint
Methods:
- Delete
- Mount
- Unmount
Properties
- Volume
- path
- level
- id
- gen
- MountPoints
org.freedesktop.UDisks2.Filesystem.BTRFSVolume
Represents a btrfs "filesystem" as shown by btrfs filesystem show
Methods:
- AddDevice(block)
- RemoveDevice(block)
- SetLabel(name)
- Balance() - either spread out metadata or convert a multidevice volume to raid1 (requires jobs). To be done for bcachefs
- CreateSubvolume()
- RemoveSubvolume()
- CreateSnapshot()
- Repair()
- Resize()
- Replace()? (btrfs replace) - replacing a failed device. Seems to work differently in bcachefs. Needs more investigation
Properties:
- Label
- UUID
- Used (used size)
- Size - total size
- MissingDevices - missing devices from a multi-device config
- Configuration - RAID1, RAID2 - tricky because btrfs supports different data and metadata configurations.
Multi device RAID configuration example:
$ btrfs device usage /
Device size: 9.09TiB
Device slack: 3.50KiB
Data,RAID10/4: 4.60TiB
Data,RAID10/2: 18.00GiB
Metadata,RAID10/4: 20.00GiB
System,RAID10/4: 16.00MiB
Unallocated: 4.46TiB
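To illustrate why a single Configuration value is tricky, here is a rough sketch that groups the profiles in output like the above by block group type. The helper name profiles_by_type and the regular expression are assumptions made for this example, and it is exactly the kind of CLI parsing the proposal would like to avoid.

```python
import re

# Matches lines such as "Data,RAID10/4: 4.60TiB" or "Metadata,single: 1.00GiB".
LINE = re.compile(r"^\s*(Data|Metadata|System),([A-Za-z0-9]+)(?:/\d+)?:\s*(\S+)")


def profiles_by_type(usage_output: str) -> dict[str, list[str]]:
    """Collect the distinct profiles reported for each block group type."""
    found: dict[str, set[str]] = {}
    for line in usage_output.splitlines():
        match = LINE.match(line)
        if match:
            block_type, profile, _size = match.groups()
            found.setdefault(block_type, set()).add(profile)
    return {block_type: sorted(profiles) for block_type, profiles in found.items()}


SAMPLE = """\
Device size: 9.09TiB
Device slack: 3.50KiB
Data,RAID10/4: 4.60TiB
Data,RAID10/2: 18.00GiB
Metadata,RAID10/4: 20.00GiB
System,RAID10/4: 16.00MiB
Unallocated: 4.46TiB
"""

print(profiles_by_type(SAMPLE))
```

Because data and metadata can use different profiles, a Configuration property would probably have to be a mapping like this one rather than a single string.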
MissingDevices applies to multi-device setups, see below:
$ btrfs filesystem show
Label: 'fedora-test' uuid: cece4dd8-6168-4c88-a4a8-f7c51ed4f82b
Total devices 3 FS bytes used 2.08GiB
devid 1 size 11.92GiB used 3.56GiB path /dev/vda5
devid 2 size 0 used 0 path /dev/sda MISSING
devid 3 size 512.00MiB used 0.00B path /dev/sdc
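For reference, this is roughly what the per-device parsing looks like today. It is a minimal sketch, not Cockpit's real implementation: the Device class and parse_devices function are hypothetical names, and the regular expression only assumes the devid line format shown in the output above.

```python
import re
from dataclasses import dataclass

# Matches "devid 2 size 0 used 0 path /dev/sda MISSING" style lines.
DEVID = re.compile(
    r"^\s*devid\s+(\d+)\s+size\s+(\S+)\s+used\s+(\S+)\s+path\s+(\S+)(\s+MISSING)?\s*$"
)


@dataclass
class Device:
    devid: int
    size: str
    used: str
    path: str
    missing: bool


def parse_devices(show_output: str) -> list[Device]:
    """Extract one Device per devid line of `btrfs filesystem show` output."""
    devices = []
    for line in show_output.splitlines():
        match = DEVID.match(line)
        if match:
            devid, size, used, path, missing = match.groups()
            devices.append(Device(int(devid), size, used, path, missing is not None))
    return devices


SAMPLE = """\
Label: 'fedora-test' uuid: cece4dd8-6168-4c88-a4a8-f7c51ed4f82b
Total devices 3 FS bytes used 2.08GiB
devid 1 size 11.92GiB used 3.56GiB path /dev/vda5
devid 2 size 0 used 0 path /dev/sda MISSING
devid 3 size 512.00MiB used 0.00B path /dev/sdc
"""

devices = parse_devices(SAMPLE)
print([d.path for d in devices if d.missing])
```

With a BTRFSDevice object and a MissingDevices property in UDisks, none of this text scraping would be needed on the client side.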
org.freedesktop.UDisks2.Filesystem.BTRFSDevice
A device belonging to a "volume" as can be seen in btrfs filesystem show
Methods:
Properties:
- Volume
- Size
- Used
- Path or link to block device?
- Stats (btrfs device stats, bcachefs equivalent unknown)
$ btrfs filesystem show
Label: 'fedora-test' uuid: cece4dd8-6168-4c88-a4a8-f7c51ed4f82b
Total devices 3 FS bytes used 2.08GiB
devid 1 size 11.92GiB used 3.56GiB path /dev/vda5 <--------
All of this would be a ton of work, and careful design to validate the concepts work with bcachefs and btrfs. For btrfs it would be ideal to use libbtrfsutil with libblockdev, if it could add support for getting information about btrfs devices (missing devices, usage), stats and everything else without parsing CLI output.
- For bcachefs-tools, I have submitted an issue for providing a library to interact with.
- For libbtrfsutil, a list of things which are lacking should be collected.
Note that this issue is a very rough draft; there are a lot of open btrfs issues and a multi-device pull request I haven't had time to go through yet (and overall I will have limited time to dedicate until after January).
Thanks for this detailed design proposal! Cc: @cmurf as I don't feel qualified enough for btrfs topology.
You may want to opt for virtual objects (i.e. on the same level as block objects and drive objects), as org.freedesktop.UDisks2.Filesystem is always bound to a specific block object. Multi-disk volumes come to mind; this is somewhat similar in concept to MDRaid objects or LVM logical volumes.
So perhaps the concept of a btrfs volume should be modelled as an org.freedesktop.UDisks2.BTRFSVolumeObject with a single org.freedesktop.UDisks2.BTRFSVolume interface attached to it. The object may then span multiple block objects and would still represent a single filesystem UUID. Just an idea, I might be wrong.
Then I see an issue with instantiation (1:N mapping) of org.freedesktop.UDisks2.Filesystem.BTRFSSubvolume. You may have only a single instance of a D-Bus interface attached to a single D-Bus object. In case a volume provides multiple subvolumes, where do you intend to attach such interfaces?
And for org.freedesktop.UDisks2.Filesystem.BTRFSDevice, is this supposed to act as a PV in LVM terminology?
So if I understand correctly:
- org.freedesktop.UDisks2.Filesystem.BTRFSDevice should be attached to UDisksBlockObject alongside the usual org.freedesktop.UDisks2.Filesystem interface
- org.freedesktop.UDisks2.Filesystem.BTRFSVolume should be a separate UDisksModuleObject, linking multiple object paths providing the org.freedesktop.UDisks2.Filesystem.BTRFSDevice interface
- then org.freedesktop.UDisks2.Filesystem.BTRFSSubvolume would probably need to be another class of UDisksModuleObject linking to a single btrfs volume object. Would be nice to have multiple instances of the org.freedesktop.UDisks2.Filesystem.BTRFSSubvolume interface attached to a single btrfs volume object, that's however not possible with D-Bus AFAIK.
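One way to picture that layout is a plain mapping from object paths to interfaces. This is only a sketch to make the 1:N problem concrete: every btrfs object path below is hypothetical, as are the property values, and nothing here reflects an agreed design.

```python
FS = "org.freedesktop.UDisks2.Filesystem"
ROOT = "/org/freedesktop/UDisks2"

# Hypothetical btrfs object paths; only the block_devices paths follow
# what UDisks exposes today.
VOLUME = ROOT + "/btrfs/fedora_test"

# object path -> {interface name: properties}. Keying by interface name
# mirrors the D-Bus rule that an object carries an interface at most once.
OBJECTS = {
    ROOT + "/block_devices/vda5": {
        FS: {},
        FS + ".BTRFSDevice": {"Volume": VOLUME},
    },
    ROOT + "/block_devices/sdc": {
        FS: {},
        FS + ".BTRFSDevice": {"Volume": VOLUME},
    },
    VOLUME: {
        FS + ".BTRFSVolume": {"Label": "fedora-test"},
    },
    # One object per subvolume, each linking back to its volume.
    VOLUME + "/subvolumes/256": {
        FS + ".BTRFSSubvolume": {"Volume": VOLUME, "id": 256, "path": "root"},
    },
    VOLUME + "/subvolumes/257": {
        FS + ".BTRFSSubvolume": {"Volume": VOLUME, "id": 257, "path": "home"},
    },
}


def linked_to(volume: str, interface: str) -> list[str]:
    """Return the object paths whose `interface` points at `volume`."""
    return sorted(
        path
        for path, interfaces in OBJECTS.items()
        if interfaces.get(interface, {}).get("Volume") == volume
    )


print(linked_to(VOLUME, FS + ".BTRFSSubvolume"))
print(linked_to(VOLUME, FS + ".BTRFSDevice"))
```

Since each object can hold only one BTRFSSubvolume entry, several subvolumes necessarily become several objects that point back at the volume.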
FYI, my old attempt to enhance device identification was #838. It might still happen one day, though it's a large, intrusive change.
Also, in case this turns into an actual implementation and UDisksModuleObjects are used, uevent handling should be done the intended way, through the UDisksModuleObjectIface.process_uevent method. I.e. avoid making global updates like in the lvm2 udisks module; that brings lots of issues and race conditions.
All of this would be a ton of work, and careful design to validate the concepts work with bcachefs and btrfs.
Forget about bcachefs for now. The btrfs object model is a completely separate thing, not affecting the core daemon interfaces. While the resulting object model for bcachefs might be very similar, it would need a separate implementation and a new UDisks module anyway. Let's keep things simple for now.
In Cockpit we decided to represent btrfs similar to LVM from a UI perspective
We are doing this in blivet, and while it makes a lot of sense to replicate the LVM "structure", it also brings some issues, especially for users that use btrfs as a simple filesystem only: they suddenly see a "btrfs volume" and have no idea what that means. We had some bug reports for Anaconda from people who expected to simply reformat btrfs to ext4, which we don't support because with this LVM-like representation the btrfs volume needs to be removed first (like running vgremove before reformatting the PVs). I am not saying you shouldn't do it this way in Cockpit, just that you should expect some issues with representing btrfs like LVM.
The biggest challenge with btrfs would be getting the subvolume information: btrfs needs to be mounted to gather it, and we are definitely not doing this automatically in udisks (we've been there, it was a terrible idea), so the proposed BTRFSSubvolume interface won't be created in most cases.
Worth adding that the standard org.freedesktop.UDisks2.Filesystem and org.freedesktop.UDisks2.Block interfaces will need to keep working as they do now, i.e. without any particular knowledge of filesystem structure. The new functionality should be added as a module (as it's going to be quite expensive I/O-wise) that needs to be explicitly activated first and would only work as an addition on top of the existing interfaces.