lvmteam/lvm2

How to accurately determine actual disk usage in the presence of Logical Volumes spanning multiple physical disks?

Closed this issue · 4 comments

I am inquiring about the best approach to accurately retrieve the real disk usage when a disk is formatted as a Physical Volume and mounted through Logical Volumes. The conventional method using df -h only provides information about the usage of the Logical Volume. How can one obtain the actual disk usage in this scenario?

Example Scenario:
In a situation where the Logical Volume lv2 is mapped to two physical disks, namely sdb and sdc, the typical methods, such as df -h, only reflect the usage information of the Logical Volume. I am currently employing a method that involves reading the Volume Group metadata from the /etc/lvm/backup/ directory, extracting information about the Logical Volume type and its segment details, and then inferring the actual usage on the Physical Volumes based on this data. While theoretically feasible, the implementation of this method is expected to be complex.

Query:
I am seeking advice on whether there is a more straightforward or efficient approach to accomplish this task. Any suggestions or alternative methods would be greatly appreciated.

Thank you for your assistance!

Hi, it's rather unclear what the goal is here.

You should certainly NOT try to write your own parser for the lvm2 metadata format. Simply use the information reported by the 'lvs' command, which has numerous options that can provide a lot of detail.

On the other hand, there is no way to detect the real allocated space of a filesystem spanning multiple disks unless you analyze the filesystem metadata and decode which blocks are allocated on which portion of the device. And I have no idea how you would want to present this sort of information for provisioned volume types like thin/vdo...

IMHO the main issue is that you are trying to 'blend' together information about the filesystem and the volume, and a simple 1-to-1 match is not always possible (e.g. a filesystem block can be shared between multiple volumes).

So I'd probably be happy with the 'df -h' information as a reasonably good approximation of filesystem usage, and if the user is interested in the 'volume info', use another tool for that. Mixing the two together is going to be very expensive to obtain.
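To illustrate the 'volume info' side only, here is a minimal Python sketch that reads per-PV allocation from the JSON report of 'pvs'. It assumes the report was produced with something like 'pvs --reportformat json --units b --nosuffix -o pv_name,pv_size,pv_used'. The function names and the sample data are hypothetical. Note that 'pv_used' is the space allocated to logical volumes on that PV, not the space occupied by files, so this does not answer the filesystem-level question.

```python
import json
import subprocess

# Hypothetical sample of what the pvs JSON report might look like;
# the device names and sizes are made up for illustration.
SAMPLE_REPORT = """
{"report": [{"pv": [
  {"pv_name": "/dev/sda", "pv_size": "53687091200", "pv_used": "10737418240"},
  {"pv_name": "/dev/sdb", "pv_size": "53687091200", "pv_used": "16106127360"}
]}]}
"""


def run_pvs():
    """Run pvs (usually needs root) and return its JSON report as text."""
    cmd = ["pvs", "--reportformat", "json", "--units", "b", "--nosuffix",
           "-o", "pv_name,pv_size,pv_used"]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def pv_allocation(report_json):
    """Return a list of (pv_name, used_bytes, size_bytes, percent) tuples."""
    rows = []
    for section in json.loads(report_json).get("report", []):
        for pv in section.get("pv", []):
            size = int(pv["pv_size"])
            used = int(pv["pv_used"])
            percent = 100.0 * used / size if size else 0.0
            rows.append((pv["pv_name"], used, size, percent))
    return rows


def format_rows(rows):
    """Render the tuples as simple text lines, with sizes shown in GiB."""
    gib = 1024 ** 3
    lines = ["name allocated/size allocated-percent"]
    for name, used, size, percent in rows:
        lines.append(f"{name} {used / gib:.1f}G/{size / gib:.1f}G {percent:.0f}%")
    return lines


if __name__ == "__main__":
    # Uses the sample data; swap in run_pvs() on a real host.
    print("\n".join(format_rows(pv_allocation(SAMPLE_REPORT))))
```

This keeps the two views separate: 'df -h' for filesystem usage, and the LVM report for how much of each PV is allocated to volumes.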

A few more comments: there are filesystems with built-in volume management (e.g. ZFS, Btrfs, ...) that can report more detail, as they always deal with a single set of metadata.

The lvm2 model is different: there is a rather 'strict' segregation between layers, and each layer has its own, often 'internal', metadata format.

Thus 'decoding' the space consumed by an individual filesystem on each PV would be very hard work.

I understand what you're saying, but there is indeed a requirement in our current use case:
We have a host management feature that needs to display both the total usage information of the host's disks and the usage information for each individual disk. However, some disks are formatted as physical volumes, leading to the situation you described where we cannot determine the actual disk usage.
Like this:
name  used/totalSize  used percent
sda   10G/50G         20%
sdb   15G/50G         30%
As you mentioned, parsing LVM-related metadata again would be complex. Perhaps I should communicate with my colleagues to consider altering this part of the requirement.
Thank you for your response.

Closing the issue, since there is likely not much more we can do here.