lvmteam/lvm2

lvresize allows truncating CoW snapshot exception store leading to Data% > 100

Opened this issue · 2 comments

Related to #163.

If lvresize is used to shrink a CoW snapshot's exception store and the LV does not contain a file system recognised by lvm2 the command will allow the exception store to be truncated beyond in-use exceptions. This does not invalidate the snapshot but leads to IO errors on the snapshot device and attempts to access beyond the end of the CoW device:

# lvcreate -n test1 -L 1G fedora
  Logical volume "test1" created.

# lvcreate -s -n test1-snap -L 1G fedora/test1
  Logical volume "test1-snap" created.

# dd if=/dev/zero of=/dev/fedora/test1 bs=1M count=600

# lvs fedora/test1-snap
  LV         VG     Attr       LSize  Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  test1-snap fedora swi-a-s---  1.00g       test1  59.63 

# lvresize -L512M fedora/test1-snap
  No file system found on /dev/fedora/test1-snap.
  Size of logical volume fedora/test1-snap changed from 1.00 GiB (256 extents) to 512.00 MiB (128 extents).
  Logical volume fedora/test1-snap successfully resized.

# lvs fedora/test1-snap
  LV         VG     Attr       LSize   Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  test1-snap fedora swi-a-s--- 512.00m       test1  119.25 <<<

The snapshot now gives IO errors on read:

# cat /dev/fedora/test1-snap > /dev/null 
cat: /dev/fedora/test1-snap: Input/output error

dmseg shows out-of-bounds access to the CoW device:

[68778.674367] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1049096, nr_sectors = 8 limit=1048576
[68778.674398] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048576, nr_sectors = 8 limit=1048576
[68778.674400] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048584, nr_sectors = 8 limit=1048576
[68778.674401] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048592, nr_sectors = 8 limit=1048576
[68778.674402] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048600, nr_sectors = 8 limit=1048576
[68778.674404] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048608, nr_sectors = 8 limit=1048576
[68778.674405] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048616, nr_sectors = 8 limit=1048576
[68778.674406] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048624, nr_sectors = 8 limit=1048576
[68778.674407] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048632, nr_sectors = 8 limit=1048576
[68778.674408] cat: attempt to access beyond end of device
               dm-2: rw=524288, sector=1048640, nr_sectors = 8 limit=1048576
[68778.674488] Buffer I/O error on dev dm-3, logical block 130560, async page read

(dm-2 is fedora-test1--snap-cow, dm-3 is fedora-test1--snap)

It's not clear from a quick read of dm-snap-persistent.c whether the sectors_allocated from the snapshot status line can be relied upon to find the highest-allocated exception in order to check if a shrink is safe. Rejecting all attempts to shrink the CoW device size would be better than the current behaviour.

It's impressive that nothing crashes except the snapshot data.

Well - seems like we allowed some operation that should remain prohibited.

There was likely some idea in the past - which however needs further thinking through -

to support 'lvresize -V' to size 'virtual' volume - which in however needs quite some effort.

This then would allow to resize 'virtual' snapshot size (use visible snapshot volume) - and 'physical' snapshot size (storage for exception store) - allowing users to eventually deal with device size changes - but ATM - this is much easier usable with thin-pool - and dm-snapshot target is in some 'limbo' state so not much progress...