lvresize allows truncating CoW snapshot exception store leading to Data% > 100
Related to #163.
If lvresize is used to shrink a CoW snapshot's exception store and the LV does not contain a file system recognised by lvm2, the command allows the exception store to be truncated below the space already occupied by in-use exceptions. This does not invalidate the snapshot, but it leads to I/O errors on the snapshot device and attempts to access beyond the end of the CoW device:
# lvcreate -n test1 -L 1G fedora
Logical volume "test1" created.
# lvcreate -s -n test1-snap -L 1G fedora/test1
Logical volume "test1-snap" created.
# dd if=/dev/zero of=/dev/fedora/test1 bs=1M count=600
# lvs fedora/test1-snap
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
test1-snap fedora swi-a-s--- 1.00g test1 59.63
# lvresize -L512M fedora/test1-snap
No file system found on /dev/fedora/test1-snap.
Size of logical volume fedora/test1-snap changed from 1.00 GiB (256 extents) to 512.00 MiB (128 extents).
Logical volume fedora/test1-snap successfully resized.
# lvs fedora/test1-snap
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
test1-snap fedora swi-a-s--- 512.00m test1 119.25 <<<
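The jump to a Data% above 100 is consistent with the same amount of allocated exception store being divided by a smaller total. A rough sketch of that arithmetic, assuming Data% is simply allocated space over total CoW space (the function name is made up for illustration):

```python
def data_percent_after_resize(old_percent, old_size_mib, new_size_mib):
    """Estimate Data% after a resize, assuming allocated space is unchanged.

    Assumption: Data% = allocated / total * 100 for the CoW device.
    """
    allocated_mib = old_percent / 100.0 * old_size_mib
    return allocated_mib / new_size_mib * 100.0


# Values taken from the lvs output above: 59.63% of 1.00 GiB, shrunk to 512 MiB.
estimate = data_percent_after_resize(59.63, 1024, 512)
print(f"{estimate:.2f}")
```

The estimate lands within rounding distance of the 119.25 that lvs reports, since the 59.63 shown earlier is itself a rounded value.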
The snapshot now gives IO errors on read:
# cat /dev/fedora/test1-snap > /dev/null
cat: /dev/fedora/test1-snap: Input/output error
dmesg shows out-of-bounds access to the CoW device:
[68778.674367] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1049096, nr_sectors = 8 limit=1048576
[68778.674398] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048576, nr_sectors = 8 limit=1048576
[68778.674400] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048584, nr_sectors = 8 limit=1048576
[68778.674401] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048592, nr_sectors = 8 limit=1048576
[68778.674402] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048600, nr_sectors = 8 limit=1048576
[68778.674404] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048608, nr_sectors = 8 limit=1048576
[68778.674405] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048616, nr_sectors = 8 limit=1048576
[68778.674406] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048624, nr_sectors = 8 limit=1048576
[68778.674407] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048632, nr_sectors = 8 limit=1048576
[68778.674408] cat: attempt to access beyond end of device
dm-2: rw=524288, sector=1048640, nr_sectors = 8 limit=1048576
[68778.674488] Buffer I/O error on dev dm-3, logical block 130560, async page read
(dm-2 is fedora-test1--snap-cow, dm-3 is fedora-test1--snap)
It's not clear from a quick read of dm-snap-persistent.c whether the sectors_allocated value from the snapshot status line can be relied upon to find the highest-allocated exception in order to check whether a shrink is safe. Rejecting all attempts to shrink the CoW device would be better than the current behaviour.
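For illustration only, a check based on the status line might look like the sketch below. It assumes the snapshot target reports `<sectors_allocated>/<total_sectors> <metadata_sectors>` after the target name, and it assumes sectors_allocated is an upper bound on the highest in-use sector, which is exactly the open question above. The function names and the sample status line are hypothetical.

```python
def parse_snapshot_status(line):
    """Return (allocated, total, metadata) sectors from a snapshot status line.

    Raises ValueError when the usage field is not numeric, for example when
    the target reports a state word such as "Invalid" instead.
    """
    fields = line.split()
    idx = fields.index("snapshot")  # ValueError if this is not a snapshot target
    usage = fields[idx + 1]
    if "/" not in usage:
        raise ValueError(f"no usage figures in status: {usage!r}")
    allocated, total = (int(part) for part in usage.split("/"))
    metadata = int(fields[idx + 2])
    return allocated, total, metadata


def shrink_looks_unsafe(line, new_size_sectors):
    """True if the new CoW size is smaller than the allocated sector count.

    ASSUMPTION: sectors_allocated bounds the highest in-use sector.
    """
    allocated, _total, _metadata = parse_snapshot_status(line)
    return new_size_sectors < allocated


# Hypothetical status line, not captured from the system in this report.
sample = "0 2097152 snapshot 1250432/2097152 4896"
print(shrink_looks_unsafe(sample, 1048576))  # shrink to 512 MiB (in 512-byte sectors)
```

If that assumption does not hold, the comparison proves nothing, and refusing every shrink remains the safer choice.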
It's impressive that nothing crashes except the snapshot data.
Well, it seems we allowed an operation that should have remained prohibited.
There was likely an idea in the past, which still needs further thought, to support 'lvresize -V' for resizing the 'virtual' volume, but that needs quite some effort.
This would then allow resizing both the 'virtual' snapshot size (the user-visible snapshot volume) and the 'physical' snapshot size (the storage for the exception store), letting users eventually deal with device size changes. But ATM this is much easier to use with a thin-pool, and the dm-snapshot target is in a 'limbo' state, so there is not much progress...