Throughput is capped on VDO volume
jnkraft opened this issue · 4 comments
Hardware:
Huawei 1288 server, CPU E5-2690 v4 @ 2.60GHz 14c/28t (maxed out via BIOS and other tuning), 8x 8TB Samsung PM893 SATA SSDs.
Software:
AlmaLinux 9.2, KVDO 8.2.1.6 from packages, mdadm RAID10 (disks passed through as-is by the HBA controller).
Test params:
fio --randrepeat=1 --size=40G --name=fiotest --filename=testfio --numjobs=1 --stonewall --ioengine=libaio --direct=1 --bs=4k/64k/1m --iodepth=128 --rw=randwrite/write
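For concreteness, one expansion of that template (the 4k random-write case; the other runs just swap --bs and --rw) is:

fio --randrepeat=1 --size=40G --name=fiotest --filename=testfio --numjobs=1 --stonewall \
    --ioengine=libaio --direct=1 --bs=4k --iodepth=128 --rw=randwrite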
Case 1: LVM on top of RAID, EXT4/XFS on top of LVM.
~500 MB/s, ~130k IOPS at 4k; almost full saturation at 64k and 1m, ~1800 MB/s (~450 MB/s * 4 stripes). No RAID tuning for EXT4; XFS tuned automatically at creation.
Case 2: LVM VDO on top of RAID, EXT4/XFS/NoFS (fio to raw VDO volume) on top of VDO.
All tests are capped at 400-450 MB/s, with IOPS scaling down from that cap accordingly, regardless of block size, random vs. sequential writes, or FS vs. no FS. --numjobs=1 or 32 also makes no difference.
VDO creation options:
# -V 300G: same result with 1-5-10TB
# vdo_write_policy: sync/async/auto make no difference
# vdo_physical_threads: 1-4 make no difference
lvcreate --type vdo -L 100G -V 300G \
  --config 'allocation/vdo_slab_size_mb=32768
            allocation/vdo_block_map_cache_size_mb=4096
            allocation/vdo_use_sparse_index=0
            allocation/vdo_use_metadata_hints=1
            allocation/vdo_use_compression=1
            allocation/vdo_use_deduplication=1
            allocation/vdo_ack_threads=7
            allocation/vdo_bio_threads=28
            allocation/vdo_cpu_threads=14
            allocation/vdo_write_policy=async
            allocation/vdo_hash_zone_threads=14
            allocation/vdo_logical_threads=14
            allocation/vdo_physical_threads=4' \
  -n lv_name storage/vdo_pool /dev/md2
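For reference, a rough sketch of how the live VDO configuration and thread load can be inspected while fio is running (dmsetup and vdostats are standard tools; the kvdo thread-name prefix in the last step is an assumption based on older releases and may differ):

# list the dm-vdo devices and show the target lines with the active thread counts
dmsetup ls --target vdo
dmsetup table | grep vdo

# detailed VDO statistics (dedup/compression savings, bios in/out/queued, etc.)
vdostats --verbose

# watch the per-queue kvdo* kernel threads; if one of them sits at ~100% of a
# single core during the fio run, that queue is the likely bottleneck
top -H -b -n 1 | grep kvdo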
Also created a VDO volume with default options via Alma's Cockpit web GUI; the results are the same.
Reads are not affected, only writes.
It looks like the cap is connected to the write maximum of a single SATA drive, but that may be a false impression.
I also thought it was connected to the terribly slow discards during the mkfs.* stage, but as I understand from reading other issues, discard performance is a known VDO weak point for now.
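(Side note on the slow mkfs: both mkfs tools can skip the discard pass, which at least works around the mkfs delay, though it does nothing for the write cap. Device path assumes the VG/LV names from the lvcreate above.)

mkfs.xfs -K /dev/storage/lv_name              # -K: do not discard blocks at mkfs time
mkfs.ext4 -E nodiscard /dev/storage/lv_name   # same idea for ext4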
I'm out of ideas. I was intending to build a storage server for a lot of VMs with dedup and compression without using ZFS or Btrfs...