rescan-scsi-bus.sh: function "findremapped" is too slow when there are 1K LUNs on a server; is there any way to solve it?
LiuXing108 opened this issue · 5 comments
There is a nested loop in the function "findremapped".
We use "scsi-rescan -f -u -m" to scan all devices.
while read -r hctl sddev id_serial_old ; do
remapped=0
……
# If udev events updated the disks already, but the multipath device isn't update
# check for old devices to make sure we found remapped luns
if [ -n "$mp_enable" ] && [ $remapped -eq 0 ]; then
findmultipath "$sddev" $id_serial
if [ $? -eq 1 ] ; then
remapped=1
fi
fi
……
done < $tmpfile
I have been thinking about this one but have not found any way to make it substantially faster. I read that in environments using Unicode, adding 'export LC_ALL="C"' at the top of the script can speed up calls to standard Unix string-searching utilities (e.g. grep). My locale is en_CA.UTF-8 (i.e. a Unicode locale) and that export made a small improvement. I'm open to ideas from others. Also, could you quantify "too slow"?
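One rough way to see how much the locale matters on a given machine is to time the same grep with and without LC_ALL=C. This is only a sketch: the sample data and the `count_matches` helper are made up for illustration, bash is assumed, and the size of the gain depends on the grep version and the input.

```shell
#!/bin/bash
# Sketch: sample data and helper name are illustrative, not from rescan-scsi-bus.sh.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT

# Fake lines shaped like "name maj:min dm-N uuid"
i=1
while [ "$i" -le 20000 ]; do
    echo "mpath$i 8:$i dm-$i uuid$i"
    i=$((i + 1))
done > "$sample"

count_matches() {
    # $1: value for LC_ALL; prints the number of lines containing the word 8:12345
    LC_ALL="$1" grep -c -w "8:12345" "$sample"
}

echo "current locale: ${LC_ALL:-${LANG:-unset}}"
time grep -c -w "8:12345" "$sample"   # whatever locale the shell is using
time count_matches C                  # forced C locale
```

A single grep over a small file will show little difference; the effect only adds up when the utility is called thousands of times, as in this script.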
We found O(n^2) behavior in this function, which comes from the "while" loop and the loop inside "findmultipath". We run the script as "scsi-rescan -f -u -m", so if each LUN has m paths, the number of loop iterations is m * 1K * 1K.
We now use a temporary file to record the multipath information in advance:
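To put a number on that count, here is a small sketch of the arithmetic. The function name is illustrative and the value of 4 paths per LUN is an assumption chosen only for the example.

```shell
#!/bin/bash
# Sketch of the iteration count; the function name is illustrative.
total_iterations() {
    local luns=$1 paths=$2
    local outer=$((luns * paths))   # one pass of the "while" loop per sd device
    local inner=$luns               # findmultipath walks every multipath device
    echo $((outer * inner))
}

total_iterations 1000 4   # 1K LUNs, assuming 4 paths each: prints 4000000
```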
getallmultipathinfo()
{
local mp=
local uuid=
local dmtmp=
local maj_min=
local tmpfile=
for mp in $($DMSETUP ls --target=multipath | cut -f 1) ; do
[ "$mp" = "No" ] && break;
maj_min=$($DMSETUP status "$mp" | cut -d " " -f14)
if [ ! -L /dev/mapper/${mp} ]; then
echo "softlink /dev/mapper/${mp} not available."
continue
fi
local ret
ret=$(readlink "/dev/mapper/$mp" 2>/dev/null)  # assigned separately so that $? is readlink's status, not that of 'local'
if [[ $? -ne 0 || -z "$ret" ]]; then
echo "readlink /dev/mapper/$mp failed. check multipath status."
continue
fi
dmtmp=$(basename $ret)
uuid=$(cut -f2 -d- "/sys/block/$dmtmp/dm/uuid")
echo "$mp $maj_min $dmtmp $uuid" >> $TMPLUNINFOFILE
done
}
findmultipath(){
……
maj_min=$(cat "/sys/block/$dev/dev")
mp=$(grep -w "$maj_min" "$TMPLUNINFOFILE" | cut -d " " -f1)
if [ -n "$mp" ]; then
if [ -n "$find_mismatch" ] ; then
uuid=$(grep -w "$maj_min" "$TMPLUNINFOFILE" | cut -d " " -f4)
if [ "$find_mismatch" != "$uuid" ] ; then
addmpathtolist "$mp"
found_dup=1
fi
else
# Normal mode: Find the first multipath with the sdev
# and add it to the list
addmpathtolist "$mp"
return
fi
fi
……
}
findremapped(){
……
udevadm_settle > /dev/null 2>&1
echo "Done"
getallmultipathinfo
# See what changed and reload the respective multipath device if applicable
while read -r hctl sddev id_serial_old ; do
……
}
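The lookup in findmultipath above still starts grep and cut once or twice per sd device. One possible refinement, which is not part of the posted patch, would be to read the temporary file once into bash associative arrays keyed by maj:min. The names `load_mpath_table`, `MP_NAME_BY_DEV`, and `MP_UUID_BY_DEV` are hypothetical, the sample data is made up, and bash 4 or later is assumed.

```shell
#!/bin/bash
# Sketch only: the names below are hypothetical and not part of rescan-scsi-bus.sh.
declare -A MP_NAME_BY_DEV   # maj:min -> multipath map name
declare -A MP_UUID_BY_DEV   # maj:min -> dm uuid

load_mpath_table() {
    # $1: file with lines "name maj:min dm-N uuid", as written by getallmultipathinfo
    local name maj_min dm uuid
    while read -r name maj_min dm uuid; do
        [ -n "$maj_min" ] || continue
        MP_NAME_BY_DEV["$maj_min"]=$name
        MP_UUID_BY_DEV["$maj_min"]=$uuid
    done < "$1"
}

# Example with made-up data
table=$(mktemp)
printf '%s\n' "mpatha 8:16 dm-0 3600a0b80001234" \
              "mpathb 8:32 dm-1 3600a0b80005678" > "$table"
load_mpath_table "$table"
rm -f "$table"

key="8:16"
echo "name for $key: ${MP_NAME_BY_DEV[$key]:-none}"
key="8:32"
echo "uuid for $key: ${MP_UUID_BY_DEV[$key]:-none}"
```

Each lookup is then a shell-internal hash access with no extra process, at the cost of requiring a bash version with associative arrays.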
"Also could you quantify "too slow"?": I will e-mail you a comparison of the results later on.
Looks good. I think you need something like 'truncate -s 0 $TMPLUNINFOFILE' at the start of getallmultipathinfo() since rescan-scsi-bus.sh may be called more than once. Look forward to your timings report showing a significant improvement.
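The reset suggested above might look roughly like this. It is a sketch rather than the committed change: `reset_lun_info_file` is a hypothetical helper, and the mktemp-based path is an assumed stand-in for however the script actually defines `TMPLUNINFOFILE`.

```shell
#!/bin/bash
# Sketch: helper name and file location are assumptions for illustration.
TMPLUNINFOFILE=$(mktemp)
trap 'rm -f "$TMPLUNINFOFILE"' EXIT

reset_lun_info_file() {
    # Empty the cache so that entries from an earlier run are not matched again.
    # ': > file' has the same effect as 'truncate -s 0 file' without an external tool.
    : > "$TMPLUNINFOFILE"
}

echo "stale 8:0 dm-9 old-uuid" >> "$TMPLUNINFOFILE"   # leftover from a previous run
reset_lun_info_file
if [ -s "$TMPLUNINFOFILE" ]; then
    echo "cache still has data"
else
    echo "cache is empty"
fi
```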
We tested with "rescan-scsi-bus -f -u -m" on the original and the optimized script. All SCSI devices had been mapped to dm devices.
Here is the comparison of time spent before and after optimization:
| Number of LUNs | Number of paths | Before optimization | After optimization |
|---|---|---|---|
| 128 | 16 | 213m21s | 2m35s |
| 256 | 16 | 967m54s | 5m27s |
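For reference, the table works out to a speed-up of roughly 83 times for 128 LUNs and roughly 178 times for 256 LUNs. The sketch below does that arithmetic; the helper names are illustrative and bash plus awk are assumed.

```shell
#!/bin/bash
# Sketch: helper names are illustrative.
to_seconds() {
    # Convert a duration such as "213m21s" to seconds
    local min=${1%%m*}
    local sec=${1#*m}
    sec=${sec%s}
    echo $((10#$min * 60 + 10#$sec))
}

speedup() {
    # $1: time before, $2: time after; prints the ratio with one decimal place
    awk -v b="$(to_seconds "$1")" -v a="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", b / a }'
}

speedup 213m21s 2m35s   # 128 LUNs: prints 82.6
speedup 967m54s 5m27s   # 256 LUNs: prints 177.6
```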
We found that most of the time was spent on these two lines of the original script:
findmultipath()
{
……
mp2=$($MULTIPATH -l "$mp" | egrep -o "dm-[0-9]+")
mp2=$(cut -f2 -d- "/sys/block/$mp2/dm/uuid")
……
}
Each execution of these two lines took almost 5 seconds with 128 LUNs * 16 paths, and 10 to 15 seconds with 256 LUNs * 16 paths. The outer loop runs about 2K or 4K times, respectively.
After the optimization, we no longer need these two lines.
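As a rough cross-check, multiplying the per-pass cost by the number of outer-loop passes gives estimates in the same range as the measured totals (213m21s and 967m54s), which supports the view that these two lines dominated the run time. The function name below is illustrative, and the per-pass times are the approximate figures quoted above.

```shell
#!/bin/bash
# Sketch: back-of-the-envelope estimate, using integer minutes.
estimate_minutes() {
    # $1: LUNs, $2: paths per LUN, $3: seconds per pass through the two lines
    echo $(( $1 * $2 * $3 / 60 ))
}

estimate_minutes 128 16 5    # 2048 passes at ~5 s:  prints 170
estimate_minutes 256 16 10   # 4096 passes at ~10 s: prints 682
estimate_minutes 256 16 15   # 4096 passes at ~15 s: prints 1024
```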
I have placed this patch in svn revision 973, now mirrored. Perhaps you could check whether that has been done accurately and that the impressive speed-up is still present.