iXsystems/cinder

Fail gracefully if target/lun count is above 256 for CORE

Opened this issue · 4 comments

It depends on the kern.cam.ctl.max_ports tunable.

Noted! Starting investigation.

A first round of assessment:
After setting kern.cam.ctl.max_luns=4, kern.cam.ctl.max_ports=4, and ctl_load=YES in the TrueNAS CORE 12 tunables, I can see those variables updated after reboot, yet I can still create more than 5 volumes using the cinder driver. With the FreeBSD default of kern.cam.ctl.max_ports=256 left unchanged, I can create more than 256 volumes using the cinder driver. Using iscsiadm from the client OS, I can successfully attach more than 5 iSCSI target sessions, and even more than 256.
I am further looking into any negative effects of kern.cam.ctl.max_ports and kern.cam.ctl.max_luns on OpenStack operations.
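
For reference, here is a minimal sketch of how those limits can be read back on the TrueNAS host, assuming shell access and that the CTL tunables are exposed via sysctl; the `read_ctl_tunable` helper is purely illustrative and not part of the driver:

```python
import subprocess


def read_ctl_tunable(name: str) -> int:
    """Read an integer CTL tunable via sysctl on the TrueNAS/FreeBSD host."""
    # Assumes this runs on the host itself (or is wrapped in SSH/API access);
    # adjust the invocation to your environment.
    out = subprocess.run(
        ["sysctl", "-n", name],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())


if __name__ == "__main__":
    max_ports = read_ctl_tunable("kern.cam.ctl.max_ports")
    max_luns = read_ctl_tunable("kern.cam.ctl.max_luns")
    print(f"kern.cam.ctl.max_ports={max_ports} kern.cam.ctl.max_luns={max_luns}")
```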

Here is a summary of the existing TrueNAS cinder driver behavior under different kern.cam.ctl.max_ports/kern.cam.ctl.max_luns configurations:

  1. When the total number of OpenStack TrueNAS volumes exceeds kern.cam.ctl.max_ports/kern.cam.ctl.max_luns, creating or deleting a TrueNAS volume from OpenStack still succeeds.
  2. When the total number of attached OpenStack TrueNAS volumes exceeds kern.cam.ctl.max_ports or kern.cam.ctl.max_luns, the OpenStack attach or detach action times out and fails.

Looking into a resolution for this issue.

An update:

The actual attach/detach volume timeout happens in the upstream cinder code here:
https://github.com/openstack/cinder/blob/392e27aa950374041fbfc827a160f835fd438e70/cinder/volume/driver.py#L1129
The actual exception is then thrown further down, in upstream os-brick, here:
https://github.com/openstack/os-brick/blob/a519dd8d07a65896b6151087c6b38b5294129bb6/os_brick/initiator/connectors/iscsi.py#L505

A possible solution that does not touch upstream code is to check that the cinder attached-volume count is below kern.cam.ctl.max_ports/kern.cam.ctl.max_luns here, before returning the connection metadata upstream, and to fail gracefully otherwise:

```python
def initialize_connection(self, volume, connector):
```
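
A rough sketch of what that guard could look like is below. The helpers `_get_ctl_limits()` and `_count_attached_volumes()` are hypothetical and would need to be implemented by the driver (for example via the TrueNAS API or sysctl on the host), the `_get_iscsi_properties()` call stands in for the driver's existing return path, and the choice of exception is an assumption, not the final fix:

```python
from cinder import exception
from cinder.i18n import _


def initialize_connection(self, volume, connector):
    """Return iSCSI connection info, failing gracefully at the CTL limits."""
    # Hypothetical helpers: the real driver would query these values
    # from the TrueNAS host (e.g. via its API or sysctl).
    max_ports, max_luns = self._get_ctl_limits()
    attached = self._count_attached_volumes()

    # Refuse the attach up front instead of letting os-brick time out
    # once the CTL target/LUN limit is exceeded.
    if attached >= min(max_ports, max_luns):
        msg = _('Attached volume count %(n)d has reached the CTL limit '
                '(kern.cam.ctl.max_ports=%(p)d, kern.cam.ctl.max_luns=%(l)d).'
                ) % {'n': attached, 'p': max_ports, 'l': max_luns}
        raise exception.VolumeDriverException(message=msg)

    # ... existing code that builds and returns the connection properties ...
    return {
        'driver_volume_type': 'iscsi',
        'data': self._get_iscsi_properties(volume),
    }
```

Failing fast here would surface a clear error to the user instead of the os-brick login timeout linked above.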

Working on the actual code fix.