miketeo/pysmb

ValueError: filedescriptor out of range in select()

itssimon opened this issue · 2 comments

In a long-running process (e.g. a Celery worker), the following error occurs occasionally, indicating that a file descriptor is out of range for select().

ValueError: filedescriptor out of range in select()
  File "download.py", line 292, in _smb_connect
    smb.connect(ip_address)
  File "smb/SMBConnection.py", line 122, in connect
    self._pollForNetBIOSPacket(timeout)
  File "smb/SMBConnection.py", line 590, in _pollForNetBIOSPacket
    ready, _, _ = select.select([ self.sock.fileno() ], [ ], [ ], timeout)

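select() can only watch file descriptors whose numeric value is below FD_SETSIZE, which is typically 1024 on Unix systems. So the limit is hit as soon as the socket is assigned a descriptor number of 1024 or higher, which can happen in a process that keeps many descriptors open.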
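The limit can be reproduced without opening a thousand sockets. The following is a minimal sketch, assuming a Unix system and Python 3; the helper name select_rejects_high_fd and the descriptor number 2000 are made up for illustration. It duplicates one socket onto a high descriptor number and passes that number to select(). It temporarily raises the soft RLIMIT_NOFILE limit if needed and returns None when that is not possible.

```python
import os
import resource
import select
import socket


def select_rejects_high_fd(target_fd=2000):
    """Return True if select() rejects descriptor number target_fd.

    Returns None if the process is not allowed to open a descriptor
    that high. Assumes target_fd is not already in use.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    inf = resource.RLIM_INFINITY
    if soft != inf and soft <= target_fd:
        if hard != inf and hard <= target_fd:
            return None  # hard limit too low, cannot run the check here
        # Raise the soft limit just enough to allow target_fd
        resource.setrlimit(resource.RLIMIT_NOFILE, (target_fd + 1, hard))
    a, b = socket.socketpair()
    try:
        # Duplicate the socket onto a descriptor number above FD_SETSIZE
        os.dup2(a.fileno(), target_fd)
        try:
            select.select([target_fd], [], [], 0)
            return False
        except ValueError:
            # "filedescriptor out of range in select()"
            return True
        finally:
            os.close(target_fd)
    finally:
        a.close()
        b.close()
        # Restore the original limits
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```

Only one extra descriptor is open when select() fails here, which shows that the descriptor's number matters, not how many descriptors are being watched.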

Below is a patched SMBConnection class that uses poll() instead of select() to work around this limitation. However, poll() is not available on every platform (Windows does not provide it), so a PR that fixes this properly would have to pick between the two methods depending on the OS.

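import errno
import select
import struct
import time

from smb.base import NotConnectedError, SMBTimeout
from smb.SMBConnection import SMBConnection


class PatchedSMBConnection(SMBConnection):
    def _pollForNetBIOSPacket(self, timeout: int) -> None:
        expiry_time = time.time() + timeout
        # NetBIOS session header: type (1 byte), flags (1 byte), length (2 bytes)
        data = self._read(4, expiry_time)
        _, flags, length = struct.unpack(">BBH", data)
        if flags & 0x01:
            # The low flag bit extends the length field to 17 bits
            length = length | 0x10000
        data += self._read(length, expiry_time)
        self.feedData(data)

    def _read(self, length: int, expiry_time: float) -> bytes:
        # poll() has no FD_SETSIZE limit, unlike
        # select.select([ self.sock.fileno() ], [ ], [ ], timeout)
        poller = select.poll()
        poller.register(self.sock, select.POLLIN)
        data = b""
        while length > 0:
            remaining = expiry_time - time.time()
            if remaining <= 0:
                raise SMBTimeout
            try:
                # poll() takes its timeout in milliseconds; wait at least 1 ms
                # but never longer than the time left until expiry
                ready = poller.poll(max(1, int(remaining * 1000)))
                if not ready:
                    raise SMBTimeout
                d = self.sock.recv(length)
            except OSError as e:
                # select.error is an alias of OSError on Python 3, so check
                # e.errno; retry only on transient errors
                if e.errno not in (errno.EINTR, errno.EAGAIN):
                    raise
                continue
            if len(d) == 0:
                raise NotConnectedError
            data += d
            length -= len(d)
        return data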
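For the OS-dependent fallback mentioned above, here is a rough sketch of a helper that prefers poll() and falls back to select() where poll() is missing. The function name wait_readable is hypothetical and not part of pysmb; it only shows the shape such a PR could take.

```python
import select
import socket


def wait_readable(sock: socket.socket, timeout: float) -> bool:
    """Return True if sock has data to read within timeout seconds.

    Hypothetical helper, not part of pysmb.
    """
    timeout = max(0.0, timeout)
    if hasattr(select, "poll"):
        # poll() is not limited by FD_SETSIZE; its timeout is in milliseconds
        poller = select.poll()
        poller.register(sock, select.POLLIN)
        return bool(poller.poll(int(timeout * 1000)))
    # Platforms without poll() (e.g. Windows): fall back to select()
    ready, _, _ = select.select([sock], [], [], timeout)
    return bool(ready)
```

A call such as wait_readable(self.sock, remaining) could then replace the direct poll() call inside _read. The standard library's selectors module makes a similar choice automatically through selectors.DefaultSelector, which may be the simpler option.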

Just wanted to document this issue and workaround here to start with. If anyone wants to work on a PR that uses the above code, feel free to do so.

@itssimon : Thanks for your code. I believe your situation is unique as most applications won't have 1000+ open file descriptors floating around.
I would flag your issue as an improvement. Someone else might find your code useful if they encounter the same issue.

+1 on this issue.

I am running into this as well. Unfortunately, this specific ticket didn't come up in my Google searches, but I did find prompt-toolkit/python-prompt-toolkit#354, where someone posted a fairly similar fix.