python/asyncio

socket.getservbyname() in _ipaddr_info could block

dwfreed opened this issue · 5 comments

#354 added a call to socket.getservbyname() in _ipaddr_info() to resolve service names into port numbers. However, local system configuration could result in getservbyname() blocking for some time, especially if the system is configured to query something like NIS or use the old hesiod protocol to look up this information. Even in the default configuration, if the disk holding /etc/services is unhappy with life, attempting to read /etc/services could take some time. This would bring the whole event loop to a grinding halt. getaddrinfo() (usually) will handle non-numeric port strings so long as AI_NUMERICSERV is not used, so it would probably be best to just let getaddrinfo() handle it.

(While looking into this, I discovered the Python getservbyname() implementation is not thread-safe, but that's a different bug...)

@ajdavis Any thoughts on this? Didn't this code change a few times? Personally I think getservbyname() is a silly API and apps should just pass integers around, and I wish PEP 3156 had specified that. For apps that pass ints there's no worry, since getservbyname() is only called when int(port) fails. Finally, how do we know that getaddrinfo() doesn't have the same blocking behavior?

@gvanrossum getaddrinfo() is run in an executor, so whether it blocks or not, the event loop isn't affected.

This discussion is getting a little circular. =) Here's more background than you asked for:

Yes, "getaddrinfo() is run in an executor, so whether it blocks or not, the event loop isn't affected." That's true, but until CPython issues 25924 and 26406 were fixed, getaddrinfo calls competed for a special getaddrinfo lock on Mac OS or BSD, so they could block each other. Hence, we've been trying to avoid using getaddrinfo to check or transform already-resolved hosts in asyncio. If the user tells asyncio to connect to "127.0.0.1", we shouldn't block on the getaddrinfo lock waiting to check whether that's a numeric IP or not, we should check some other way. That work was originally in #302 with some updates in #349, #354, and #357.

I've argued recently we should revert all this now that the latest Pythons have removed the getaddrinfo lock, but we didn't resolve that argument.

Ending the background recap and focusing on the discussion at hand:

I can see a simple change in _ensure_resolved in base_events.py that would, as you say, use getaddrinfo if port is not numeric.

Justification, for future readers' sake:

  • This change now means a slow call to getservbyname is running on a background thread, so it doesn't block other work on the event loop
  • Slow calls to getaddrinfo don't block other getaddrinfo calls now, because of CPython issues 25924 and 26406
  • This change simplifies create_connection by removing some of the complex code added in the PRs I noted above