grpc/grpc-swift

Connection timeout is ignored when retries are set to none

afeick opened this issue · 2 comments

Describe the bug

Connection timeout is ignored on last retry or when retries are set to none.

ConnectionManager.startConnecting uses a ConnectionBackoffIterator to determine the connectTimeout here:
Sources/GRPC/ConnectionManager.swift#L988

Unfortunately when the ConnectionBackoffIterator is on its last (or only) try, it returns nil for the next value here:
Sources/GRPC/ConnectionBackoff.swift#L140

That means the connection timeout is the bootstrap default, which in my use case is 10 seconds:
swift-nio-transport-services/Sources/NIOTransportServices/NIOTSConnectionBootstrap.swift#L57

To reproduce

            let connection = ClientConnection
                .usingTLSBackedByNIOSSL(on: eventLoop)
                .withConnectionTimeout(minimum: TimeAmount.seconds(1))
                .withConnectionBackoff(retries: ConnectionBackoff.Retries.none)
                .connect(host: host, port: port)

            let client = Myservice_MyServiceNIOClient(channel: connection)
            do {
                let call = client.getData(Myservice_GetDataRequest())
                let response = try call.response.wait()
            } catch {
                logger.error("request failed: \(error)")
            }

Expected behaviour

Call should timeout after 1 second, but it's taking 10 seconds.

Additional information

This is sorta ugly, but it does resolve the issue for me locally:

diff --git a/Sources/GRPC/ConnectionManager.swift b/Sources/GRPC/ConnectionManager.swift
index 35ee4110..7299eea8 100644
--- a/Sources/GRPC/ConnectionManager.swift
+++ b/Sources/GRPC/ConnectionManager.swift
@@ -942,18 +942,21 @@ extension ConnectionManager {
   // states. Must be called on the `EventLoop`.
   private func startConnecting() {
     self.eventLoop.assertInEventLoop()
+    let timeout = TimeAmount.seconds(timeInterval: self.connectionBackoff?.minimumConnectionTimeout ?? 20)
     switch self.state {
     case .idle:
       let iterator = self.connectionBackoff?.makeIterator()
       self.startConnecting(
         backoffIterator: iterator,
-        muxPromise: self.eventLoop.makePromise()
+        muxPromise: self.eventLoop.makePromise(),
+        connectTimeout: timeout
       )
 
     case let .transientFailure(pending):
       self.startConnecting(
         backoffIterator: pending.backoffIterator,
-        muxPromise: pending.readyChannelMuxPromise
+        muxPromise: pending.readyChannelMuxPromise,
+        connectTimeout: timeout
       )
 
     // We shutdown before a scheduled connection attempt had started.
@@ -973,7 +976,8 @@ extension ConnectionManager {
 
   private func startConnecting(
     backoffIterator: ConnectionBackoffIterator?,
-    muxPromise: EventLoopPromise<HTTP2StreamMultiplexer>
+    muxPromise: EventLoopPromise<HTTP2StreamMultiplexer>,
+    connectTimeout: TimeAmount? = nil
   ) {
     let timeoutAndBackoff = backoffIterator?.next()

Thanks for taking the time to file this. You're absolutely right, this is a bug and I think your fix is correct (although we shouldn't default the minimum timeout to 20, we should just defer to the underlying bootstrap). Could you open a PR to fix this?

@glbrntt - all tests are green locally and PR is submitted. Can you approve the workflow so I can make sure the build and tests are all good in GHA?