twitter/util

Future.sleep completes immediately for large values

adleong opened this issue · 18 comments

$ val f = Future.sleep(Duration.Top)
f: Future[Unit] = <function0>
$ f.poll
res13: Option[Try[Unit]] = Some(Return(()))

Expected behavior

Future.sleep(Duration.Top) should return a future that never completes. Similarly, calling Future.sleep with very large durations should return futures that take a very long time to complete.

Actual behavior

Future.sleep(Duration.Top) returns a future that completes immediately. Future.sleep(7720300000000L.millis) returns a future that completes immediately.

Steps to reproduce the behavior

See above.

Yeah, that sure looks like a bug.

What Timer are you passing into Future.sleep?

com.twitter.finagle.util.DefaultTimer

DefaultTimer is service loaded. Do you have finagle-netty4 on your classpath?

@adleong You can call DefaultTimer.toString in scala repl to figure out which one was loaded.

Ah right, "DefaultTimer(Netty4Timer)"

hmm i am unable to reproduce this. added this test case in finagle-netty4 and it passes.

  test("Future.sleep(Duration.Top)") {
    val timer = new Netty4Timer(new HashedWheelTimer())
    val f = Future.sleep(Duration.Top)(timer)
    assert(!f.isDefined)
    Thread.sleep(5000)
    assert(!f.isDefined)

    timer.stop()
  }

did a brief examination of the code and it looked ok to me.

Here is the full ammonite script I used to repro:

import $ivy.`com.twitter::finagle-http:7.0.0`

import com.twitter.util._
import com.twitter.finagle.util.DefaultTimer

val timer = DefaultTimer

val f = Future.sleep(Duration.Top)(timer)
Thread.sleep(5000)
println(timer)
println(f.poll)

and the output:

DefaultTimer(Netty4Timer)
Some(Return(()))

Interestingly, if I try to depend on finagle-core instead of finagle-http I get this output instead:

Aug 18, 2017 4:35:18 PM com.twitter.finagle.util.DefaultTimer$ <init>
WARNING: Can not service-load a timer. Using JavaTimer instead.
DefaultTimer(com.twitter.util.JavaTimer@9c0d0bd)
None

so it definitely seems dependent on the timer.

I think we should've never scheduled a timer task in case of Duration.Top in the first place (that would be a good fix).

As far as the explanation goes, it's as easy as long overflow that happens here in the N4 timer. Basically, unit.toNanos(delay) is Long.MaxValue. Java timer seems to not have this problem.

interesting. i see the issue — it can overflow a long inside of Netty's HashedWheelTimer.

https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/HashedWheelTimer.java#L421

and it i didn't reproduce because it depends on when the timer's first task was run (startTime).

@adleong Any chance you want to contribute a fix (like checking if howLong == Duration.Top and returning Future.never)?

I'm on vacation this week but I'll try to put together a PR next week when I'm back. Should we also guard against durations that would overflow long?

i think checking in Future.sleep is more like an optimization rather than a fix.

the fix feels to me like it belongs in Netty4Timer as it is specific to that implementation.

I agree with Kevin. Maybe make two separate commits:

  • Optimization for Future.sleep
  • Fixes for Netty4Timer

@adleong Let us know which one (or maybe both?) do you want to work on.

After thinking about this some more, I think it really needs to be fixed in HashedWheelTimer, not in Netty4Timer. The allowable range of times that can be scheduled without overflow depends on the timer's startTime which is internal to HashedWheelTimer.

I'm happy to put together the optimization for Future.Sleep though.

After thinking about this some more, I think it really needs to be fixed in HashedWheelTimer, not in Netty4Timer. The allowable range of times that can be scheduled without overflow depends on the timer's startTime which is internal to HashedWheelTimer.

i suspect we could do a good job of mitigating it, but that makes sense.

and if you file something with netty, do let us know.

Fixed in #201.