`CompletableFuture.get` could swallow `InterruptedException` if waiting future completes immediately after `Thread.interrupt`

While investigating FLINK-19489 (SplitFetcherTest.testNotifiesWhenGoingIdleConcurrent gets stuck) with commit tree, I found that CompletableFuture.get can swallow an InterruptedException if the future being waited on completes immediately after Thread.interrupt.

OpenJDK bug tracking

I have reported this to the OpenJDK team, and they have accepted it as JDK-8254350.

The following sections use code from openjdk11. After skimming the openjdk repository, I think this problem may apply to openjdk9 and all later versions.

`CompletableFuture.waitingGet` should keep interrupt status if it returns a non-null value

Let's start with CompletableFuture.get and CompletableFuture.reportGet to derive the constraints that CompletableFuture.waitingGet must follow.

public T get() throws InterruptedException, ExecutionException {
    Object r;
    if ((r = result) == null)
        r = waitingGet(true);
    return (T) reportGet(r);
}

/**
 * Reports result using Future.get conventions.
 */
private static Object reportGet(Object r)
    throws InterruptedException, ExecutionException {
    if (r == null) // by convention below, null means interrupted
        throw new InterruptedException();
    if (r instanceof AltResult) {
        Throwable x, cause;
        if ((x = ((AltResult)r).ex) == null)
            return null;
        if (x instanceof CancellationException)
            throw (CancellationException)x;
        if ((x instanceof CompletionException) &&
            (cause = x.getCause()) != null)
            x = cause;
        throw new ExecutionException(x);
    }
    return r;
}
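The reporting conventions in reportGet can be observed through the public API. The following is a minimal sketch; the class GetConventions and the method describe are hypothetical names introduced only for illustration.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

// Hypothetical helper, not part of the JDK or of this repository.
final class GetConventions {
    /** Calls get() and describes which branch of the Future.get conventions was taken. */
    static String describe(CompletableFuture<String> future) {
        try {
            return "value:" + future.get();
        } catch (CancellationException e) {
            // Cancellation is rethrown as-is (unchecked).
            return "cancelled";
        } catch (ExecutionException e) {
            // Failures are wrapped; a CompletionException with a cause is unwrapped first.
            return "failed:" + e.getCause().getClass().getSimpleName();
        } catch (InterruptedException e) {
            // Thrown only when waitingGet returned null.
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }
}
```

A future completed with null reports a null value rather than an interruption, because a null completion is stored internally as an AltResult and not as a raw null.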

The single parameter of reportGet comes from waitingGet, and reportGet throws InterruptedException if and only if that parameter is null. This means that if the waiting thread is interrupted and the future completes at the same time inside CompletableFuture.waitingGet, waitingGet has only two choices:

1. Return null and clear the interrupt status; otherwise the caller gets a double interruption.
2. Return a non-null value and keep the interrupt status; otherwise the interruption is lost and a later interruptible method may hang.
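The two choices can be checked with a small model that counts how many times the caller observes a single interruption. This is only an illustrative sketch; InterruptContract and observedInterruptions are hypothetical names, and the two booleans stand for the state in which waitingGet returns.

```java
// Hypothetical model of the contract between waitingGet and its caller.
final class InterruptContract {
    /**
     * Counts how often the caller of get() observes one interruption.
     *
     * @param returnsNull waitingGet returned null, so reportGet throws InterruptedException
     * @param flagKept    the thread's interrupt status is still set when get() finishes
     */
    static int observedInterruptions(boolean returnsNull, boolean flagKept) {
        int observed = 0;
        if (returnsNull) {
            observed++; // delivered as an InterruptedException
        }
        if (flagKept) {
            observed++; // delivered again through the interrupt status
        }
        return observed;
    }
}
```

Exactly one observation is the desired outcome: (null, flag cleared) and (non-null, flag kept) both yield 1, (null, flag kept) yields 2, which is the double interruption, and (non-null, flag cleared) yields 0, which is the lost interruption discussed in this document.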

Let's see whether waitingGet conforms to rule 2 when it returns a non-null value.

/**
 * Returns raw result after waiting, or null if interruptible and
 * interrupted.
 */
private Object waitingGet(boolean interruptible) {
    Signaller q = null;
    boolean queued = false;
    Object r;
    while ((r = result) == null) {
        if (q == null) {
            q = new Signaller(interruptible, 0L, 0L);
            if (Thread.currentThread() instanceof ForkJoinWorkerThread)
                ForkJoinPool.helpAsyncBlocker(defaultExecutor(), q);
        }
        else if (!queued)
            queued = tryPushStack(q);
        else {
            try {
                ForkJoinPool.managedBlock(q);
            } catch (InterruptedException ie) { // currently cannot happen
                q.interrupted = true;
            }
            if (q.interrupted && interruptible)
                // tag(interrupted): `ForkJoinPool.managedBlock` returned because of an
                // interrupt; the interrupt status has been cleared.
                break;
        }
    }
    if (q != null && queued) {
        q.thread = null;
        // tag(self-interrupt): this applies only to `CompletableFuture.join`.
        if (!interruptible && q.interrupted)
            Thread.currentThread().interrupt();
        if (r == null)
            cleanStack();
    }
    if (r != null || (r = result) != null)      // tag(assignment)
        postComplete();
    return r;
}

I added three markers in the comments: tag(interrupted), tag(self-interrupt) and tag(assignment). Here are the execution steps and assumptions:

1. Assume that an interruption occurs before tag(interrupted) and before the future completes. Then tag(interrupted) breaks out of the while loop in the CompletableFuture.get path with the interrupt status cleared.
2. The if block at tag(self-interrupt) applies only to CompletableFuture.join, which is a non-interruptible blocking method; it restores the interrupt status if an interruption occurred during the blocking wait. It is skipped in the CompletableFuture.get path.
3. (r = result) != null at tag(assignment) assigns result to the return value and checks it. What if the future is completed by another thread before tag(assignment)? The result field will hold a non-null value, so waitingGet returns a non-null value and loses the interrupt status recorded in q.interrupted.
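To make the missing step concrete, here is a minimal sketch of a blocking get that follows both rules. It is a hypothetical single-waiter ToyFuture, not the JDK implementation and not necessarily the fix adopted for JDK-8254350. The raceHook field is an assumption of this sketch: it exists only so that the window before the final read can be triggered deterministically.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Hypothetical single-waiter future; values must be non-null.
final class ToyFuture<T> {
    private final AtomicReference<T> result = new AtomicReference<>();
    private volatile Thread waiter;
    // Illustration-only hook that runs right before the final read of result.
    Runnable raceHook = () -> { };

    void complete(T value) {
        if (result.compareAndSet(null, value)) {
            Thread t = waiter;
            if (t != null) {
                LockSupport.unpark(t);
            }
        }
    }

    T get() throws InterruptedException {
        boolean interrupted = false;
        waiter = Thread.currentThread();
        T r;
        while ((r = result.get()) == null) {
            LockSupport.park(this);
            if (Thread.interrupted()) { // clears the interrupt status
                interrupted = true;
                break;
            }
        }
        waiter = null;
        raceHook.run();
        if (r == null) {
            r = result.get(); // final read, comparable to tag(assignment)
        }
        if (r == null) {
            throw new InterruptedException(); // rule 1: status already cleared
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // rule 2: keep the status
        }
        return r;
    }
}
```

Setting raceHook to a lambda that calls complete, then interrupting the current thread before calling get, exercises the window described in step 3: get returns the value and the interrupt status is still set afterwards.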

Demonstration code

I have added a ready-to-run Maven project with a single file, CompletableFutureGet.java, to demonstrate the problem. You can use the following steps to verify whether the problem exists in a particular Java version.

1. Configure your Java environment to Java 1.8 or Java 9 to 15.
2. Run mvn clean package in the project directory.
3. Run java -jar target/openjdk-completablefuture-interruptedexception-0.1.0-SNAPSHOT.jar in the project directory. The built executable can be customized with the following environment variables:
- FUTURE_WAIT_METHOD switches among CompletableFuture.get() (value get), CompletableFuture.get(long timeout, TimeUnit unit) (value timed_get) and CompletableFuture.join() (value join). It defaults to get.
- MAX_RUNS limits the maximum loop count. It defaults to 1000.

In openjdk 1.8, the last step exhausts the loop count with no error. In openjdk 9 or above, when there is no Thread.sleep between futureGetThread.interrupt() and future.complete(null), it will probably print the error log future.get completes, Thread.isInterrupted returns false and exit with error code 1.
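The core of such a check might look like the following sketch. It is not the repository's CompletableFutureGet.java; the class LostInterruptProbe and the method countLostInterrupts are hypothetical, and only the plain get path is covered. Since the problem is a race, the count can legitimately be zero even on an affected JDK.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical probe, written for illustration only.
final class LostInterruptProbe {
    /** Runs the interrupt/complete race maxRuns times and counts lost interruptions. */
    static int countLostInterrupts(int maxRuns) {
        int lost = 0;
        for (int i = 0; i < maxRuns; i++) {
            CompletableFuture<Void> future = new CompletableFuture<>();
            CountDownLatch started = new CountDownLatch(1);
            AtomicBoolean interruptionSeen = new AtomicBoolean();
            Thread futureGetThread = new Thread(() -> {
                started.countDown();
                try {
                    future.get();
                    // get returned normally: the interrupt status must still be set.
                    interruptionSeen.set(Thread.currentThread().isInterrupted());
                } catch (InterruptedException e) {
                    interruptionSeen.set(true);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            });
            futureGetThread.start();
            try {
                started.await();
                futureGetThread.interrupt();
                future.complete(null); // no sleep between interrupt and complete
                futureGetThread.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException("probe itself was interrupted", e);
            }
            if (!interruptionSeen.get()) {
                lost++;
            }
        }
        return lost;
    }
}
```

A call such as LostInterruptProbe.countLostInterrupts(1000) mirrors the default MAX_RUNS; any result above zero means that get returned normally while the interruption disappeared.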

Online REPL

See openjdk8-completablefuture-interruptedexception for openjdk8.
See openjdk11-completablefuture-interruptedexception for openjdk11.

kezhuw/openjdk-completablefuture-interruptedexception
