This repository contains code demonstrating two different approaches to executing tasks asynchronously when some of the tasks depend on the completion of others. The motivating example is making a pizza using the following tasks:
- Combining the ingredients of the dough.
- Letting the dough rise.
- Combining the ingredients of the sauce.
- Grating the cheese.
- Rolling out the risen dough into a crust.
- Putting the sauce on top of the crust.
- Putting the grated cheese on top of the sauce.
Some of these tasks depend on others. For example, you can't put the sauce on top of the crust (6) until both the ingredients for the sauce are combined (3) and the risen dough is rolled out into a crust (5). The complete DAG of dependencies looks like this:
1 --> 2 --> 5 --\
                 +--> 6 --\
3 ---------------/         +--> 7
4 -------------------------/
Multiple cooks can speed up the process by performing tasks independently, as long as no task is started until all the tasks it depends on are done.
The FuturePizzaBuilder approach uses CompletableFutures to arrange for the tasks to be performed in an order consistent with this DAG.
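The shape of that arrangement can be sketched as follows. This is a minimal, self-contained illustration with hypothetical method names (the real FuturePizzaBuilder's names and string formatting differ); the point is how `thenApply` chains tasks 1 → 2 → 5 and `thenCombine` joins the branches for tasks 6 and 7:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FutureSketch {
    // Hypothetical task methods standing in for the real (sleeping) tasks.
    static String combineDough() { return "dough"; }
    static String letRise(String dough) { return "risen " + dough; }
    static String rollOut(String risen) { return "crust from " + risen; }
    static String combineSauce() { return "sauce"; }
    static String grateCheese() { return "grated cheese"; }

    static String build() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            CompletableFuture<String> crust = CompletableFuture
                    .supplyAsync(FutureSketch::combineDough, pool) // task 1
                    .thenApply(FutureSketch::letRise)              // task 2
                    .thenApply(FutureSketch::rollOut);             // task 5
            CompletableFuture<String> sauce =
                    CompletableFuture.supplyAsync(FutureSketch::combineSauce, pool); // task 3
            CompletableFuture<String> cheese =
                    CompletableFuture.supplyAsync(FutureSketch::grateCheese, pool);  // task 4
            // Task 6 runs only when both sauce and crust are done;
            // task 7 runs only when task 6 and the cheese are done.
            return sauce.thenCombine(crust, (s, c) -> s + " on " + c)        // task 6
                        .thenCombine(cheese, (sc, ch) -> ch + " on " + sc)   // task 7
                        .join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```

No task here can start before its inputs exist, because each stage's input is the completed result of the prior future.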
The LatchPizzaBuilder approach uses CountDownLatches to prevent tasks from proceeding until the tasks they depend on have completed. Unlike the future-based approach, this class uses fields to hold intermediate state. Because these fields are written before and read after calls to the corresponding high-level synchronizer methods (write -> countDown and await -> read), there is no need for further synchronization, nor do the fields need to be volatile.
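The write-then-countDown / await-then-read pattern can be sketched like this. The field and latch names here are hypothetical, not the real LatchPizzaBuilder's; what matters is that countDown happens-before the matching await returns, which makes the plain field reads safe:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LatchSketch {
    // Plain (non-volatile) fields holding intermediate state.
    private String sauce;
    private String crust;
    private final CountDownLatch sauceDone = new CountDownLatch(1);
    private final CountDownLatch crustDone = new CountDownLatch(1);

    String build() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> {
            sauce = "sauce";        // write the intermediate state first...
            sauceDone.countDown();  // ...then release anyone awaiting it
        });
        pool.execute(() -> {
            crust = "crust";
            crustDone.countDown();
        });
        // countDown happens-before await returns, so the reads below see
        // the writes without volatile or any further synchronization.
        sauceDone.await();
        crustDone.await();
        String result = sauce + " on " + crust;
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(new LatchSketch().build());
    }
}
```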
This implementation reflects a more restrictive dependency graph than the one shown above:
It waits for 3, 4, and 5 to be ready before performing 6 and 7 together.
1 --> 2 --> 5 --\
3 --------------+--> 6, 7
4 --------------/
Both approaches use a fixed-size thread pool to run tasks asynchronously, and both simulate real work by sleeping for a given amount of time.
The PizzaDemo class runs both versions. Both produce identical output, except that the output lines might not appear in the same order, because thread scheduling can be affected by external factors such as system load. Here's a sample run:
Started combining flour, water, yeast
Started combining tomato, oil, garlic, oregano
Started grating cheese
Finished combining flour, water, yeast
Started letting rise {flour, water, yeast}
Finished grating cheese
Finished letting rise {flour, water, yeast}
Started rolling out risen {flour, water, yeast}
Finished combining tomato, oil, garlic, oregano
Finished rolling out risen {flour, water, yeast}
Ready to bake grated cheese on {tomato, oil, garlic, oregano} on rolled-out risen {flour, water, yeast}
It's easy to verify that this is consistent with dependency order.
The CompletableFuture-based version has several advantages over the latch-based one:
- It's shorter and easier to understand.
- It doesn't need to keep intermediate state in fields.
- It can be adapted to introduce new or different dependencies very easily; the latch-based code would require more intricate reasoning to do the same.
- In real usage, decisions about whether to perform a dependent task in the same thread as the task it depends on (or with a different Executor entirely) can be changed without having to redesign the code. This would be very difficult with latch-based approaches.
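That last point is worth illustrating. In the sketch below (hypothetical pool names, not from this repository), switching a stage from running "inline" to running on a dedicated executor is a one-method change from `thenApply` to `thenApplyAsync` with an extra argument; the dependency structure is untouched:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutorChoiceSketch {
    static String build() {
        // Hypothetical pools for illustration.
        ExecutorService cpuPool = Executors.newFixedThreadPool(2);
        ExecutorService ioPool  = Executors.newCachedThreadPool();
        try {
            return CompletableFuture
                    .supplyAsync(() -> "dough", cpuPool)
                    // thenApply: may run in whichever thread completed the
                    // previous stage, with no extra handoff.
                    .thenApply(d -> "risen " + d)
                    // thenApplyAsync with an Executor: same dependency,
                    // but the stage now runs on a different pool.
                    .thenApplyAsync(d -> "rolled-out " + d, ioPool)
                    .join();
        } finally {
            cpuPool.shutdown();
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

With latches, moving a stage to a different pool would mean rethreading the fields, latch counts, and submission sites by hand.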
Other approaches are possible, including using plain Futures returned from ExecutorService.submit.
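A plain-Future version might look like the sketch below (again with hypothetical task bodies). Here dependency order is enforced by where the blocking `Future.get` calls are placed, rather than declared through composition:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PlainFutureSketch {
    static String build() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> sauce  = pool.submit(() -> "sauce");
            Future<String> cheese = pool.submit(() -> "grated cheese");
            // get() blocks the calling thread until each result is ready;
            // the dependent step simply doesn't run until both gets return.
            return cheese.get() + " on " + sauce.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```

The cost is that some thread must sit blocked in `get`, which is exactly what the CompletableFuture version avoids.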
The decision to use a fixed-size thread pool and the choice of that fixed size have important consequences: If the tasks are not started in dependency order and the pool size is less than the number of tasks needed to make progress, the program can deadlock. For this toy example, using a cached thread pool, which can add threads when needed, would avoid the risk of deadlock, but in production settings this would risk system resource exhaustion, which could be much worse than deadlock.
ForkJoinTasks can be used to reduce the opportunities for deadlock, by forking and joining dependent tasks. When such tasks are used inside ForkJoinPools, work-stealing is used to make progress even when all active threads in the pool are running tasks that are waiting for their dependencies.
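A minimal fork/join sketch of that idea (hypothetical subtasks, not code from this repository): one branch is forked to run in the pool while the current thread computes the other branch directly, and `join` lets the worker help with queued work instead of blocking idly.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSketch {
    static String build() {
        return new ForkJoinPool(2).invoke(new RecursiveTask<String>() {
            @Override protected String compute() {
                RecursiveTask<String> sauce = new RecursiveTask<String>() {
                    @Override protected String compute() { return "sauce"; }
                };
                RecursiveTask<String> crust = new RecursiveTask<String>() {
                    @Override protected String compute() { return "crust"; }
                };
                sauce.fork();               // schedule one branch in the pool
                String c = crust.compute(); // compute the other branch here
                // join() can steal/execute other pool tasks while waiting,
                // which is why fork/join reduces the chance of deadlock.
                return sauce.join() + " on " + c;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```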