HubSpot/dropwizard-guice

System.exit() is evil, pls no

Closed this issue · 13 comments

So I just discovered this using another dropwizard-guice library, but figured I should put it back here since it was caused by code based on this library.

So this line: https://github.com/HubSpot/dropwizard-guice/blob/master/src/main/java/com/hubspot/dropwizard/guice/GuiceBundle.java#L110

It's meant to terminate the server if initialization fails, which does make sense; but I discovered that, depending on what logging library is used, System.exit() terminates so forcefully, that it even kills the task that would print out that super-useful error message on the previous line. As such, my app started spitting out exit code 1's with no error message, no stacktrace; this was pretty terrible to debug. (Fun fact: If you put a breakpoint on the exit line, it actually gives the logger a chance to print, and you can see an error message again.)

In general, I've discovered that System.exit() is absolutely horrifying, in part because it's one of the few things that can stop threads mid-try-block, with no catch or finally - a dubious honor shared with (1) daemon threads, and (2) outright crash/segfault of the JVM.

Would strongly encourage usage of Dropwizard's Managed instance instead to handle initialization/takedown of thread pools and other threaded resources, which are usually the things keeping the JVM alive after the main thread kicks the bucket.

This is a tricky one. Because the exception could be thrown at any point during the injector creation, some objects may have been created, including ones that create non-daemon threads. But because the method threw an exception, we have no way to get a reference to the injector and clean these objects up and allow the JVM will shut down cleanly. Using a Managed won't help because the Managed objects are retrieved from the injector here which can't happen if injector creation throws an exception, so the managed will never fire (also, a lot of the code that needs to be cleaned up exists in our data access or service layer that doesn't know anything about Dropwizard or Managed).

This is a source of very hard to debug errors when running through Gradle:

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> Process 'Gradle Test Executor 2' finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Running with any of the options doesn't help and just by chance I figured out I was missing a binding.

Maybe it will help somebody when googling, that the abruptly ending build is because of a Guice error.

Maybe we could use System.err.println to make sure asynchronous logging doesn't interfere? Or do you have any ideas for how to ensure a graceful shutdown if injector creation fails?

Hm, I'd lean towards general Guice opinion on this one, since it seems like a potentially messy thing to go against the general patterns of Guice, but it seems like Guice module creation and bean instantiation is meant to be fast and stateless, when possible. Would definitely shy away from using e.g. System.err.println to work around the shutdown issue, since (a) it subverts normal logging mechanisms, (b) would still kill other potentially useful logs, and (c) could still forcefully kill other processing threads.

IMO, initialization of things like thread pools and other non-trivial/stateful startup work could be deferred to Managed.start(); I haven't yet looked into whether this will allow for proper ordered startup of services, but as long as Managed beans are injected in in order of creation, I think it'll be handled correctly. I use a module like this to handle it (with dropwizard-guicey), but a built-in mechanism would work as well:

public final class GuiceDropwizardSupportModule extends AbstractModule {
    @Override
    protected void configure() {
        Provider<Environment> environment = binder().getProvider(Environment.class);
        bindListener(new AbstractMatcher<TypeLiteral<?>>() {
            @Override
            public boolean matches(TypeLiteral<?> typeLiteral) {
                return Managed.class.isAssignableFrom(typeLiteral.getRawType());
            }
        }, new TypeListener() {
            @Override
            public <I> void hear(TypeLiteral<I> type, TypeEncounter<I> encounter) {
                encounter.register((InjectionListener<I>) injectee -> {
                    environment.get().lifecycle().manage((Managed) injectee);
                });
            }
        });
    }
}

I was wondering for about an hour, why my application silently exits... It looks like stderr logs not flushed immediately so I had to help them do so...

// Inside Application class.
static {
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        LoggingUtil.getLoggerContext().reset();
      }
    });
}

reset closes all log appenders and, as side-effect, flushes all underlying streams so, finally, I was able to see my error message

Why not throw an Error instead? Some background info.

The issue isn't that someone is catching the exception, so I don't think throwing an error would improve the situation. The exception propagates all the way up and kills the main thread, which tells the JVM to initiate shutdown. The issue is that injector creation failed, so we don't know what objects might have already been instantiated and we don't have any way to access the injector to find out or clean them up. In particular, there may be non-daemon threads which will prevent the JVM from ever shutting down; it will just hang forever. Here is an example of how this might happen: jhaber/dropwizard-guice-example@master...startup-failure
An executor service is bound in Guice as a singleton for use throughout the application, and it is shut down from a Managed. But because startup fails during injector creation, the Managed never fires and there's no reasonable way for the user to recover. If the System.exit was removed the app would just hang forever.

I think a reasonable compromise might be to sleep for a second to hopefully allow any async logging to get flushed, and then exit.

Thanks for that information. It should go into an in-line comment on that System.exit().

Instead of sleeping for a second, why not register @Vandalko 's shutdown hook right before exiting?

axiak commented

@PeterWippermann that shutdown hook is logger implementation dependent.

I believe Dropwizard still requires Logback so that may not be a concern.

We used to register a similar shutdown hook globally for our applications, but it sometimes caused problems during graceful shutdown because logging would stop as soon as shutdown is initiated. So if you have non-daemon threads that continue working, then anything they log won't show up. But in this case since we know we're going to hard exit I think it should be fine to terminate logging.

#110 is merged and was released as part of version 1.3.5