mjiderhamn/classloader-leak-prevention

Postgresql shared timer thread being stopped

Closed this issue · 15 comments

I am using classloader-leak-prevention-1.15.4.jar with Postgresql JDBC (postgresql-9.4-1201-jdbc41.jar) in the tomcat ${catalilna.base}/lib folder (and not in the webapp) but the listener is stopping a thread that probably shouldn't be tied to the webapp classloader (but apparently it is).

Stopping Timer thread 'PostgreSQL-JDBC-SharedTimer-1' running in classloader.

When the new version of the app tries to deploy and use Postgresql jdbc driver it gets an error like this:

java.lang.IllegalStateException: Timer already cancelled.
        at java.util.Timer.sched(Timer.java:397) ~[na:1.8.0_102]
        at java.util.Timer.schedule(Timer.java:193) ~[na:1.8.0_102]
        at org.postgresql.jdbc2.AbstractJdbc2Connection.addTimerTask(AbstractJdbc2Connection.java:1319) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.startTimer(AbstractJdbc2Statement.java:3494) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:569) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:420) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:305) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:688) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:629) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:680) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:722) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:777) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]

I haven't looked into this but I guessing that whichever app uses Postgresql first will get that thread tied to its webapp classloader and postgresql will stop working if that app is ever undeployed.

They said they fixed a classloader memory leak in 9.4.1200 but I am still seeing it in 1201.
Fix Timer thread classloader leak - 197

I tried using the latest jdbc driver a couple weeks ago and ran into an issue so I backed off, not sure why I ended up on 1201.

This is really a driver bug but its a case where leaving the leak in place is probably better since it kills the driver.

Thanks for another report @hdeadman. I belive I have created a workaround for this. Would you mind giving it a try before I release a new version? I have no intent of backporting this to 1.x however, so this assumes you'd either upgrade to 2.0 or backport my solution yourself.

Trying to make sense of the change they made in the Postegresql driver and why it's still a problem. @hdeadman , are you reusing the same connection pool between redeploys, or in multiple applications within the server?

I will look into upgrading to 2.0 tomorrow to try it out. There are some apps that run in same Tomcat that are sharing data sources defined in Tomcat. I do have some apps that don't share connection pools across applications but I can't recall which servers I have seen this problem on, I will have to check when I am in the office tomorrow. I assume the connection pool would be re-used across re-deploys of the same application because it is looked up via JNDI, not defined within the application.

I have confirmed that it can happen when re-deploying an application that is deployed by itself on a tomcat cluster/server. Sharing a connection pool across applications is not necessary to recreate. I am deploying using parallel versioned deployments but in this case I am just re-deploying the same version over top of the old version (copying appname##2.0.0.war into farm deploy folder and letting tomcat farm deployer deploy the app (which first results in undeploy of the app).

Now I will try 2.0.

I saw this error on undeploy with the latest 2.0 snapshot but it looks like I can redeploy multiple times without the postgresql timer being shut down.

java.lang.NullPointerException: null
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_102]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_102]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_102]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_102]
        at se.jiderhamn.classloader.leak.prevention.cleanup.X509TrustManagerImplUnparseableExtensionCleanUp.cleanUp(X509TrustManagerImplUnparseableExtensionCleanUp.java:38) ~[classloader-leak-prevention-core-2.0.3-20161011.213922-1.jar:na]
        at se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor.runCleanUps(ClassLoaderLeakPreventor.java:226) [classloader-leak-prevention-core-2.0.3-20161011.213922-1.jar:na]
        at se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventorListener.contextDestroyed(ClassLoaderLeakPreventorListener.java:242) [classloader-leak-prevention-servlet-2.0.3-20161011.213923-1.jar:na]

Thanks. The NPE was related to #58, and should be fixed now also on master.

In the 1.x version of this listener I could extend the listener and override the logging methods. I am currently using SLF4J/Logback and I was going to provide an implementation of Logger that used that but I didn't see an easy way to extend the listener and pass in my instance of the Logger so it could be set on ClassLoaderLeakPreventorFactory. Should "void contextInitialized(final ServletContext servletContext)" be overloaded with a version that takes a Logger?

It's not a big deal for me because I am using the org.slf4/jul-to-slf4j bridge so the default JULLogger happens to work for me but it isn't clear to me how one is supposed to specify their own logger (or use StdLogger).

Another option would be to have more implementations of Logger and have a context parameter for choosing between them. If the context parameter specified the Logger impl class name then you wouldn't have to bundle all the impl classes.

I might have spoke too soon regarding the postgresql issue being resolved. I am still seeing the timer get cancelled. I will add some debug to make sure the PostgresqlJdbcThreadInitiator is running.

Initial Deploy on startup (with some debug added):

2016-10-12 14:34:52 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Initializing by loading some known offenders with leak safe classloader
2016-10-12 14:34:52 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] About to look for postgresql shared timer
2016-10-12 14:34:52 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] About invoke postgresql getSharedTimer()

Undeploy as part of re-deploy, still stopping Timer thread.

2016-10-12 14:39:04 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] private java.util.Map javax.faces.component.UIComponentBase.descriptors is not static
2016-10-12 14:39:04 ERROR [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Internal registry of java.beans.PropertyEditorManager not found
2016-10-12 14:39:04 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Looping 4 RMI Targets to find leaks
2016-10-12 14:39:04 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Looping 4 RMI Targets to find leaks
2016-10-12 14:39:04 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-0' running in classloader.
2016-10-12 14:39:04 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-1' running in classloader.
2016-10-12 14:39:04 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-2' running in classloader.
2016-10-12 14:39:04 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-3' running in classloader.
2016-10-12 14:39:04 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'PostgreSQL-JDBC-SharedTimer-1' running in classloader.
2016-10-12 14:39:04 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Custom ThreadLocal of type javax.faces.context.FacesContext$1: javax.faces.context.FacesContext$1@44eae0f with value null will be remove()d from Thread[ContainerBackgroundProcessor[StandardEngine[Catalina]],5,main]
2016-10-12 14:39:05 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Since Java 1.6+ is used, we can call public static final void java.util.ResourceBundle.clearCache(java.lang.ClassLoader)
2016-10-12 14:39:05 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Releasing web app classloader from Apache Commons Logging

Still getting Timer cancelled error.

java.lang.IllegalStateException: Timer already cancelled.
        at java.util.Timer.sched(Timer.java:397) ~[na:1.8.0_102]
        at java.util.Timer.schedule(Timer.java:193) ~[na:1.8.0_102]
        at org.postgresql.jdbc2.AbstractJdbc2Connection.addTimerTask(AbstractJdbc2Connection.java:1319) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.startTimer(AbstractJdbc2Statement.java:3494) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:569) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:420) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:305) ~[postgresql-9.4-1201-jdbc41.jar:9.4]
        at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:688) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:629) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:680) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:722) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:777) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]

Turned out the JUnit tests passed due to interference from the MySQL JDBC driver used in other tests...

Thinking about this some more I have decided to switch approach. Even if the timer thread is force started in the correct ClassLoader, it may be stopped and restarted during the lifespan of the application. To avoid that we'd need to circumvent the mechanisms created by the driver authors, and I don't dare do that without understanding their motives.

I have some ideas how to fix the driver itself too, so hopefully that will result in a pull request.

@hdeadman , would you mind trying out the new approach?

That seems to work better. I can't say there are no leaks but at least the driver still works after a re-deploy. If your pull request to the postgresql driver people makes it into a release I will try it out, but last time I tried the latest driver it was blowing up when Spring would start a transaction and it would try to process a result set from the "set transaction isolation ... ". I captured some debug from a working driver and was going to compare to debug from the latest driver but the debug jdbc parameter was not working in the new driver.

Here is debug from test with latest jar:

2016-10-12 22:00:32 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Looping 4 RMI Targets to find leaks
2016-10-12 22:00:32 INFO  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Looping 4 RMI Targets to find leaks
2016-10-12 22:00:32 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-0' running in classloader.
2016-10-12 22:00:32 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-1' running in classloader.
2016-10-12 22:00:32 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-2' running in classloader.
2016-10-12 22:00:32 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Timer thread 'Timer-3' running in classloader.
2016-10-12 22:00:32 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Stopping Thread 'Thread[Quartz Scheduler [quartzScheduler],5,main]' of type org.springframework.scheduling.quartz.SchedulerFactoryBean$1 running in web app after 5000 ms
2016-10-12 22:00:32 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] Changing contextClassLoader of Thread[PostgreSQL-JDBC-SharedTimer-1,5,main] to java.net.URLClassLoader@736e9adb
2016-10-12 22:00:32 WARN  [] [] [] [] [se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor] AccessControlContext of thread Thread[PostgreSQL-JDBC-SharedTimer-1,5,main] used custom combiner - unsetting

Here is my Pull Request on the Postgresql JDBC driver.

Just released version 2.1.0 which includes the changes made to resolve this.