Receiving exception on reload: java.lang.IllegalStateException: Could not find a suitable classloader to modify from clojure.lang.LazySeq@aa452d23
Closed this issue · 8 comments
Recently upgraded to v0.3.3 and receive the following exception when trying to reload our config. We are using pomegranate/add-dependencies
in the config. Riemann starts fine, but pukes on the add-dependencies
call on reload. Have seen related issues #914 and #928, but don't see a clear solution. Attempted to use config/depends
instead of add-dependencies
directly which got rid of the errors on reload, but then I couldn't start Riemann. Appears config/depends
is meant for "plugins" and complained about not finding some metadata file, if I recall.
INFO [2019-07-19 17:42:30,104] SIGHUP handler - riemann.bin - Caught SIGHUP, reloading
INFO [2019-07-19 17:42:30,107] SIGHUP handler - riemann.pubsub - Sweeping transient subscriptions.
ERROR [2019-07-19 17:42:30,264] SIGHUP handler - riemann.bin - Couldn't reload:
clojure.lang.Compiler$CompilerException: java.lang.IllegalStateException: Could not find a suitable classloader to modify from clojure.lang.LazySeq@aa452d23, compiling:(/etc/riemann/riemann.config:37:1)
at clojure.lang.Compiler.load(Compiler.java:7526)
at clojure.lang.Compiler.loadFile(Compiler.java:7452)
at clojure.lang.RT$3.invoke(RT.java:325)
at riemann.config$include.invokeStatic(config.clj:450)
at riemann.config$include.invoke(config.clj:427)
at riemann.bin$reload_BANG_.invokeStatic(bin.clj:46)
at riemann.bin$reload_BANG_.invoke(bin.clj:36)
at riemann.bin$handle_signals$fn__14141.invoke(bin.clj:62)
at riemann.bin.proxy$java.lang.Object$SignalHandler$d8c00ec7.handle(Unknown Source)
at sun.misc.Signal$1.run(Signal.java:212)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Could not find a suitable classloader to modify from clojure.lang.LazySeq@aa452d23
at cemerick.pomegranate$add_classpath.invokeStatic(pomegranate.clj:54)
at cemerick.pomegranate$add_classpath.invoke(pomegranate.clj:42)
at clojure.lang.AFn.applyToHelper(AFn.java:154)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invokeStatic(core.clj:659)
at clojure.core$apply.invoke(core.clj:652)
at cemerick.pomegranate$add_dependencies.invokeStatic(pomegranate.clj:85)
at cemerick.pomegranate$add_dependencies.doInvoke(pomegranate.clj:57)
at clojure.lang.RestFn.invoke(RestFn.java:512)
at riemann.config$eval198.invokeStatic(riemann.config:37)
at riemann.config$eval198.invoke(riemann.config:37)
at clojure.lang.Compiler.eval(Compiler.java:7062)
at clojure.lang.Compiler.load(Compiler.java:7514)
... 10 common frames omitted
Can you share your config?
; -*- mode: clojure; -*-
; vim: filetype=clojure
; Log output to a file.
(logging/init {:file "/var/log/riemann/riemann.log"})
(logging/set-level "riemann.transport.tcp" ch.qos.logback.classic.Level/ERROR)
(logging/set-level "riemann.transport.udp" ch.qos.logback.classic.Level/ERROR)
;(logging/set-level "org.apache.http.wire" ch.qos.logback.classic.Level/TRACE)
; Accept metrics via UDP.
(udp-server :host "0.0.0.0"
:port 5555
:max-size 65535)
; Accept metrics via TCP.
(tcp-server :host "0.0.0.0"
:port 5555)
; Set up a TLS-protected port for secure metrics. The key can be generated with
; the following command:
; openssl pkcs8 -topk8 -nocrypt -in $tls_key -out $pkcs8_key
#_ ; TODO: Not enabled yet
(tcp-server :host "0.0.0.0"
:port 5554
:tls? true
:key "..."
:cert "..."
:ca-cert "...")
; Serve websocket for riemann-dash subscriptions.
(ws-server :host "0.0.0.0"
:port 5556)
; Sweep the index for expired events every 10 seconds.
(periodically-expire 10 {:keep-keys [:host :service :sys :stack :tags]})
(add-dependencies
:coordinates '[[amperity/vault-clj "0.7.0"]
[amperity/vault-clj-aws "0.0.1"]]
:repositories {"clojars" "https://clojars.org/repo"}
:exit-on-failure true)
(require
'[amperity.riemann.core :as amp]
:reload-all)
; Pass events into Amperity rules.
(let [record-index (default {:ttl 300} (index))]
(streams (amp/rules record-index)))
The working theory is around riemann.bin/ensure-dynamic-classloader
setting up the class loader on the current thread at startup, however, the call to riemann.bin/reload!
happens in a separate thread when handling the HUP signal. The assumption is that the dynamic classloader isn't available in that thread. Out of curiosity, I threw a call to riemann.bin/ensure-dynamic-classloader
into the config before the add-dependencies
call and Riemann starts with that config and also survives subsequent reloads under Java 8. Under Java 9 and 10, Riemann starts and survives the first reload, but throws a pile of weird exceptions on subsequent reloads:
INFO [2019-07-20 19:55:07,142] SIGHUP handler - riemann.bin - Caught SIGHUP, reloading
INFO [2019-07-20 19:55:07,160] SIGHUP handler - riemann.pubsub - Sweeping transient subscriptions.
INFO [2019-07-20 19:55:07,369] SIGHUP handler - riemann.core - Hyperspace core online
INFO [2019-07-20 19:55:09,517] SIGHUP handler - riemann.bin - Caught SIGHUP, reloading
INFO [2019-07-20 19:55:09,518] SIGHUP handler - riemann.pubsub - Sweeping transient subscriptions.
ERROR [2019-07-20 19:55:09,523] SIGHUP handler - riemann.bin - Couldn't reload:
clojure.lang.Compiler$CompilerException: java.lang.NoClassDefFoundError: clojure/lang/AFunction, compiling:(/app/etc/riemann.config:3:1)
at clojure.lang.Compiler.analyzeSeq(Compiler.java:7010)
at clojure.lang.Compiler.analyze(Compiler.java:6773)
at clojure.lang.Compiler.eval(Compiler.java:7059)
at clojure.lang.Compiler.load(Compiler.java:7514)
at clojure.lang.Compiler.loadFile(Compiler.java:7452)
at clojure.lang.RT$3.invoke(RT.java:325)
at riemann.config$include.invokeStatic(config.clj:450)
at riemann.config$include.invoke(config.clj:427)
at riemann.bin$reload_BANG_.invokeStatic(bin.clj:46)
at riemann.bin$reload_BANG_.invoke(bin.clj:36)
at riemann.bin$handle_signals$fn__14141.invoke(bin.clj:62)
at riemann.bin.proxy$java.lang.Object$SignalHandler$d8c00ec7.handle(Unknown Source)
at jdk.unsupported/sun.misc.Signal$InternalMiscHandler.handle(Signal.java:198)
at java.base/jdk.internal.misc.Signal$1.run(Signal.java:220)
at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: java.lang.NoClassDefFoundError: clojure/lang/AFunction
at java.base/java.lang.ClassLoader.defineClass1(Native Method)
at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1007)
at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:868)
at clojure.lang.DynamicClassLoader.defineClass(DynamicClassLoader.java:46)
at clojure.lang.Compiler$ObjExpr.getCompiledClass(Compiler.java:4979)
at clojure.lang.Compiler$FnExpr.parse(Compiler.java:4109)
at clojure.lang.Compiler.analyzeSeq(Compiler.java:7001)
... 14 common frames omitted
Caused by: java.lang.ClassNotFoundException: clojure.lang.AFunction
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:466)
at clojure.lang.DynamicClassLoader.findClass(DynamicClassLoader.java:69)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:563)
at clojure.lang.DynamicClassLoader.loadClass(DynamicClassLoader.java:77)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)
... 21 common frames omitted
Exception in thread "Thread-6" java.lang.ExceptionInInitializerError
at riemann.streams$default$stream__8817.invoke(streams.clj:1411)
at riemann.core$stream_BANG_$fn__9755.invoke(core.clj:20)
at riemann.core$stream_BANG_.invokeStatic(core.clj:19)
at riemann.core$stream_BANG_.invoke(core.clj:15)
at riemann.core$instrumentation_service$measure__9764.invoke(core.clj:59)
at riemann.service.ThreadService$thread_service_runner__6512$fn__6513.invoke(service.clj:101)
at riemann.service.ThreadService$thread_service_runner__6512.invoke(service.clj:100)
at clojure.lang.AFn.run(AFn.java:22)
at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: java.lang.ClassNotFoundException: clojure/lang/PersistentArrayMap
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:375)
at clojure.lang.RT.classForName(RT.java:2204)
at clojure.lang.RT.classForName(RT.java:2213)
at clojure.lang.Reflector.invokeStaticMethod(Reflector.java:199)
at clojure.lang.LispReader$EvalReader.invoke(LispReader.java:1308)
at clojure.lang.LispReader$DispatchReader.invoke(LispReader.java:843)
at clojure.lang.LispReader.read(LispReader.java:275)
at clojure.lang.LispReader.read(LispReader.java:210)
at clojure.lang.LispReader.access$200(LispReader.java:41)
at clojure.lang.LispReader$MetaReader.invoke(LispReader.java:944)
at clojure.lang.LispReader.read(LispReader.java:275)
at clojure.lang.LispReader.readDelimitedList(LispReader.java:1384)
at clojure.lang.LispReader$ListReader.invoke(LispReader.java:1232)
at clojure.lang.LispReader.read(LispReader.java:275)
at clojure.lang.LispReader.read(LispReader.java:210)
at clojure.lang.LispReader.access$200(LispReader.java:41)
at clojure.lang.LispReader$EvalReader.invoke(LispReader.java:1287)
at clojure.lang.LispReader$DispatchReader.invoke(LispReader.java:843)
at clojure.lang.LispReader.read(LispReader.java:275)
at clojure.lang.LispReader.read(LispReader.java:206)
at clojure.lang.LispReader.read(LispReader.java:195)
at clojure.lang.RT.readString(RT.java:1871)
at clojure.lang.RT.readString(RT.java:1866)
at riemann.streams$default$stream__8817$fn__8838.<clinit>(streams.clj:1411)
... 9 more
Let me know if you need any further information. The above was info was gleaned from a similar config running Riemann v0.3.3 in a Docker container using the openjdk:{8,9,10}
images from Docker hub.
Having been distracted by other work, I have come back to this. Some more details:
While I've mostly worked around the issues with reloads by making calls to riemann.bin/ensure-dynamic-classloader
in the config itself, the main issue I have now is that add-dependencies
doesn't seem to be modifying the classpath/classloader in a way that allows me to require namespaces that should be available after retrieving the declared dependencies. Will be digging into pomegranate and potentially dynapath to understand all that is going here.
Hmmm...inspecting the URLs associated with the class loader before and after a call to add-dependencies
does show the expected results.
So, I think I've largely tracked down where things are diverging. I believe this is likely something that only happens with the AOT compiled uberjar. Here's a rough rundown of the issue:
- JVM starts,
pomegranate/classloader-hierarchy
shows App and System/Platform class loaders. Passing(clojure.lang.RT/baseLoader)
or@clojure.lang.Compiler/LOADER
topomegranate/classloader-hierarchy
shows the same hierarchy. riemann.bin/ensure-dynamic-classloader
injects aclojure.lang.DynamicClassLoader
at the base of this hierarchy, however, the hierarchy remains unchanged when inspecting(clojure.lang.RT/baseLoader)
and@clojure.lang.Compiler/LOADER
.riemann.bin/set-config-file!
callspomegranate/add-classpath
to inject the base dir of the given config file into the classpath, modifying the class loader created byriemann.bin/ensure-dynamic-classloader
.riemann.conf/apply-config!
loads the Riemann config.
a.clojure.core/load
is called to evaluate the config. The class hierarchies seen by inspecting(clojure.lang.RT/baseLoader)
or@clojure.lang.Compiler/LOADER
now include 2 layers ofclojure.lang.DynamicClassLoader
objects. Subsequent calls to inspect those hierarchies returns a differentclojure.lang.DynamicClassloader
at the base of the hierarchy, but a consistent parent to that loader. Neither of theseclojure.lang.DynamicClassLoader
instances are the same as the one created byriemann.bin/ensure-dynamic-classloader
.
b. In our config, we eventually callpomegranate/add-dependencies
which succeeds and modifies the class loader configured byriemann.bin/ensure-dynamic-classloader
because it is attached to the current thread and that is Pomegranate's default behavior.
c. Calls to require namespaces that live in the config directory fail and calls to require namespaces fetched withpomegranate/add-dependencies
fail because neither of those paths are on the class path that the clojure runtime/compiler see.
Phew...I hope that makes sense. I have confirmed this behavior under OpenJDK 8, 9 and 10, but I'm still a little hazy about why everything described in 4a happens. Will try and put together a patch in the next day or two that hopefully resolves all of this for good.
Addressed in 0.3.4