riemann/riemann

inject! hangs up with current timestamps

0xRand0m opened this issue · 3 comments

Hello fellow riemann enthusiasts,

while running tests with current event timestamps I encountered the problem that my test events were not injected into the streams. The following snippet showcases the issue. When run as is the test does not terminate. If you remove the last event (its timestamp has one more digit than the first two) it terminates within seconds.

(tests
  (deftest timestamp-test
    (prn (inject! [{:host "test.example.com" :service "test" :metric 1337 :time 109889374}
                   {:host "test.example.com" :service "test" :metric 1337 :time 109889375}
                   {:host "test.example.com" :service "test" :metric 1337 :time 1109889374}]))))

I took a peek into the source code but could not find any reasons for this behavior.
The obvious workaround is to use smaller timestamps but it would be nice to be able to use real life examples as test cases.

Thanks in advance for any help!

Hello,

I'm not able to reproduce your issue:

(logging/init {:file "/var/log/riemann/riemann.log"})

(let [host "127.0.0.1"]
  (tcp-server {:host host})
  (udp-server {:host host})
  (repl-server {:host host})
  (ws-server  {:host host}))

(streams
  (tap :foo))

(tests (deftest timestamp-test
         (let [result
               (inject! [{:host "test.example.com" :service "test" :metric 1337 :time 109889374}
                         {:host "test.example.com" :service "test" :metric 1337 :time 109889375}
                         {:host "test.example.com" :service "test" :metric 1337 :time 1109889374}])]
           (= (:foo result)
              [{:host "test.example.com" :service "test" :metric 1337 :time 109889374}
               {:host "test.example.com" :service "test" :metric 1337 :time 109889375}
               {:host "test.example.com" :service "test" :metric 1337 :time 1109889374}]))))
riemann test config.clj 
INFO [2019-02-11 19:30:55,013] main - riemann.bin - Loading /home/mathieu/Documents/config.clj
INFO [2019-02-11 19:30:55,073] main - riemann.repl - REPL server {:port 5557, :host 127.0.0.1} online

Testing riemann.config-test

Ran 1 tests containing 0 assertions.
0 failures, 0 errors.

What are you testing ?

Hi,

my apologies I should have added the whole config - but I managed to single out a possible culprit.
Your example ran fine for me too - but then I added (periodically-expire 5) before streams and this time I got a NullPointerException that does not occur when I remove the last test event.
I took a look at the reaper code but my understanding of the riemann internals is unfortunately limited.

(logging/init {:file "/var/log/riemann/riemann.log"})

(let [host "127.0.0.1"]
  (tcp-server {:host host})
  (udp-server {:host host})
  (repl-server {:host host})
  (ws-server  {:host host}))

(periodically-expire 5)
(let [index (index)]
(streams
  index
  (tap :foo)))

(tests (deftest timestamp-test
         (let [result
               (inject! [{:host "test.example.com" :service "test" :metric 1337 :time 109889374}
                         {:host "test.example.com" :service "test" :metric 1337 :time 109889375}
                         {:host "test.example.com" :service "test" :metric 1337 :time 1109889374}])]
           (= (:foo result)
              [{:host "test.example.com" :service "test" :metric 1337 :time 109889374}
               {:host "test.example.com" :service "test" :metric 1337 :time 109889375}
               {:host "test.example.com" :service "test" :metric 1337 :time 1109889374}]))))
WARN [2019-02-12 12:17:41,688] Thread-7 - riemann.core - instrumentation service caught
java.lang.NullPointerException: null
	at clojure.core$swap_BANG_.invokeStatic(core.clj:2345)
	at clojure.core$swap_BANG_.invoke(core.clj:2337)
	at riemann.test$tap_stream$stream__9675.invoke(test.clj:41)
	at riemann.core$stream_BANG_$fn__9747.invoke(core.clj:20)
	at riemann.core$stream_BANG_.invokeStatic(core.clj:19)
	at riemann.core$stream_BANG_.invoke(core.clj:15)
	at riemann.core$instrumentation_service$measure__9756.invoke(core.clj:59)
	at riemann.service.ThreadService$thread_service_runner__6513$fn__6514.invoke(service.clj:101)
	at riemann.service.ThreadService$thread_service_runner__6513.invoke(service.clj:100)
	at clojure.lang.AFn.run(AFn.java:22)
	at java.lang.Thread.run(Thread.java:748)
sanel commented

@bickerle few tips:

  1. use (is (= (:foo result) ...
  2. add (instrumentation {:enabled? false}) before declaring index

@mcorbin this is a weird bug that, when instrumentation is explicitly enabled or disabled, it will not happen. Unless instrumentation call is made with (instrumentation), instrumentation service will kick in from different thread automatically, while tests are running, causing riemann.test/*results* to be set to nil for instrumentation thread (*results* is per-thread variable).

I'm not able to track what is difference between explicit (instrumentation {}) call and implicit one, started from riemann by default. The only thing I noticed is that, when (instrumentation {}) is called even with :enabled? true, instrumentation thread will not be started during tests.

For now, here is pretty dumb patch that take care of *results* if multiple threads are involved.

diff --git a/src/riemann/test.clj b/src/riemann/test.clj
index 09f224d..b112945 100644
--- a/src/riemann/test.clj
+++ b/src/riemann/test.clj
@@ -36,9 +36,9 @@
   [name child]
   (fn stream [event]
     ; Record event
-    (-> *results*
-        (get name)
-        (swap! conj event))
+    (some-> *results*
+            (get name)
+            (swap! conj event))
 
     ; Forward downstream
     (child event)))