Excessive logging for invalid SDK key
seangatewood opened this issue ยท 7 comments
Is this a support request?
No
Describe the bug
When an LDClient has a bad SDK key (because it expired or was bad from the beginning), it repeatedly logs about it and triggers any DataSourceStatusProvider.StatusListener
s
To reproduce
- Initialize a LDClient connected to environment X
- Wait for log to say
INFO com.launchdarkly.sdk.server.LDClient.DataSource - Initialized LaunchDarkly client.
- (I set a breakpoint after initializing it and configured to stop only the test thread)
- Go into LaunchDarkly and reset environment X's SDK key (keep old key around for 0 hours)
- The log will repeatedly say
ERROR com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (giving up permanently): HTTP error 401 (invalid SDK key)
- Upon further inspection, it seems this bug also occurs if you just set the SDK key to any random string from the beginning like
pizza
The same issue exists for StatusListener
s
We were also considering setting up alerts for sites that have expired SDK keys, and we were considering using the DataSourceStatusProvider
for that.
However, if I add this code into my test:
DataSourceStatusProvider dataSourceStatusProvider = client.getDataSourceStatusProvider();
dataSourceStatusProvider.addStatusListener(status -> {
DataSourceStatusProvider.State state = status.getState();
if (state == DataSourceStatusProvider.State.OFF) {
System.out.println("\033[1;33m" + "My listener was called!" + "\033[0m"); // yellow
}
});
then you will see that the listener is also repeatedly called:
Upon further inspection, if I make my listener always print what the State
was:
status -> {
DataSourceStatusProvider.State state = status.getState();
System.out.println("\033[1;33m" + "My listener was called! " + state.name() + "\033[0m"); // yellow
}
then it seems to oscillate between INTERRUPTED
and OFF
:
Expected behavior
If it really "gives up permanently" (which it should do for an invalid SDK key), then it should only log that message once. ๐ That also seems to be what this code is trying to do?
Also we kinda expected addStatusListener to only trigger for status changes, which should only be once for a non-recoverable error like having an invalid SDK key.
Logs
[Test worker] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Enabling streaming API
[Test worker] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Starting EventSource client using URI: <hidden>
[Test worker] INFO com.launchdarkly.sdk.server.LDClient - Waiting up to 5000 milliseconds for LaunchDarkly client to start...
[okhttp-eventsource-stream-[]-0] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Connected to EventSource stream.
[okhttp-eventsource-events-[]-0] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Initialized LaunchDarkly client.
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Connection unexpectedly closed
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (will retry): java.io.EOFException
[okhttp-eventsource-stream-[]-0] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Waiting 1477 milliseconds before reconnecting...
[okhttp-eventsource-stream-[]-0] ERROR com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (giving up permanently): HTTP error 401 (invalid SDK key)
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Connection unexpectedly closed
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (will retry): java.io.EOFException
[okhttp-eventsource-stream-[]-0] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Waiting 3682 milliseconds before reconnecting...
[okhttp-eventsource-stream-[]-0] ERROR com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (giving up permanently): HTTP error 401 (invalid SDK key)
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Connection unexpectedly closed
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (will retry): java.io.EOFException
[okhttp-eventsource-stream-[]-0] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Waiting 4075 milliseconds before reconnecting...
[okhttp-eventsource-stream-[]-0] ERROR com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (giving up permanently): HTTP error 401 (invalid SDK key)
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Connection unexpectedly closed
[okhttp-eventsource-stream-[]-0] WARN com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (will retry): java.io.EOFException
[okhttp-eventsource-stream-[]-0] INFO com.launchdarkly.sdk.server.LDClient.DataSource - Waiting 14868 milliseconds before reconnecting...
[okhttp-eventsource-stream-[]-0] ERROR com.launchdarkly.sdk.server.LDClient.DataSource - Error in stream connection (giving up permanently): HTTP error 401 (invalid SDK key)
...
SDK version
5.3.0
, but also confirmed the error exists on 5.5.0
Language version, developer tools
Java 8, IntelliJ 2021.1.1 Community Edition
OS/platform
MacBook Pro 2019
macOs Catalina
Version 10.15.7
Additional context
This causes us concern since we have a lot of long-running sites that may begin to operate on a stale SDK key if we rotate the SDK key. While this is mostly fine, since the sites will just stop getting updates from LaunchDarkly, the excessive logging could be a problem for our disk space utilization and logging quality.
Whoops, yes, this is definitely a bug. As you guessed, the code is supposed to treat a 401 as a permanent failure (since an invalid SDK key can never become valid again) and something in that logic is clearly broken now. Which I guess means the test coverage for it is also wrong somehow, since I'm fairly sure we have unit tests specifically for this. Investigating.
Indeed, we had a test, but it wasn't a valid test. The way we were setting up mock server responses wasn't adequately realisticโ we're currently working on implementing better end-to-end tests to catch just this sort of thing. Anyway, sorry about this and it should be a straightforward fix; we're working on it now.
Version 5.5.1 has been released and should fix this.
And thanks for bringing it to our attention.
@eli-darkly We are using launchdarkly-java-server-sdk 5.6.3
and are still seeing multiple attempts being made to post diagnostics, even though the requests fail with a 401 because of an invalid SDK key.
Should I open a new bug report or am I missing something?
@frederikvanhoyweghen That looks like it is a different bugโ that is, it's the same kind of issue, but it's in a different component that was not what the original bug report was about. If you wouldn't mind filing a separate issue, that'd be helpful.