optimizely/java-sdk

PollingProjectConfigManager doesn't save the new config

stevenenen opened this issue · 5 comments

Hi,

I discovered a bug in the Java SDK of optimizely.

Reproduction test:

1) optimizely feature flag = true
2) decide() -> returns true
3) optimizely feature flag = false
4) decide() -> returns false
5) decide() -> suddenly returns true again (bug)

Bug location:

com.optimizely.ab.config.PollingProjectConfigManager;

    @Override
    public ProjectConfig getConfig() {
        if (started) {
            try {
                boolean acquired = countDownLatch.await(blockingTimeoutPeriod, blockingTimeoutUnit);
                if (!acquired) {
                    logger.warn("Timeout exceeded waiting for ProjectConfig to be set, returning null.");
                    countDownLatch.countDown();
                }

            } catch (InterruptedException e) {
                logger.warn("Interrupted waiting for valid ProjectConfig, returning null.");
            }

            return currentProjectConfig.get();
        }

        ProjectConfig projectConfig = poll();
        return projectConfig == null ? currentProjectConfig.get() : projectConfig;
    }

The problem is in the last two lines. It polls for a new config, if there was a difference since last call then a projectConfig is returned with the up-to-date config. However: currentProjectConfig is not updated! So the next time this method is called, it returns the original projectConfig again that was loaded on application start!

This specifically fails with the HttpProjectConfigManager implementation. Because under water, it does an HTTP call, which might return HTTP/1.1 304 Not Modified.

I've created this work around, it's a copy from the original HttpProjectConfigManager with a minor change:

public class BugFixHttpProjectConfigManager extends PollingProjectConfigManager {

    public static final String CONFIG_POLLING_DURATION  = "http.project.config.manager.polling.duration";
    public static final String CONFIG_POLLING_UNIT      = "http.project.config.manager.polling.unit";
    public static final String CONFIG_BLOCKING_DURATION = "http.project.config.manager.blocking.duration";
    public static final String CONFIG_BLOCKING_UNIT     = "http.project.config.manager.blocking.unit";
    public static final String CONFIG_EVICT_DURATION    = "http.project.config.manager.evict.duration";
    public static final String CONFIG_EVICT_UNIT        = "http.project.config.manager.evict.unit";
    public static final String CONFIG_SDK_KEY           = "http.project.config.manager.sdk.key";
    public static final String CONFIG_DATAFILE_AUTH_TOKEN = "http.project.config.manager.datafile.auth.token";

    public static final long DEFAULT_POLLING_DURATION  = 5;
    public static final TimeUnit DEFAULT_POLLING_UNIT  = TimeUnit.MINUTES;
    public static final long DEFAULT_BLOCKING_DURATION = 10;
    public static final TimeUnit DEFAULT_BLOCKING_UNIT = TimeUnit.SECONDS;
    public static final long DEFAULT_EVICT_DURATION    = 1;
    public static final TimeUnit DEFAULT_EVICT_UNIT    = TimeUnit.MINUTES;

    private static final Logger logger = LoggerFactory.getLogger(BugFixHttpProjectConfigManager.class);

    private final OptimizelyHttpClient httpClient;
    private final URI uri;
    private final String datafileAccessToken;
    private String datafileLastModified;

    private BugFixHttpProjectConfigManager(long period,
                                           TimeUnit timeUnit,
                                           OptimizelyHttpClient httpClient,
                                           String url,
                                           String datafileAccessToken,
                                           long blockingTimeoutPeriod,
                                           TimeUnit blockingTimeoutUnit,
                                           NotificationCenter notificationCenter) {
        super(period, timeUnit, blockingTimeoutPeriod, blockingTimeoutUnit, notificationCenter);
        this.httpClient = httpClient;
        this.uri = URI.create(url);
        this.datafileAccessToken = datafileAccessToken;
    }

    public URI getUri() {
        return uri;
    }

    public String getLastModified() {
        return datafileLastModified;
    }

    public String getDatafileFromResponse(HttpResponse response) throws NullPointerException, IOException {
        StatusLine statusLine = response.getStatusLine();

        if (statusLine == null) {
            throw new ClientProtocolException("unexpected response from event endpoint, status is null");
        }

        int status = statusLine.getStatusCode();

        // Datafile has not updated
        if (status == HttpStatus.SC_NOT_MODIFIED) {
            logger.debug("Not updating ProjectConfig as datafile has not updated since " + datafileLastModified);
            return null;
        }

        if (status >= 200 && status < 300) {
            // read the response, so we can close the connection
            HttpEntity entity = response.getEntity();
            Header lastModifiedHeader = response.getFirstHeader(HttpHeaders.LAST_MODIFIED);
            if (lastModifiedHeader != null) {
                datafileLastModified = lastModifiedHeader.getValue();
            }
            return EntityUtils.toString(entity, "UTF-8");
        } else {
            throw new ClientProtocolException("unexpected response when trying to fetch datafile, status: " + status);
        }
    }

    static ProjectConfig parseProjectConfig(String datafile) throws ConfigParseException {
        return new DatafileProjectConfig.Builder().withDatafile(datafile).build();
    }

    @Override
    public ProjectConfig getConfig() {
        var config = super.getConfig();
        setConfig(config);
        return config;
    }

    @Override
    protected ProjectConfig poll() {
        HttpGet httpGet = createHttpRequest();
        CloseableHttpResponse response = null;
        logger.debug("Fetching datafile from: {}", httpGet.getURI());
        try {
            response = httpClient.execute(httpGet);
            String datafile = getDatafileFromResponse(response);
            if (datafile == null) {
                return null;
            }
            return parseProjectConfig(datafile);
        } catch (ConfigParseException | IOException e) {
            logger.error("Error fetching datafile", e);
        }
        finally {
            if (response != null) {
                try {
                    response.close();
                } catch (IOException e) {
                    logger.warn(e.getLocalizedMessage());
                }
            }
        }

        return null;
    }

    @VisibleForTesting
    HttpGet createHttpRequest() {
        HttpGet httpGet = new HttpGet(uri);

        if (datafileAccessToken != null) {
            httpGet.setHeader(HttpHeaders.AUTHORIZATION, "Bearer " + datafileAccessToken);
        }

        if (datafileLastModified != null) {
            httpGet.setHeader(HttpHeaders.IF_MODIFIED_SINCE, datafileLastModified);
        }

        return httpGet;
    }

    public static Builder builder() {
        return new Builder();
    }

    public static class Builder {
        private String datafile;
        private String url;
        private String datafileAccessToken = null;
        private String format = "https://cdn.optimizely.com/datafiles/%s.json";
        private String authFormat = "https://config.optimizely.com/datafiles/auth/%s.json";
        private OptimizelyHttpClient httpClient;
        private NotificationCenter notificationCenter;

        String sdkKey = PropertyUtils.get(CONFIG_SDK_KEY);
        long period = PropertyUtils.getLong(CONFIG_POLLING_DURATION, DEFAULT_POLLING_DURATION);
        TimeUnit timeUnit = PropertyUtils.getEnum(CONFIG_POLLING_UNIT, TimeUnit.class, DEFAULT_POLLING_UNIT);

        long blockingTimeoutPeriod = PropertyUtils.getLong(CONFIG_BLOCKING_DURATION, DEFAULT_BLOCKING_DURATION);
        TimeUnit blockingTimeoutUnit = PropertyUtils.getEnum(CONFIG_BLOCKING_UNIT, TimeUnit.class, DEFAULT_BLOCKING_UNIT);

        // force-close the persistent connection after this idle time
        long evictConnectionIdleTimePeriod = PropertyUtils.getLong(CONFIG_EVICT_DURATION, DEFAULT_EVICT_DURATION);
        TimeUnit evictConnectionIdleTimeUnit = PropertyUtils.getEnum(CONFIG_EVICT_UNIT, TimeUnit.class, DEFAULT_EVICT_UNIT);

        public Builder withDatafile(String datafile) {
            this.datafile = datafile;
            return this;
        }

        public Builder withSdkKey(String sdkKey) {
            this.sdkKey = sdkKey;
            return this;
        }

        public Builder withDatafileAccessToken(String token) {
            this.datafileAccessToken = token;
            return this;
        }

        public Builder withUrl(String url) {
            this.url = url;
            return this;
        }

        public Builder withFormat(String format) {
            this.format = format;
            return this;
        }

        public Builder withOptimizelyHttpClient(OptimizelyHttpClient httpClient) {
            this.httpClient = httpClient;
            return this;
        }

        /**
         * Makes HttpClient proactively evict idle connections from theœ
         * connection pool using a background thread.
         *
         * @see org.apache.http.impl.client.HttpClientBuilder#evictIdleConnections(long, TimeUnit)
         *
         * @param maxIdleTime maximum time persistent connections can stay idle while kept alive
         * in the connection pool. Connections whose inactivity period exceeds this value will
         * get closed and evicted from the pool.  Set to 0 to disable eviction.
         * @param maxIdleTimeUnit time unit for the above parameter.
         *
         * @return  A HttpProjectConfigManager builder
         */
        public Builder withEvictIdleConnections(long maxIdleTime, TimeUnit maxIdleTimeUnit) {
            this.evictConnectionIdleTimePeriod = maxIdleTime;
            this.evictConnectionIdleTimeUnit = maxIdleTimeUnit;
            return this;
        }

        /**
         * Configure time to block before Completing the future. This timeout is used on the first call
         * to {@link PollingProjectConfigManager#getConfig()}. If the timeout is exceeded then the
         * PollingProjectConfigManager will begin returning null immediately until the call to Poll
         * succeeds.
         *
         * @param period    A timeout period
         * @param timeUnit  A timeout unit
         * @return  A HttpProjectConfigManager builder
         */
        public Builder withBlockingTimeout(Long period, TimeUnit timeUnit) {
            if (timeUnit == null) {
                logger.warn("TimeUnit cannot be null. Keeping default period: {} and time unit: {}", this.blockingTimeoutPeriod, this.blockingTimeoutUnit);
                return this;
            }

            if (period == null) {
                logger.warn("Timeout cannot be null. Keeping default period: {} and time unit: {}", this.blockingTimeoutPeriod, this.blockingTimeoutUnit);
                return this;
            }

            if (period <= 0) {
                logger.warn("Timeout cannot be <= 0. Keeping default period: {} and time unit: {}", this.blockingTimeoutPeriod, this.blockingTimeoutUnit);
                return this;
            }

            this.blockingTimeoutPeriod = period;
            this.blockingTimeoutUnit = timeUnit;

            return this;
        }

        public Builder withPollingInterval(Long period, TimeUnit timeUnit) {
            if (timeUnit == null) {
                logger.warn("TimeUnit cannot be null. Keeping default period: {} and time unit: {}", this.period, this.timeUnit);
                return this;
            }

            if (period == null) {
                logger.warn("Interval cannot be null. Keeping default period: {} and time unit: {}", this.period, this.timeUnit);
                return this;
            }

            if (period <= 0) {
                logger.warn("Interval cannot be <= 0. Keeping default period: {} and time unit: {}", this.period, this.timeUnit);
                return this;
            }

            this.period = period;
            this.timeUnit = timeUnit;

            return this;
        }

        public Builder withNotificationCenter(NotificationCenter notificationCenter) {
            this.notificationCenter = notificationCenter;
            return this;
        }

        /**
         * HttpProjectConfigManager.Builder that builds and starts a HttpProjectConfigManager.
         * This is the default builder which will block until a config is available.
         *
         * @return {@link BugFixHttpProjectConfigManager}
         */
        public BugFixHttpProjectConfigManager build() {
            return build(false);
        }

        /**
         * HttpProjectConfigManager.Builder that builds and starts a HttpProjectConfigManager.
         *
         * @param defer When true, we will not wait for the configuration to be available
         *              before returning the HttpProjectConfigManager instance.
         * @return {@link BugFixHttpProjectConfigManager}
         */
        public BugFixHttpProjectConfigManager build(boolean defer) {
            if (period <= 0) {
                logger.warn("Invalid polling interval {}, {}. Defaulting to {}, {}",
                    period, timeUnit, DEFAULT_POLLING_DURATION, DEFAULT_POLLING_UNIT);
                period = DEFAULT_POLLING_DURATION;
                timeUnit = DEFAULT_POLLING_UNIT;
            }

            if (blockingTimeoutPeriod <= 0) {
                logger.warn("Invalid polling interval {}, {}. Defaulting to {}, {}",
                    blockingTimeoutPeriod, blockingTimeoutUnit, DEFAULT_BLOCKING_DURATION, DEFAULT_BLOCKING_UNIT);
                blockingTimeoutPeriod = DEFAULT_BLOCKING_DURATION;
                blockingTimeoutUnit = DEFAULT_BLOCKING_UNIT;
            }

            if (httpClient == null) {
                httpClient = OptimizelyHttpClient.builder()
                    .withEvictIdleConnections(evictConnectionIdleTimePeriod, evictConnectionIdleTimeUnit)
                    .build();
            }

            if (url == null) {
                if (sdkKey == null) {
                    throw new NullPointerException("sdkKey cannot be null");
                }

                if (datafileAccessToken == null) {
                    url = String.format(format, sdkKey);
                } else {
                    url = String.format(authFormat, sdkKey);
                }
            }

            if (notificationCenter == null) {
                notificationCenter = new NotificationCenter();
            }

            BugFixHttpProjectConfigManager httpProjectManager = new BugFixHttpProjectConfigManager(
                period,
                timeUnit,
                httpClient,
                url,
                datafileAccessToken,
                blockingTimeoutPeriod,
                blockingTimeoutUnit,
                notificationCenter);

            if (datafile != null) {
                try {
                    ProjectConfig projectConfig = BugFixHttpProjectConfigManager.parseProjectConfig(datafile);
                    httpProjectManager.setConfig(projectConfig);
                } catch (ConfigParseException e) {
                    logger.warn("Error parsing fallback datafile.", e);
                }
            }

            httpProjectManager.start();

            // Optionally block until config is available.
            if (!defer) {
                httpProjectManager.getConfig();
            }

            return httpProjectManager;
        }
    }
}

the only change I made in the above code is:

    @Override
    public ProjectConfig getConfig() {
        var config = super.getConfig();
        setConfig(config);
        return config;
    }

And then it works correctly again, reproduction test results:

1) optimizely feature flag = true
2) decide() -> returns true
3) optimizely feature flag = false
4) decide() -> returns false
5) decide() -> returns false (bug fixed)

@stevenenen setConfig() is called to update currentProjectConfig here:

Your issue may be related to others. Can you share more context about the observation flow? I guess (3) means you update the project in app.optimizely.com. After you update the project, you called decide() 2 times back to back, and they return different results?

Hi @jaeopt , Thanks for your reply.

You're right that the Poller should set the config, saving the new response received from app.optimizely.com.

With step 3 I indeed meant a change from app.optimizely.com :

1) `app.optimizely.com` feature flag = true
2) decide() -> returns true
3) `app.optimizely.com` feature flag = false
4) decide() -> returns false
5) decide() -> suddenly returns true again (bug)
[...]
n) decide() -> suddenly returns true again (bug)

It will keep returning the value received on application start, UNTIL the poller has run successfully, so it will show flaky results until then. However, the poller in our case doesn't seem to work correctly, it runs a couple time during application start (I set polling duration very low), and then it simply stops.

But even when the poller would work correctly, the feature flag will have an invalid state until it run, that's not what I would expect :)

A simple solution might be: to use the same logic from the poller in the getConfig() method, so that 'whenever' a new config is polled, it's always persisted and gives consistent results.

@stevenenen Do you mind sharing your sdkKey and feature flag you're working with? Or you can submit a support ticket so we can follow up.

For the record here for others:
We shared a project which demonstrates the bug (https://github.com/RDBreed/optimizely_bug), but it is not acknowledged to be a bug (yet, but lost hope already). We will have to keep on using the (not so nice because of encapsulation of constructor etc.) fix proposed by @stevenenen, else we cannot use feature toggles in our application. It has nothing to do with our polling btw.

This is not a bug. Datafile was downloaded repeatedly because of the way SDK set up in the app.
Let us know if you see more issues. Closed for now.