sixhours-team/memcached-spring-boot

Unable to refresh DNS resolution

dopse opened this issue · 8 comments

dopse commented

Hello,

I use the library in an AWS context and I have a problem with DNS resolution.

I have declared a CNAME in Route 53 (e.g. cache.demo) pointing to the memcached configuration endpoint (e.g. cache-demo-1-t3-small.n4bucj.cfg.euw1.cache.amazonaws.com:11211).

In the configuration, I set:

memcached:
  cache:
    servers: cache.demo:11211

Everything works. But when I change the cluster size or type, the endpoint behind cache.demo changes, for example to cache-demo-1-t3-medium.n4bucj.cfg.euw1.cache.amazonaws.com:11211.

The Route 53 record is updated with the new configuration endpoint, but the library continues to resolve the old one. I tried to apply https://docs.aws.amazon.com/fr_fr/sdk-for-java/v1/developer-guide//java-dg-jvm-ttl.html but it doesn't work; I think the DNS resolution of cache.demo is never performed again because of the InetSocketAddress given to the memcachedClient.

In the failure mode, the log still shows the old configuration:

- [Heal-Session-Thread] ERROR com.google.code.yanf4j.core.impl.AbstractController - Reconnected to cache-demo-1-t3-small.n4bucj.0001.euw1.cache.amazonaws.com/172.28.9.214:11211 fail

I have to kill the container to force a relaunch. The refresh scope is not a good solution because it is exposed as an actuator endpoint, and if you have 6 containers behind an ELB, you cannot easily refresh all of them.

Do you have any idea how to refresh the DNS resolution from the configuration?

Thank you.

Regards,

At the moment, the only option is to reload the Spring context using the refresh actuator endpoint.

How often do you need to update the cache configuration (e.g. small -> medium)? Do you need to change the cluster size as well?

There might be a way to expose an additional configuration option for AWS. We might have it ready next week in the 2.1.3-SNAPSHOT version.

dopse commented

We are based on IaC (infrastructure as code). Depending on the context, we may change the cache configuration several times a day because of load or price.

I think a good approach would be to refresh the context when reconnecting to an instance fails several times. The application should repair itself without an explicit call to the refresh endpoint.

I look forward to your progress on this work, and I congratulate you on the quality of this library.

@dopse The 2.2.0-SNAPSHOT version with the new configuration property is available. The new property is memcached.cache.servers-refresh-interval, which defines the interval at which the AWS servers list is refreshed for the configuration endpoint specified with memcached.cache.servers.

This value defaults to 1 minute, but you can tune it to your needs with millisecond precision, e.g.

memcached:
  cache:
    servers: cache.demo:11211
    servers-refresh-interval: 30s
    ...

which would refresh the AWS configuration every 30 seconds. If you do not specify the unit (e.g. s for seconds, or m for minutes), milliseconds are assumed.

The snapshot repository URL is https://oss.jfrog.org/artifactory/libs-snapshot. You can check how to set the snapshot repository with Gradle here. In case you are using Maven, it shouldn't be too difficult to set it up in a similar way.

Do let me know if this solves the DNS issue you are having; if so, we can release the new version this week.
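To make the unit rule concrete, here is a minimal sketch of how a value such as 30s or a bare number could be turned into a duration. The class RefreshIntervalSketch and its parse method are hypothetical names used only for illustration; this is not the library's or Spring Boot's actual conversion code, and it only handles the ms, s, m and h suffixes.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of the library. */
final class RefreshIntervalSketch {

    // A number followed by an optional unit suffix (ms, s, m or h).
    private static final Pattern SIMPLE = Pattern.compile("^(\\d+)\\s*(ms|s|m|h)?$");

    /** Parses values such as "30s" or "5000"; a missing unit means milliseconds. */
    static Duration parse(String value) {
        Matcher matcher = SIMPLE.matcher(value.trim().toLowerCase(Locale.ROOT));
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unsupported interval: " + value);
        }
        long amount = Long.parseLong(matcher.group(1));
        String unit = matcher.group(2) == null ? "ms" : matcher.group(2);
        switch (unit) {
            case "s": return Duration.ofSeconds(amount);
            case "m": return Duration.ofMinutes(amount);
            case "h": return Duration.ofHours(amount);
            default:  return Duration.ofMillis(amount);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("30s"));  // PT30S
        System.out.println(parse("5000")); // PT5S (5000 ms, no unit given)
    }
}
```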

dopse commented

Thank you for the quick development. I will try this soon, but I think it will not work.

In the class AWSElasticCacheClientBuilder you have:

private long pollConfigIntervalMs = AWSElasticCacheClient.DEFAULT_POLL_CONFIG_INTERVAL_MS;

where DEFAULT_POLL_CONFIG_INTERVAL_MS = 60000 in the previous version. So, unless I missed something in your pull request, we get the same behavior and it doesn't work for this case 😞

There may be some caching at the InetSocketAddress level; in the tests I carried out, calling the refresh endpoint does the job of getting a new context and address.

Maybe we could explicitly call contextRefresher.refresh() from a scheduler when the InetSocketAddress derived from memcached.cache.servers changes?

The value you specify with memcached.cache.servers-refresh-interval should override the default polling interval of 1 minute. Try specifying a lower value, e.g. 5s, and give it a try.

dopse commented

I understand what you mean, but it doesn't change the behavior.

Here is an example of something that works; maybe it can serve as a base:

package com.socgen.digital.agence.gix.backend.config;

import io.sixhours.memcached.cache.SocketAddress;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.cloud.context.refresh.ContextRefresher;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.util.StringUtils;

import java.net.InetSocketAddress;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

@Configuration
public class MemcachedConfiguration {

    private static final Logger logger = LoggerFactory.getLogger(MemcachedConfiguration.class);

    private final String servers;

    private final ContextRefresher contextRefresher;

    private List<InetSocketAddress> currentServers;

    public MemcachedConfiguration(@Value("${memcached.cache.servers}") String servers, ContextRefresher contextRefresher) {
        this.currentServers = getInetSocketAddress(servers);
        this.contextRefresher = contextRefresher;
        this.servers = servers;
    }

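    // Runs every 60 seconds; @Scheduled only fires if scheduling is enabled
    // (e.g. @EnableScheduling on a configuration class).
    @Scheduled(fixedRate = 60000)
    public void refresh() {
        // Re-resolve the configured hostnames and compare with the last known addresses.
        List<InetSocketAddress> newServers = getInetSocketAddress(servers);
        logger.debug("old configuration : {}, new configuration : {}", currentServers.stream().map(InetSocketAddress::toString).collect(Collectors.joining(",")),
                newServers.stream().map(InetSocketAddress::toString).collect(Collectors.joining(",")));
        if (!listEqualsIgnoreOrder(currentServers, newServers)) {
            logger.debug("memcached configuration update !");
            // Rebuild the refresh-scoped beans so the client is created with the new addresses.
            contextRefresher.refresh();
            currentServers = newServers;
        }
    }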

    private List<InetSocketAddress> getInetSocketAddress(String servers) {
        if (StringUtils.isEmpty(servers)) {
            throw new IllegalArgumentException("Server list is empty");
        }
        return Stream.of(servers.split("\\s*,\\s*"))
                .map(SocketAddress::new)
                .map(SocketAddress::value)
                .collect(Collectors.toList());
    }

    private boolean listEqualsIgnoreOrder(List<InetSocketAddress> list1, List<InetSocketAddress> list2) {
        return new HashSet<>(list1).equals(new HashSet<>(list2));
    }

}

I have obviously misunderstood the issue you are having.

Yes, the code you posted should solve the issue when changing the DNS mapping in Route 53 to a new configuration address.

When you start your application, the cache manager (using the AWSElasticCacheClient in your case) is created as a singleton bean, with memcached.cache.servers already resolved to the IP address and port from the hostname and port you provide in your configuration file. If you update your DNS mapping at runtime, the only option currently is to refresh the Spring Boot context, as you've done in the code above.

Unfortunately, I don't think this is something that the library should handle, since it is infrastructure specific and it depends case by case on how the users of the library handle their infrastructure changes. This is the main reason why we provided the MemcachedCacheManager as a @RefreshScope bean, i.e. to give users an option to reload the cache manager bean at runtime in case their infrastructure configuration changes.
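The "resolved once" behavior can be seen with plain java.net classes. The following is a minimal sketch, not the library's code: ResolveOnceSketch is a hypothetical class name, and localhost stands in for a real endpoint. An InetSocketAddress built from a hostname attempts the DNS lookup in its constructor and then keeps that result, so an object created at startup never sees a later DNS change.

```java
import java.net.InetSocketAddress;

/** Hypothetical demo class for illustration only. */
final class ResolveOnceSketch {

    /** The hostname constructor attempts DNS resolution immediately and stores the result. */
    static InetSocketAddress eager(String host, int port) {
        return new InetSocketAddress(host, port);
    }

    /** createUnresolved keeps only the hostname; no lookup is performed. */
    static InetSocketAddress lazy(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress resolved = eager("localhost", 11211);
        // The IP captured here stays fixed for the lifetime of this object,
        // whatever happens to the DNS record afterwards.
        System.out.println(resolved.getAddress());

        InetSocketAddress unresolved = lazy("cache.demo", 11211);
        System.out.println(unresolved.isUnresolved()); // true
        System.out.println(unresolved.getAddress());   // null
    }
}
```

Picking up a new DNS mapping therefore means constructing a new address object, which is what refreshing the context achieves indirectly.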

At the moment, none of the memcached client libraries supports the functionality you've mentioned (e.g. AWS ElastiCache Client and XMemcached). What I would recommend is to open an issue or even a pull-request with XMemcached (the underlying library we are using) and try to add this option inside the client itself. In that way you would not need to refresh the Spring Context and the client configuration address would be adapted per some polling interval, in a similar way it's already done for the AWS ElastiCache cache node list changes by polling the configuration endpoint.
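As a rough illustration of the polling idea, the sketch below re-resolves a hostname on demand and notifies a listener when the set of IP addresses changes. HostnameWatcherSketch and all of its members are hypothetical names; nothing here is an XMemcached or AWS ElastiCache Client API. Lookups through InetAddress are also subject to the JVM DNS cache (the networkaddress.cache.ttl security property), so that TTL has to be low enough for a change to become visible.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical watcher for illustration only; not an existing client API. */
final class HostnameWatcherSketch {

    private final String hostname;
    private final Function<String, Set<String>> resolver;
    private final Consumer<Set<String>> onChange;
    private Set<String> lastSeen;

    HostnameWatcherSketch(String hostname,
                          Function<String, Set<String>> resolver,
                          Consumer<Set<String>> onChange) {
        this.hostname = hostname;
        this.resolver = resolver;
        this.onChange = onChange;
        this.lastSeen = resolver.apply(hostname); // baseline taken at startup
    }

    /** Resolver backed by the JVM lookup; returns an empty set if the name is unknown. */
    static Set<String> resolveWithJvm(String hostname) {
        Set<String> addresses = new TreeSet<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(hostname)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Leave the set empty; checkOnce treats that as "no usable answer".
        }
        return addresses;
    }

    /** Re-resolves once; notifies the listener and returns true if the addresses changed. */
    synchronized boolean checkOnce() {
        Set<String> current = resolver.apply(hostname);
        if (current.isEmpty() || current.equals(lastSeen)) {
            return false; // failed lookup or nothing new
        }
        lastSeen = current;
        onChange.accept(current);
        return true;
    }

    public static void main(String[] args) {
        HostnameWatcherSketch watcher = new HostnameWatcherSketch(
                "localhost",
                HostnameWatcherSketch::resolveWithJvm,
                addresses -> System.out.println("addresses changed: " + addresses));
        System.out.println(watcher.checkOnce());
    }
}
```

In a real client, checkOnce could be driven by a ScheduledExecutorService at the chosen polling interval, with the listener reconnecting to the new addresses instead of printing them.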

dopse commented

I totally agree with you.

So I created a custom configuration, based on the class above, that queries Route 53 to get the current version of the DNS record, since InetSocketAddress resolves to an underlying address.

Thanks for your help.