/resilience4j

Resilience4j is a fault tolerance library designed for Java8 and functional programming

Primary LanguageJavaApache License 2.0Apache-2.0

Fault tolerance library designed for functional programming

Build Status Quality Gate Coverage download Apache License 2

Introduction

Resilience4j is a lightweight fault tolerance library inspired by Netflix Hystrix, but designed for Java 8 and functional programming. Lightweight, because the library only uses Vavr (formerly Javaslang), which does not have any other external library dependencies. Netflix Hystrix, in contrast, has a compile dependency to Archaius which has many more external library dependencies such as Guava and Apache Commons Configuration.
With Resilience4j you don’t have to go all-in, you can pick what you need.

Documentation

Setup and usage is described in our User Guide.

Overview

Resilience provides several core modules and add-on modules:

Core modules:

  • resilience4j-circuitbreaker: Circuit breaking

  • resilience4j-ratelimiter: Rate limiting

  • resilience4j-bulkhead: Bulkheading

  • resilience4j-retry: Automatic retrying (sync and async)

  • resilience4j-cache: Response caching

  • resilience4j-timelimiter: Timeout handling

Add-on modules

  • resilience4j-reactor: Custom Spring Reactor operator

  • resilience4j-rxjava2: Custom RxJava2 operator

  • resilience4j-micrometer: Micrometer Metrics exporter

  • resilience4j-metrics: Dropwizard Metrics exporter

  • resilience4j-prometheus: Prometheus Metrics exporter

  • resilience4j-spring-boot: Spring Boot Starter

  • resilience4j-spring-boot2: Spring Boot 2 Starter

  • resilience4j-ratpack: Ratpack Starter

  • resilience4j-retrofit: Retrofit adapter

  • resilience4j-feign: Feign adapter

  • resilience4j-vertx: Vertx Future decorator

  • resilience4j-consumer: Circular Buffer Event consumer

To highlight a few differences to Netflix Hystrix:

  • In Hystrix calls to external systems have to be wrapped in a HystrixCommand. This library, in contrast, provides higher-order functions (decorators) to enhance any functional interface, lambda expression or method reference with a Circuit Breaker, Rate Limiter or Bulkhead. Furthermore, the library provides decorators to retry failed calls or cache call results. You can stack more than one decorator on any functional interface, lambda expression or method reference. That means, you can combine a Bulkhead, RateLimiter and Retry decorator with a CircuitBreaker decorator. The advantage is that you have the choice to select the decorator you need and nothing else. Any decorated function can be executed synchronously or asynchronously by using a CompletableFuture or RxJava.

  • Hystrix, by default, stores execution results in 10 1-second window buckets. If a 1-second window bucket is passed, a new bucket is created and the oldest is dropped. This library stores execution results in Ring Bit Buffer without a statistical rolling time window. A successful call is stored as a 0 bit and a failed call is stored as a 1 bit. The Ring Bit Buffer has a configurable fixed-size and stores the bits in a long[] array which is saving memory compared to a boolean array. That means the Ring Bit Buffer only needs an array of 16 long (64-bit) values to store the status of 1024 calls. The advantage is that this CircuitBreaker works out-of-the-box for low and high frequency backend systems, because execution results are not dropped when a time window is passed.

  • Hystrix only performs a single execution when in half-open state to determine whether to close a CircuitBreaker. This library allows to perform a configurable number of executions and compares the result against a configurable threshold to determine whether to close a CircuitBreaker.

  • This library provides custom RxJava operators to decorate any Observable or Flowable with a Circuit Breaker, Bulkhead or Ratelimiter.

  • Hystrix and this library emit a stream of events which are useful to system operators to monitor metrics about execution outcomes and latency.

Spring Boot demo

Setup and usage in Spring Boot 2 is demonstrated here.

Usage examples

CircuitBreaker, Retry and Fallback

The following example shows how to decorate a lambda expression (Supplier) with a CircuitBreaker and how to retry the call at most 3 times when an exception occurs.
You can configure the wait interval between retries and also configure a custom backoff algorithm.
The example uses Vavr’s Try Monad to recover from an exception and invoke another lambda expression as a fallback, when even all retries have failed.

// Simulates a Backend Service
public interface BackendService {
    String doSomething();
}

// Create a CircuitBreaker (use default configuration)
CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("backendName");
// Create a Retry with at most 3 retries and a fixed time interval between retries of 500ms
Retry retry = Retry.ofDefaults("backendName");

// Decorate your call to BackendService.doSomething() with a CircuitBreaker
Supplier<String> decoratedSupplier = CircuitBreaker
    .decorateSupplier(circuitBreaker, backendService::doSomething);

// Decorate your call with automatic retry
decoratedSupplier = Retry
    .decorateSupplier(retry, decoratedSupplier);

// Execute the decorated supplier and recover from any exception
String result = Try.ofSupplier(decoratedSupplier)
    .recover(throwable -> "Hello from Recovery").get();

// When you don't want to decorate your lambda expression,
// but just execute it and protect the call by a CircuitBreaker.
String result = circuitBreaker.executeSupplier(backendService::doSomething);

The CircuitBreaker provides an interface to monitor metrics.

CircuitBreaker.Metrics metrics = circuitBreaker.getMetrics();
// Returns the failure rate in percentage.
float failureRate = metrics.getFailureRate();
// Returns the current number of buffered calls.
int bufferedCalls = metrics.getNumberOfBufferedCalls();
// Returns the current number of failed calls.
int failedCalls = metrics.getNumberOfFailedCalls();

CircuitBreaker and RxJava

The following example shows how to decorate an Observable by using the custom RxJava operator.

CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("testName");
Observable.fromCallable(backendService::doSomething)
    .lift(CircuitBreakerOperator.of(circuitBreaker))
Note
Resilience4j also provides RxJava operators for RateLimiter, Bulkhead and Retry. Find out more in our User Guide

CircuitBreaker and Reactor

The following example shows how to decorate a Mono by using the custom Reactor operator.

CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("testName");
Mono.fromCallable(backendService::doSomething)
    .transform(CircuitBreakerOperator.of(circuitBreaker))
Note
Resilience4j also provides Reactor operators for RateLimiter and Bulkhead. Find out more in our User Guide

RateLimiter

The following example shows how to restrict the calling rate of some method to be not higher than 1 req/sec.

// Create a custom RateLimiter configuration
RateLimiterConfig config = RateLimiterConfig.custom()
    .timeoutDuration(Duration.ofMillis(100))
    .limitRefreshPeriod(Duration.ofSeconds(1))
    .limitForPeriod(1)
    .build();
// Create a RateLimiter
RateLimiter rateLimiter = RateLimiter.of("backendName", config);

// Decorate your call to BackendService.doSomething()
Supplier<String> restrictedSupplier = RateLimiter
    .decorateSupplier(rateLimiter, backendService::doSomething);

// First call is successful
Try<String> firstTry = Try.ofSupplier(restrictedSupplier);
assertThat(firstTry.isSuccess()).isTrue();

// Second call fails, because the call was not permitted
Try<String> secondTry = Try.of(restrictedSupplier);
assertThat(secondTry.isFailure()).isTrue();
assertThat(secondTry.getCause()).isInstanceOf(RequestNotPermitted.class);

The RateLimiter provides an interface to monitor the number of available permissions. The AtomicRateLimiter has some enhanced Metrics with some implementation specific details.

RateLimiter.Metrics metrics = rateLimiter.getMetrics();
int numberOfThreadsWaitingForPermission = metrics.getNumberOfWaitingThreads();
// Estimates count of available permissions. Can be negative if some permissions where reserved.
int availablePermissions = metrics.getAvailablePermissions();

AtomicRateLimiter atomicLimiter;
// Estimated time duration in nanos to wait for the next permission
long nanosToWaitForPermission = atomicLimiter.getNanosToWait();

You can also dynamically change some rate limiter configurations. Find out more in our User Guide

Bulkhead

The following example shows how to decorate a lambda expression with a Bulkhead. A Bulkhead can be used to limit the amount of parallel executions. This bulkhead abstraction should work well across a variety of threading and io models. It is based on a semaphore, and unlike Hystrix, does not provide "shadow" thread pool option.

Bulkhead bulkhead = Bulkhead.ofDefaults("backendName");

Supplier<String> supplier = Bulkhead.decorateSupplier(bulkhead, backendService::doSomething);

The Bulkhead provides an interface to monitor the current number of available concurrent calls.

int availableConcurrentCalls = bulkhead.getMetrics().getAvailableConcurrentCalls()

You can also dynamically change it’s configuration.

Cache

The following example shows how to decorate a lambda expression with a Cache abstraction. The cache abstraction puts the result of the lambda expression in a cache instance (JCache) and
tries to retrieve a previous cached result from the cache before it invokes the lambda expression.
If the cache retrieval from a distributed cache fails, the exception is taken care of and the lambda expression is called.

// Create a CacheContext by wrapping a JCache instance.
javax.cache.Cache<String, String> cacheInstance = Caching.getCache("cacheName", String.class, String.class);
Cache<String, String> cacheContext = Cache.of(cacheInstance);

// Decorate your call to BackendService.doSomething()
Function<String, String> cachedFunction = Cache.decorateSupplier(cacheContext, backendService::doSomething);
String value = cachedFunction.apply("testKey");

The Cache provides an interface to monitor cache hits/misses.

Cache.Metrics metrics = cacheContext.getMetrics();
long cacheHits = metrics.getNumberOfCacheHits;
long cacheMisses = metrics.getNumberOfCacheMisses();

Metrics

The following example shows how to decorate a lambda expression to measure metrics using Dropwizard Metrics.
The Timer counts the number of total calls, successful calls, failed calls and measures the rate and response time of successful calls.

// Create a Timer
Timer timer = Timer.of("backend");
Supplier<String> supplier = Timer.decorateSupplier(timer, backendService::doSomething);

The Timer provides an interface to monitor metrics.

// Retrieve Timer metrics
Timer.Metrics metrics = timer.getMetrics();
// Returns the number of total calls
long totalCalls = metrics.getNumberOfTotalCalls();
// Returns the number of successful calls
long successfulCalls = metrics.getNumberOfSuccessfulCalls();
// Returns the number of failed calls
long failedCalls = metrics.getNumberOfFailedCalls();

Consume emitted events

CircuitBreaker, RateLimiter, Cache and Retry components emit a stream of events which can be consumed.

CircuitBreaker example below:

A CircuitBreakerEvent can be a state transition, a circuit breaker reset, a successful call, a recorded error or an ignored error. All events contains additional information like event creation time and processing duration of the call. If you want to consume events, you have to register an event consumer.

circuitBreaker.getEventPublisher()
    .onSuccess(event -> logger.info(...))
    .onError(event -> logger.info(...))
    .onIgnoredError(event -> logger.info(...))
    .onReset(event -> logger.info(...))
    .onStateTransition(event -> logger.info(...));
// Or if you want to register a consumer listening to all events, you can do:
circuitBreaker.getEventPublisher()
    .onEvent(event -> logger.info(...));

You could use the CircularEventConsumer to store events in a circular buffer with a fixed capacity.

CircularEventConsumer<CircuitBreakerEvent> ringBuffer = new CircularEventConsumer<>(10);
circuitBreaker.getEventPublisher().onEvent(ringBuffer);
List<CircuitBreakerEvent> bufferedEvents = ringBuffer.getBufferedEvents()

You can use RxJava or Spring Reactor Adapters to convert the EventPublisher into a Reactive Stream. The advantage of a Reactive Stream is that you can use RxJava’s observeOn operator to specify a different Scheduler that the CircuitBreaker will use to send notifications to its observers/consumers.

RxJava2Adapter.toFlowable(circuitBreaker.getEventPublisher())
    .filter(event -> event.getEventType() == Type.ERROR)
    .cast(CircuitBreakerOnErrorEvent.class)
    .subscribe(event -> logger.info(...))
Note
You can also consume events from RateLimiter, Bulkhead, Cache and Retry. Find out more in our User Guide

Companies who use Resilience4j

  • Deutsche Telekom (In an application with over 400 million request per day)

  • AOL (In an application with low latency requirements)

  • Netpulse (In system with 40+ integrations)

  • wescale.de (In a B2B integration platform)

  • Topia (In an HR application built with microservices architecture)

  • Auto Trader Group plc (UK’s largest digital automotive marketplace)

  • PlayStation Network (Platform backend)

License

Copyright 2017 Robert Winkler and Bohdan Storozhuk

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.