Testing Spring Cloud Feign client resiliency using Resilience4J

Now that we’ve covered how to test your Eureka-integrated Spring Cloud Feign clients, I want to turn to another really interesting topic: testing resiliency.

Some people say HTTP is bad for microservice communication because:

  • It’s a synchronous protocol
  • It’s prone to errors
  • There’s no retry built in
  • Tight coupling
  • etc.

You can find all kinds of fancy arguments for why messaging is a much better communication channel between your services.

I personally disagree. I think we should use the right tool for the job, and sometimes you just don’t need the properties of a messaging system because you’re fine with HTTP. And while it’s prone to errors, we can make it reliable if we want to.

In this article, I’ll show you how to test the reliability of your Feign clients using Spring Cloud OpenFeign and Resilience4J.

If you don’t have any experience with Resilience4J, don’t worry, I’ll explain everything you need.

Circuit breaking in a nutshell

Let me start with a quote from Martin Fowler:

It’s common for software systems to make remote calls to software running in different processes, probably on different machines across a network. One of the big differences between in-memory calls and remote calls is that remote calls can fail, or hang without a response until some timeout limit is reached. What’s worse if you have many callers on a unresponsive supplier, then you can run out of critical resources leading to cascading failures across multiple systems.

Martin Fowler – https://martinfowler.com/bliki/CircuitBreaker.html

Now, I’ll try to phrase it my way focusing only on HTTP. When your microservices talk to each other, a lot can happen with a synchronous HTTP communication that’s based on request/response.

The network can be down, the other system can be down, or the other system can struggle to respond because it’s short on resources, and so on. During that window, while the other side recovers, you don’t want to keep piling up failures. You want to detect that there’s an issue, stop sending traffic to that particular service for some time, and let it recover.

After you’ve waited for a while, the next time you need to talk to that service, you send your request again. If it has recovered, great, everything gets back to normal. If it’s still failing, you stop sending traffic again.

It’s called circuit breaking because the pattern mimics an electric circuit that’s either open or closed. When the circuit is open, nothing goes through and we don’t even try to send traffic to the service. When it’s closed, requests can go through.
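To make the open/closed idea a bit more tangible, here’s a minimal sketch of the pattern in plain Java. It’s a toy, not how Resilience4J is implemented: the class name ToyCircuitBreaker, the "trip after N consecutive failures" rule and the injected clock are all my own assumptions for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Toy circuit breaker: trips after N consecutive failures, allows a trial call after a wait. */
class ToyCircuitBreaker {
    public enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureLimit;    // consecutive failures that open the circuit (assumption)
    private final long waitMillis;     // how long the circuit stays open before a trial call
    private final LongSupplier clock;  // injected so that time can be faked in tests
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    ToyCircuitBreaker(int failureLimit, long waitMillis, LongSupplier clock) {
        this.failureLimit = failureLimit;
        this.waitMillis = waitMillis;
        this.clock = clock;
    }

    State state() {
        return state;
    }

    <T> T call(Supplier<T> remoteCall) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < waitMillis) {
                // Still inside the wait window: fail fast without touching the remote service.
                throw new IllegalStateException("circuit is OPEN");
            }
            state = State.HALF_OPEN; // the wait is over, let a trial call through
        }
        try {
            T result = remoteCall.get();
            // A success closes the circuit and resets the failure counter.
            state = State.CLOSED;
            consecutiveFailures = 0;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            // A failed trial call, or too many failures in a row, opens the circuit.
            if (state == State.HALF_OPEN || consecutiveFailures >= failureLimit) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            throw e;
        }
    }
}
```

With a limit of 2 and a wait of 5000 ms, two failing calls open the circuit, further calls are rejected right away, and the first call after the 5 seconds decides whether the circuit closes again.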

Setting up circuit breaking for your clients with Resilience4J

Unfortunately, there’s only a tiny bit of documentation available on configuring Resilience4J circuit breakers for Spring Cloud Feign clients.

Let’s look at a simple Feign client that’s supposed to validate a session:

@FeignClient(name = "user-session-service")
public interface UserSessionClient {
    @GetMapping("/user-sessions/validate")
    UserSessionValidatorResponse validateSession(@RequestParam UUID sessionId);
}

What happens if the user-session-service is continuously failing? For example, it keeps responding with an HTTP 500 – Internal Server Error over and over again. In this case you’ll keep getting errors in your app even though you know it’ll take some time for the user-session-service to recover. What’s the point of retrying? There is none.

Resilience4J circuit breakers help with exactly that. First, let’s add the right dependency to the pom.xml or build.gradle: org.springframework.cloud:spring-cloud-starter-circuitbreaker-resilience4j.

// omitted

ext {
	set('springCloudVersion', "2020.0.4")
}

dependencies {
	implementation 'org.springframework.cloud:spring-cloud-starter-openfeign'
	implementation 'org.springframework.cloud:spring-cloud-starter-circuitbreaker-resilience4j'
	// omitted for simplicity
}

dependencyManagement {
	imports {
		mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
	}
}

// omitted

Then, let’s go to the application.properties of the app and enable Feign circuit breaking:

feign.circuitbreaker.enabled=true

This runs with the default settings, which we don’t want, so let’s customize the circuit breaker for our case.

Create a new @Configuration class and we’ll use a Customizer:

@Configuration
public class FeignConfiguration {
    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        // The circuit breaker configuration goes here (see the next step).
        return factory -> { };
    }
}

And then set up the CircuitBreakerConfig with the necessary parameters and set it for the proper Feign client call:

@Configuration
public class FeignConfiguration {
    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        CircuitBreakerConfig cbConfig = CircuitBreakerConfig.custom()
                .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(5)
                .failureRateThreshold(20.0f)
                .waitDurationInOpenState(Duration.ofSeconds(5))
                .permittedNumberOfCallsInHalfOpenState(5)
                .build();
        return resilience4JCircuitBreakerFactory -> resilience4JCircuitBreakerFactory.configure(builder ->
                builder.circuitBreakerConfig(cbConfig), "UserSessionClient#validateSession(UUID)");
    }
}

The configuration parameters are available here but let me cover this one quickly. We sample the last 5 requests. If 20% or more of those 5 requests fail, the circuit opens and no requests are allowed through. Then we wait 5 seconds for the other service to recover. Once that time has passed, the circuit breaker goes half-open and lets 5 more requests through to check whether the service has recovered, applying the same 20% failure rule.

Practical example: if I call the method on the Feign client 5 times and 1 or more of those requests fail (the 20% rule), the circuit opens. 5 seconds later the circuit closes again (half-open, to be perfectly accurate) and traffic can go through once more.
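The arithmetic behind the 20% rule can be sketched in a few lines of plain Java. This is only an illustration of a count-based window, not Resilience4J’s actual code. The class CountBasedWindow, its method names and the "-1 while the window isn’t full" convention are assumptions of this sketch, which also assumes the rate is only evaluated once 5 outcomes have been recorded.

```java
/** Count-based sliding window: remembers the last N outcomes and reports the failure rate. */
class CountBasedWindow {
    private final boolean[] failed; // ring buffer, true means the call failed
    private int recorded = 0;       // number of outcomes stored so far, capped at the window size
    private int next = 0;           // slot that the next outcome overwrites

    CountBasedWindow(int size) {
        this.failed = new boolean[size];
    }

    void record(boolean callFailed) {
        failed[next] = callFailed;
        next = (next + 1) % failed.length;
        if (recorded < failed.length) {
            recorded++;
        }
    }

    /** Failure rate in percent, or -1 while the window is not full yet (sketch convention). */
    float failureRate() {
        if (recorded < failed.length) {
            return -1f;
        }
        int failures = 0;
        for (boolean f : failed) {
            if (f) {
                failures++;
            }
        }
        return 100f * failures / failed.length;
    }

    /** The circuit should open when the rate is equal to or above the threshold. */
    boolean shouldOpen(float thresholdPercent) {
        float rate = failureRate();
        return rate >= 0 && rate >= thresholdPercent;
    }
}
```

With a window of 5, one failure followed by four successes gives 100 * 1 / 5 = 20%, which is enough to trip a 20% threshold. One more success pushes the failure out of the window and the rate drops back to 0%.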

Another trick here is the circuit breaker naming. The configuration is attached to circuit breakers through their String names. In the example above, it’s UserSessionClient#validateSession(UUID).

By default, the circuit breakers are named after the Feign client interface name and the respective method signature. It’s generated with the Feign#configKey method.
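To see where a name like UserSessionClient#validateSession(UUID) comes from, here’s a rough reflection-based approximation of that naming scheme. It’s not Feign’s actual implementation. The ConfigKeySketch class and its keyFor helper are hypothetical, and the nested UserSessionClient interface is just a stand-in for the real client without the Spring annotations.

```java
import java.lang.reflect.Method;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConfigKeySketch {
    /** Stand-in for the article's Feign client (assumption: annotations left out). */
    interface UserSessionClient {
        Object validateSession(UUID sessionId);
    }

    /** Builds "InterfaceSimpleName#methodName(ParamSimpleName,...)". */
    static String configKey(Class<?> targetType, Method method) {
        String params = Stream.of(method.getParameterTypes())
                .map(Class::getSimpleName)
                .collect(Collectors.joining(","));
        return targetType.getSimpleName() + "#" + method.getName() + "(" + params + ")";
    }

    /** Convenience lookup that turns the checked reflection exception into an unchecked one. */
    static String keyFor(Class<?> targetType, String methodName, Class<?>... parameterTypes) {
        try {
            return configKey(targetType, targetType.getMethod(methodName, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // prints UserSessionClient#validateSession(UUID)
        System.out.println(keyFor(UserSessionClient.class, "validateSession", UUID.class));
    }
}
```

The takeaway is that the name depends on the interface’s simple name and the parameter types’ simple names, so renaming the client or changing the method signature silently changes the circuit breaker name the configuration has to match.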

Due to a bug in the currently released Spring Cloud OpenFeign version, the naming doesn’t work properly. Instead of the name above, it generates HardCodedTarget#validateSession(UUID). It’s already fixed by this commit though.

Anyway, until that’s released, we can provide a custom name generator that does exactly that.

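import java.time.Duration;

import feign.Feign;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import org.springframework.cloud.circuitbreaker.resilience4j.Resilience4JCircuitBreakerFactory;
import org.springframework.cloud.client.circuitbreaker.Customizer;
import org.springframework.cloud.openfeign.CircuitBreakerNameResolver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration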
public class FeignConfiguration {
    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        CircuitBreakerConfig cbConfig = CircuitBreakerConfig.custom()
                .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(5)
                .failureRateThreshold(20.0f)
                .waitDurationInOpenState(Duration.ofSeconds(5))
                .permittedNumberOfCallsInHalfOpenState(5)
                .build();
        return resilience4JCircuitBreakerFactory -> resilience4JCircuitBreakerFactory.configure(builder ->
                builder.circuitBreakerConfig(cbConfig), "UserSessionClient#validateSession(UUID)");
    }

    @Bean
    public CircuitBreakerNameResolver circuitBreakerNameResolver() {
        return (feignClientName, target, method) -> Feign.configKey(target.type(), method);
    }
}

After this, the proper name will be generated.

Integration testing time

Now, if you start up your app and you’ve done everything right, then after invoking the API 5 times with at least 1 failure, the following message will look back at you:

No fallback available.
org.springframework.cloud.client.circuitbreaker.NoFallbackAvailableException: No fallback available.
	at app//org.springframework.cloud.client.circuitbreaker.CircuitBreaker.lambda$run$0(CircuitBreaker.java:31)
	at app//io.vavr.control.Try.lambda$recover$6ea7267f$1(Try.java:949)
	at app//io.vavr.control.Try.of(Try.java:75)
	at app//io.vavr.control.Try.recover(Try.java:949)
	at app//org.springframework.cloud.circuitbreaker.resilience4j.Resilience4JCircuitBreaker.run(Resilience4JCircuitBreaker.java:123)
	...
	at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Caused by: io.github.resilience4j.circuitbreaker.CallNotPermittedException: CircuitBreaker 'UserSessionClient#validateSession(UUID)' is OPEN and does not permit further calls
	at app//io.github.resilience4j.circuitbreaker.CallNotPermittedException.createCallNotPermittedException(CallNotPermittedException.java:48)
	at app//io.github.resilience4j.circuitbreaker.internal.CircuitBreakerStateMachine$OpenState.acquirePermission(CircuitBreakerStateMachine.java:689)
	at app//io.github.resilience4j.circuitbreaker.internal.CircuitBreakerStateMachine.acquirePermission(CircuitBreakerStateMachine.java:206)
	at app//io.github.resilience4j.circuitbreaker.CircuitBreaker.lambda$decorateCallable$3(CircuitBreaker.java:168)
	at app//io.vavr.control.Try.of(Try.java:75)
	... 89 more

Awesome. Now we’re talking.

Let’s write an integration test for this use-case to validate that the circuit breaker is working as we want it to work.

I’ll reuse the TestServiceInstanceListSupplier from my other article. If you haven’t read it, check it out.

Let’s start simple:

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));
    }

}

This starts a Spring Boot test on a random port, turns off Eureka and provides the service instance list manually. It also starts a WireMock server on port 8082, and the test method sets up some initial data and stubs the API that our Feign client will invoke.

As you can see, the mock server will respond with a serverError() which is an HTTP 500.

Let’s write the code to invoke the API:

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);
    }

}

Since the Feign client doesn’t have a fallback associated with it, Spring throws a NoFallbackAvailableException, which contains the regular FeignException as a cause. From that we can check the status code of the API call, which should be an HTTP 500. Note that the cause is double wrapped, so it has to be unpacked twice.
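If counting getCause() calls feels brittle, one option is a tiny helper that walks the cause chain and picks the first cause of a given type. This is just a sketch in plain Java. The Causes class and the findCause method are hypothetical names, not part of Spring, Feign or AssertJ.

```java
import java.util.Optional;

class Causes {
    /** Walks the cause chain (including the throwable itself) and returns the first match. */
    static <T extends Throwable> Optional<T> findCause(Throwable throwable, Class<T> type) {
        Throwable current = throwable;
        int depth = 0;
        // The depth cap guards against a pathological, self-referencing cause chain.
        while (current != null && depth++ < 32) {
            if (type.isInstance(current)) {
                return Optional.of(type.cast(current));
            }
            current = current.getCause();
        }
        return Optional.empty();
    }
}
```

In the test, that would look something like Causes.findCause(noFallbackAvailableException, FeignException.class), which keeps working whether the exception is wrapped once or twice.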

Now, out of 5 requests, we already have one failure, meaning that even if the 4 remaining API calls succeed, the circuit breaker will still open. Let’s do exactly that.

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i<4; i++) {
            userSessionClient.validateSession(uuid);
        }
    }

}

The mock server is reconfigured to respond with HTTP 200 OK and to return the proper JSON body, then the API is invoked 4 times.

We’ve reached 5 API calls, one of which failed, so the circuit is open now and the next invocation will be rejected.

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i<4; i++) {
            userSessionClient.validateSession(uuid);
        }

        NoFallbackAvailableException anotherNoFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        CallNotPermittedException callNotPermittedException = (CallNotPermittedException) anotherNoFallbackAvailableException.getCause();
        assertThat(callNotPermittedException).isNotNull();
    }

}

In this case, we’ll also get a NoFallbackAvailableException from Spring, although this time it will not contain a FeignException as a cause but a CallNotPermittedException. Note that in this case the cause is only wrapped once, so a single getCause() method call is enough to unwrap it.

We have the circuit open, let’s wait for the configured time, 5 seconds.

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i<4; i++) {
            userSessionClient.validateSession(uuid);
        }

        NoFallbackAvailableException anotherNoFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        CallNotPermittedException callNotPermittedException = (CallNotPermittedException) anotherNoFallbackAvailableException.getCause();
        assertThat(callNotPermittedException).isNotNull();

        Thread.sleep(5100);
    }

}

I intentionally wait for 5.1 seconds to be on the safe side. I don’t want random test failures due to some timing issues.

Then let’s call the API 6 more times to verify that the circuit breaker doesn’t open again once the 5-call sampling window is full.

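import static com.github.tomakehurst.wiremock.client.WireMock.equalTo;
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.okJson;
import static com.github.tomakehurst.wiremock.client.WireMock.serverError;
import static com.github.tomakehurst.wiremock.client.WireMock.urlPathEqualTo;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.catchThrowableOfType;

import java.util.UUID;

import com.github.tomakehurst.wiremock.core.WireMockConfiguration;
import com.github.tomakehurst.wiremock.junit5.WireMockExtension;
import feign.FeignException;
import io.github.resilience4j.circuitbreaker.CallNotPermittedException;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.context.TestConfiguration;
import org.springframework.cloud.client.circuitbreaker.NoFallbackAvailableException;
import org.springframework.cloud.loadbalancer.core.ServiceInstanceListSupplier;
import org.springframework.context.annotation.Bean;

// UserSessionClient and TestServiceInstanceListSupplier are assumed to live in the same package.
@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})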
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i<4; i++) {
            userSessionClient.validateSession(uuid);
        }

        NoFallbackAvailableException anotherNoFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        CallNotPermittedException callNotPermittedException = (CallNotPermittedException) anotherNoFallbackAvailableException.getCause();
        assertThat(callNotPermittedException).isNotNull();

        Thread.sleep(5100);

        for (int i = 0; i<6; i++) {
            userSessionClient.validateSession(uuid);
        }
    }

}

Simply call the Feign client 6 times; if all the calls succeed, the circuit breaker is working properly.

Easy and effective, right? You don’t want to test these things manually because they’re complex and there’s a lot of room for human error during testing.

There’s a very similar test case on my GitHub. If you liked the article, make sure to share it with your fellow developers, and check me out on Facebook and Twitter.

Also, if you wanna know more about Feign, Spring Cloud OpenFeign, Eureka integration and Resilience4J integration, check out my brand new course: Mastering microservice communication with Spring Cloud Feign.
