Now that we’ve checked how to test your Eureka-integrated Spring Cloud Feign clients, I want to turn to another really interesting topic: testing resiliency.
Some people say HTTP is bad for microservice communication because:
- It’s a synchronous protocol
- It’s prone to errors
- There’s no retry built in
- Tight coupling
- etc.
You can find all kinds of fancy arguments for why messaging is a much better communication channel between your services.
I personally disagree. I think we should use the right tool for the job, and sometimes you just don’t need the properties of a messaging system because you’re fine with HTTP. And while it’s really prone to errors, we can make it reliable if we want to.
In this article, I’ll show you how to test the reliability of your Feign clients using Spring Cloud OpenFeign and Resilience4J.
If you don’t have any experience with Resilience4J, don’t worry, I’ll explain everything you need.
Circuit breaking in a nutshell
Let me start with a quote from Martin Fowler:
It’s common for software systems to make remote calls to software running in different processes, probably on different machines across a network. One of the big differences between in-memory calls and remote calls is that remote calls can fail, or hang without a response until some timeout limit is reached. What’s worse if you have many callers on a unresponsive supplier, then you can run out of critical resources leading to cascading failures across multiple systems.
Martin Fowler – https://martinfowler.com/bliki/CircuitBreaker.html
Now, I’ll try to phrase it my way, focusing only on HTTP. When your microservices talk to each other, a lot can go wrong with synchronous, request/response based HTTP communication.
The network can be down, the other system can be down, the other system can have trouble responding because it’s lacking resources, and so on. For that intermittent time window, while the other side recovers, you don’t want to keep getting failures. You want to detect that there’s an issue, stop sending traffic to that particular service for some time and let it recover.
After you’ve waited some time, the next time you need to talk to that service, you’ll try to send your request again. If it has recovered, great, everything can get back to normal. If it’s still failing, let’s stop sending traffic again.
It’s called circuit breaking because the pattern mimics an electric circuit that is either open or closed. When the circuit is open, nothing goes through and we don’t even try to send traffic to the service. When it’s closed, the requests can go through.
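To make the open/closed idea a bit more tangible, here’s a rough sketch in plain Java. It is not how Resilience4J implements the pattern. The class name `TinyCircuitBreaker`, the consecutive-failure threshold and the injectable clock are all made up for illustration, and the class is not thread-safe.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical, deliberately tiny circuit breaker; NOT Resilience4J's implementation. */
final class TinyCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures before the circuit opens (assumption)
    private final long openMillis;      // how long the circuit stays open
    private final LongSupplier clock;   // injectable time source, handy for tests

    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    TinyCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    State state() {
        return state;
    }

    <T> T call(Supplier<T> remoteCall) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < openMillis) {
                // Still waiting for the other side to recover: don't even try.
                throw new IllegalStateException("circuit is OPEN, call rejected");
            }
            // The wait is over, let a trial call through.
            state = State.HALF_OPEN;
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            // A failed trial call, or too many failures in a row, opens the circuit.
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            throw e;
        }
    }
}
```

The real thing is more sophisticated (sliding windows, failure rates, several trial calls in the half-open state), but the three states and the transitions between them are the same idea.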
Setting up circuit breaking for your clients with Resilience4J
Unfortunately, there’s only a tiny bit of documentation available on configuring the Resilience4J circuit breaker for Spring Cloud Feign clients.
Let’s look at a simple Feign client that’s supposed to validate a session:
@FeignClient(name = "user-session-service")
public interface UserSessionClient {
    @GetMapping("/user-sessions/validate")
    UserSessionValidatorResponse validateSession(@RequestParam UUID sessionId);
}
What happens if the user-session-service is continuously failing? For example, it’s responding with an HTTP 500 – Internal Server Error over and over again. In this case you’ll keep getting the errors in your app, even though you know it’ll take some time for the user-session-service to recover. What’s the point of retrying? There is none.
Resilience4J circuit breakers help with exactly that. First, let’s add the right dependency, org.springframework.cloud:spring-cloud-starter-circuitbreaker-resilience4j, to the pom.xml or build.gradle:
// omitted
ext {
    set('springCloudVersion', "2020.0.4")
}

dependencies {
    implementation 'org.springframework.cloud:spring-cloud-starter-openfeign'
    implementation 'org.springframework.cloud:spring-cloud-starter-circuitbreaker-resilience4j'
    // omitted for simplicity
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
    }
}
// omitted
Then, let’s go to the application.properties of the app and enable Feign circuit breaking:
feign.circuitbreaker.enabled=true
This will run with the default settings, which we don’t want, so let’s customize the circuit breaker for our case.
Create a new @Configuration class, where we’ll use a Customizer:
@Configuration
public class FeignConfiguration {

    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        // the circuit breaker configuration will go here
        return factory -> { };
    }
}
And then set up the CircuitBreakerConfig with the necessary parameters and set it for the proper Feign client call:
@Configuration
public class FeignConfiguration {

    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        CircuitBreakerConfig cbConfig = CircuitBreakerConfig.custom()
                .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(5)
                .failureRateThreshold(20.0f)
                .waitDurationInOpenState(Duration.ofSeconds(5))
                .permittedNumberOfCallsInHalfOpenState(5)
                .build();
        return resilience4JCircuitBreakerFactory -> resilience4JCircuitBreakerFactory.configure(
                builder -> builder.circuitBreakerConfig(cbConfig),
                "UserSessionClient#validateSession(UUID)");
    }
}
The configuration parameters are available here, but let me cover this one quickly. We’ll sample the last 5 requests. If 20% or more of those 5 requests fail, the circuit will open and no requests will be allowed to go through. Then, we’ll wait 5 seconds for the other service to recover. After that time has passed, the circuit breaker will allow 5 more requests to go through to check whether the service has recovered, applying the same 20% or more failure rule.
Practical example: if I call the method on the Feign client 5 times and 1 or more of those requests fail (1 out of 5 is exactly 20%), the circuit will open. Then, 5 seconds later, the circuit lets traffic through once again (it’s half-open rather than closed, to be perfectly accurate).
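If you want to play with the arithmetic behind the 20% rule, here’s a small plain Java sketch of a count-based window. It only mimics the idea. `CountBasedWindow` and its methods are hypothetical names, and the rule that no rate is computed before the window is full is a simplifying assumption of this sketch.

```java
/** Hypothetical helper that mimics a count-based sliding window; NOT Resilience4J code. */
final class CountBasedWindow {
    private final boolean[] failed; // ring buffer holding the outcome of the last N calls
    private int next;               // index the next outcome will be written to
    private int recorded;           // number of outcomes recorded so far, capped at N

    CountBasedWindow(int size) {
        this.failed = new boolean[size];
    }

    void record(boolean callFailed) {
        failed[next] = callFailed;          // overwrites the oldest outcome once the window is full
        next = (next + 1) % failed.length;
        if (recorded < failed.length) {
            recorded++;
        }
    }

    /** Failure rate in percent, or -1 while the window is not full yet (assumption of this sketch). */
    float failureRate() {
        if (recorded < failed.length) {
            return -1f;
        }
        int failures = 0;
        for (boolean f : failed) {
            if (f) {
                failures++;
            }
        }
        return 100f * failures / failed.length;
    }

    /** The circuit should open when the rate is equal to or greater than the threshold. */
    boolean shouldOpen(float thresholdPercent) {
        float rate = failureRate();
        return rate >= 0f && rate >= thresholdPercent;
    }
}
```

With a window of 5 and one recorded failure, `failureRate()` is 100 * 1 / 5 = 20, so `shouldOpen(20.0f)` is true. Once that single failure falls out of the window, the rate drops back to 0.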
Another trick here is the circuit breaker naming. You can provide the configuration through String circuit breaker names. In the example above, it’s UserSessionClient#validateSession(UUID).
By default, the circuit breakers are named after the Feign client interface name and the respective method signature. It’s generated with the Feign#configKey method.
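To get a feel for how such a name is put together, here’s an approximation in plain Java using reflection. It is my own imitation of the naming scheme, not Feign’s code. `ConfigKeySketch`, `keyOf` and the annotation-free stand-in interface are hypothetical, and generic parameter types are ignored.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.UUID;
import java.util.stream.Collectors;

/** Hypothetical imitation of the naming scheme; NOT Feign's actual implementation. */
final class ConfigKeySketch {

    /** Stand-in for the article's Feign client, without annotations and with a String return type. */
    interface UserSessionClient {
        String validateSession(UUID sessionId);
    }

    /** Builds "<InterfaceSimpleName>#<method>(<ParamSimpleName>,...)". */
    static String configKey(Class<?> targetType, Method method) {
        String params = Arrays.stream(method.getParameterTypes())
                .map(Class::getSimpleName)
                .collect(Collectors.joining(","));
        return targetType.getSimpleName() + "#" + method.getName() + "(" + params + ")";
    }

    /** Convenience wrapper that looks up the method and hides the checked exception. */
    static String keyOf(Class<?> targetType, String methodName, Class<?>... parameterTypes) {
        try {
            return configKey(targetType, targetType.getMethod(methodName, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }
}
```

Calling `ConfigKeySketch.keyOf(ConfigKeySketch.UserSessionClient.class, "validateSession", UUID.class)` produces the same string that is passed to the `configure` call above.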
Due to a bug in the currently released Spring Cloud OpenFeign version, the naming doesn’t work properly. Instead of the name above, it generates HardCodedTarget#validateSession(UUID). It’s already fixed by this commit though.
Anyway, until that’s released, we can provide a custom name resolver that does exactly that.
@Configuration
public class FeignConfiguration {

    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        CircuitBreakerConfig cbConfig = CircuitBreakerConfig.custom()
                .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(5)
                .failureRateThreshold(20.0f)
                .waitDurationInOpenState(Duration.ofSeconds(5))
                .permittedNumberOfCallsInHalfOpenState(5)
                .build();
        return resilience4JCircuitBreakerFactory -> resilience4JCircuitBreakerFactory.configure(
                builder -> builder.circuitBreakerConfig(cbConfig),
                "UserSessionClient#validateSession(UUID)");
    }

    @Bean
    public CircuitBreakerNameResolver circuitBreakerNameResolver() {
        return (feignClientName, target, method) -> Feign.configKey(target.type(), method);
    }
}
After this, the proper name will be generated.
Integration testing time
Now, if you start up your app and did everything right, after trying to invoke the API 5 times and having at least 1 failure, the following message will look back at you:
No fallback available.
org.springframework.cloud.client.circuitbreaker.NoFallbackAvailableException: No fallback available.
    at app//org.springframework.cloud.client.circuitbreaker.CircuitBreaker.lambda$run$0(CircuitBreaker.java:31)
    at app//io.vavr.control.Try.lambda$recover$6ea7267f$1(Try.java:949)
    at app//io.vavr.control.Try.of(Try.java:75)
    at app//io.vavr.control.Try.recover(Try.java:949)
    at app//org.springframework.cloud.circuitbreaker.resilience4j.Resilience4JCircuitBreaker.run(Resilience4JCircuitBreaker.java:123)
    ...
    at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Caused by: io.github.resilience4j.circuitbreaker.CallNotPermittedException: CircuitBreaker 'UserSessionClient#validateSession(UUID)' is OPEN and does not permit further calls
    at app//io.github.resilience4j.circuitbreaker.CallNotPermittedException.createCallNotPermittedException(CallNotPermittedException.java:48)
    at app//io.github.resilience4j.circuitbreaker.internal.CircuitBreakerStateMachine$OpenState.acquirePermission(CircuitBreakerStateMachine.java:689)
    at app//io.github.resilience4j.circuitbreaker.internal.CircuitBreakerStateMachine.acquirePermission(CircuitBreakerStateMachine.java:206)
    at app//io.github.resilience4j.circuitbreaker.CircuitBreaker.lambda$decorateCallable$3(CircuitBreaker.java:168)
    at app//io.vavr.control.Try.of(Try.java:75)
    ... 89 more
Awesome. Now we’re talking.
Let’s write an integration test for this use-case to validate that the circuit breaker is working as we want it to work.
I’ll reuse the TestServiceInstanceListSupplier from my other article. If you haven’t read it, check it out.
Let’s start simple:
@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {

    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";
        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));
    }
}
This starts up a Spring Boot test on a random port, turns off Eureka and provides the service list manually. It then starts up a WireMock server on port 8082 and defines a simple test method with some initial data and a stub for the API that our Feign client will invoke.
As you can see, the mock server will respond with a serverError(), which is an HTTP 500.
Let’s write the code to invoke the API:
@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {

    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";
        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);
    }
}
Since the Feign client doesn’t have any fallbacks associated with it, Spring will throw a NoFallbackAvailableException instead, which contains the regular FeignException as a cause, from which we can check the status code of the API call. It should be an HTTP 500. Note that the cause is double wrapped, so it has to be unpacked twice.
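If counting getCause() calls feels brittle, one option is a tiny helper that walks the cause chain and picks the first exception of the type you’re after. This is just a sketch in plain Java. `Causes` and `findCause` are names I made up, and the depth limit of 32 is an arbitrary guard against cyclic cause chains.

```java
import java.util.Optional;

/** Hypothetical test helper for digging an exception of a given type out of a cause chain. */
final class Causes {
    private static final int MAX_DEPTH = 32; // arbitrary guard against cyclic cause chains

    private Causes() {
    }

    /** Returns the first throwable of the given type, starting with the throwable itself. */
    static <T extends Throwable> Optional<T> findCause(Throwable start, Class<T> type) {
        Throwable current = start;
        for (int depth = 0; current != null && depth < MAX_DEPTH; depth++) {
            if (type.isInstance(current)) {
                return Optional.of(type.cast(current));
            }
            current = current.getCause();
        }
        return Optional.empty();
    }
}
```

In the test above, something like `Causes.findCause(noFallbackAvailableException, FeignException.class)` could then stand in for the two getCause() calls, and the test would keep working if the wrapping depth ever changes.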
Now, out of 5 requests, we already have one failure, meaning that even if the 4 remaining API calls succeed, the circuit breaker will still open. Let’s do exactly that.
@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {

    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";
        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i < 4; i++) {
            userSessionClient.validateSession(uuid);
        }
    }
}
The mock server is reconfigured to respond with HTTP 200 OK and to return the proper JSON body, then the API is invoked 4 times.
We’ve reached 5 API calls and 1 of them failed, so the circuit opens and the next invocation will be rejected.
@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {

    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";
        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i < 4; i++) {
            userSessionClient.validateSession(uuid);
        }

        NoFallbackAvailableException anotherNoFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        CallNotPermittedException callNotPermittedException =
                (CallNotPermittedException) anotherNoFallbackAvailableException.getCause();
        assertThat(callNotPermittedException).isNotNull();
    }
}
In this case, we’ll also get a NoFallbackAvailableException from Spring, although this time it will not contain a FeignException as a cause but a CallNotPermittedException. Note that in this case the cause is only wrapped once, so a single getCause() method call is enough to unwrap it.
We have the circuit open, let’s wait for the configured time, 5 seconds.
@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {

    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";
        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i < 4; i++) {
            userSessionClient.validateSession(uuid);
        }

        NoFallbackAvailableException anotherNoFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        CallNotPermittedException callNotPermittedException =
                (CallNotPermittedException) anotherNoFallbackAvailableException.getCause();
        assertThat(callNotPermittedException).isNotNull();

        Thread.sleep(5100);
    }
}
I intentionally wait for 5.1 seconds to be on the safe side. I don’t want random test failures due to some timing issues.
Then let’s call the API 6 more times to verify that the circuit breaker doesn’t open again once the 5-call sampling window has passed.
@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {

    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testErrorBasedCircuitBreaking() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";
        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(serverError()));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        FeignException feignException = (FeignException) noFallbackAvailableException.getCause().getCause();
        assertThat(feignException.status()).isEqualTo(500);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(okJson(responseBody)));

        for (int i = 0; i < 4; i++) {
            userSessionClient.validateSession(uuid);
        }

        NoFallbackAvailableException anotherNoFallbackAvailableException = catchThrowableOfType(
                () -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        CallNotPermittedException callNotPermittedException =
                (CallNotPermittedException) anotherNoFallbackAvailableException.getCause();
        assertThat(callNotPermittedException).isNotNull();

        Thread.sleep(5100);

        for (int i = 0; i < 6; i++) {
            userSessionClient.validateSession(uuid);
        }
    }
}
Simply call the Feign client 6 times, and if all the calls succeed, the circuit breaker is working properly.
Easy and effective, right? You don’t want to test these things manually because they’re complex and there’s a lot of room for human error during testing.
There’s a very similar test case on my GitHub. If you liked the article, make sure to share it with your fellow developers, and check me out on Facebook and Twitter.
Also, if you wanna know more about Feign, Spring Cloud OpenFeign, Eureka integration and Resilience4J integration, check out my brand new course: Mastering microservice communication with Spring Cloud Feign.
Hi Arnold. I tried to reach your courses through Udemy business, but it seems they are not available. Is it intended, or do you need to do some action for it? Thanks!
Hi. Unfortunately Udemy Business comes with an exclusivity on the course which was not acceptable for me so I opted out, that’s why you can’t find it.
Send me a mail to info@138.3.242.171 and I’ll get you a huge discounted coupon. 😉
Hey Arnold,
Thanks for this article. This has been extremely helpful.
I am trying to move the Resilience4j circuit breaker configuration into property files by using:
resilience4j.circuitbreaker.instances.UserSessionClient#validateSession(UUID).slow-call-duration-threshold=2000
resilience4j.circuitbreaker.instances.UserSessionClient#validateSession(UUID).slow-call-rate-threshold=80
resilience4j.circuitbreaker.instances.UserSessionClient#validateSession(UUID).sliding-window-type=count_based
resilience4j.circuitbreaker.instances.UserSessionClient#validateSession(UUID).sliding-window-size=5
resilience4j.circuitbreaker.instances.UserSessionClient#validateSession(UUID).failure-rate-threshold=20
resilience4j.circuitbreaker.instances.UserSessionClient#validateSession(UUID).permitted-number-of-calls-in-half-open-state=5
resilience4j.circuitbreaker.instances.UserSessionClient#validateSession(UUID).wait-duration-in-open-state=5000
This, however, does not work. The circuit breaker seems to ignore these properties. What am I doing wrong?
Hi Mushtaq,
Thanks. Appreciated.
If I remember correctly, the reason this won’t work is that # has a special meaning in a properties file: it comments out the content after the #. Since the Feign.configKey method generates the names like that, it won’t work.
You have 2 options. Either you stay with the programmatic configuration, or you supply a custom CircuitBreakerNameResolver that replaces the # with something else, for example with a -.
Hope that helps, let me know.
Hey Arnold,
Thanks for your articles.
Can you help me?
How do I write tests for Feign calls that are marked with the Resilience4j @Retry annotation?
Something like this…
@Async
@Retry(name = "testInst", fallbackMethod = "testFallback")
public void testApi(TestRequest testRequest) {
    testFeign.testApi(testRequest);
}
Hi Aliaksei,
I don’t have a guide on that so far, but let me put it on my list. 😉
Arnold
I took your class on the Udemy website.
I have a question and need your help.
I defined a Feign client that uses this annotation on the method:
@CircuitBreaker(name = "test", fallbackMethod = "testFallbackMethod ")
I want to know how to test this.
Thanks a lot.
Hi Owen,
Well, I don’t have an article on exactly that kind of testing, but the idea is simple. You create an environment where you know your Feign client will fail, and then you test the behavior of the fallback.
For example, if the Feign client calls a server that responds with an HTTP 500, which triggers the fallback, you set up a mock server that responds with the HTTP 500 and verify that the fallback behavior occurred.
Hope that helps.
Arnold
It doesn’t work, the code never runs the fallbackMethod.
Hi Owen,
It’s hard to help with the amount of information you’ve given. If you have an example project on GH, I can certainly check it quickly, and/or you can post a question on Stack Overflow.
Best,
Arnold