Spring Cloud Feign traffic cut-off with Resilience4J TimeLimiter

Most probably you’ve at least heard about circuit breaking if you’ve dealt with microservices.

HTTP is still a common protocol to choose when setting up the communication channel between your microservices and while it’s awesome, there are definitely some downsides.

For example since it’s a synchronous request/response based protocol, after sending a request to the other end, you gotta wait for the reply to come back. And sometimes that could take an awful amount of time.

Let’s take an example architecture:

The online-store-service exposes APIs to the public world like creating and buying products. For every request that comes in, the user’s session has to be validated. The validation happens using the user-session-service. It has a single API that will respond whether the given session is valid or not.

What if the user-session-service gets overloaded with tons of requests and unable to cope with the load in a reasonable time?

That slowness will cascade to the exposed APIs provided by the online-store-service. And what if there’s an SLA for those public APIs, like responding within 1 second, let it be a positive response or an error one.

We’ll check how to make sure that a Feign client call gets cut off after a specified amount of time.

The Feign client

Let’s create a simple Feign client for the use-case above:

@FeignClient(name = "user-session-service")
public interface UserSessionClient {
    @GetMapping("/user-sessions/validate")
    UserSessionValidatorResponse validateSession(@RequestParam UUID sessionId);
}

Nothing special, one API with one query parameter, sessionId.

Let’s say, the online-store-service exposes a POST /products API that triggers the session validation. After the session is validated and decided to be valid, it’s going to forward the request to the inventory-service to create the actual product.

If the Feign API call for session validation takes X seconds. And the product creation with the inventory-service takes Y seconds. The overall response time will be X + Y + Z seconds where Z stands for the extra time taken by the online-store-service.

A very quick example on this. If the session validation takes 8 seconds and the product creation in the inventory-service takes 50 ms and the extra added time by the online-store-service is 10 ms, then the overall response time will be 8060 ms, a tiny bit more than 8 seconds.

And whatever happens with the session validation in terms of response time, it’s going to cascade into the online-store-service‘s response time even though the online-store-service has an SLA of let’s say 5 seconds. Not good.

To ensure we’re meeting the specified response time – and don’t kill me because I said 5 seconds, I know it’s a lot but honestly a lot of APIs can’t even meet that – we’ll use Resilience4J for the job.

Configuring a Resilience4J TimeLimiter

We’ll use a TimeLimiter from the Resilience4J portfolio.

Let’s add the Resilience4J package to the build.gradle/pom.xml:

// omitted

ext {
	set('springCloudVersion', "2020.0.4")
}

dependencies {
	implementation 'org.springframework.cloud:spring-cloud-starter-openfeign'
	implementation 'org.springframework.cloud:spring-cloud-starter-circuitbreaker-resilience4j'
        // omitted for simplicity
}

dependencyManagement {
	imports {
		mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
	}
}

// omitted

Then, let’s go to the application.properties of the app and enable Feign circuit breaking (although we’ll not use circuit breaking here):

feign.circuitbreaker.enabled=true

When you enable the circuit breaker with Resilience4J there will be a default TimeLimiter configured which I’ll explain in a second, but before doing that, let’s talk a second about the available parameters for a TimeLimiter.

There’s the most critical parameter, timeoutDuration. It’s configuring how much time you want to give to your client invocations. If it’s set to 1 second, then that’s gonna be the available time window for execution. After 1 second, it’ll cut the traffic off.

The other one is the cancelRunningFuture parameter. It controls whether upon a timeout, the timed out execution shall be cancelled or not.

Back to the default TimeLimiter. It’s configured with a 1 second timeoutDuration and true for cancelRunningFuture.

Let’s create a new @Configuration class and we’ll use a Customizer.

@Configuration
public class FeignConfiguration implements FeignFormatterRegistrar {
    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        
    }
}

Then let’s go ahead and get a custom TimeLimiter config.

@Configuration
public class FeignConfiguration {
    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        TimeLimiterConfig timeLimiterConfig = TimeLimiterConfig.custom().timeoutDuration(Duration.ofSeconds(5)).build();
        return resilience4JCircuitBreakerFactory -> resilience4JCircuitBreakerFactory.configure(builder ->
                builder.timeLimiterConfig(timeLimiterConfig), "UserSessionClient#validateSession(UUID)");
    }
}

This config will cut off the traffic after 5 seconds.

Also, let’s talk a second about the configuration for the specific Feign client using the textual name of the client, UserSessionClient#validateSession(UUID).

By default, they are named after the Feign client interface name and the respective method signature. It’s generated with the Feign#configKey method.

Due to a bug in the currently released Spring Cloud OpenFeign version the naming doesn’t work properly. Instead of the name above, it generates HardCodedTarget#validateSession(UUID). It’s already fixed by this commit though.

Anyway, while that’s released, we can provide a custom version of name generator that’s doing exactly that.

@Configuration
public class FeignConfiguration {
    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> circuitBreakerFactoryCustomizer() {
        TimeLimiterConfig timeLimiterConfig = TimeLimiterConfig.custom().timeoutDuration(Duration.ofSeconds(5)).build();
        return resilience4JCircuitBreakerFactory -> resilience4JCircuitBreakerFactory.configure(builder ->
                builder.timeLimiterConfig(timeLimiterConfig), "UserSessionClient#validateSession(UUID)");
    }

    @Bean
    public CircuitBreakerNameResolver circuitBreakerNameResolver() {
        return (feignClientName, target, method) -> Feign.configKey(target.type(), method);
    }
}

Testing the Resilicence4J TimeLimiter with the Feign client

Now, if we invoke the Feign client and the execution time exceeds the 5 seconds, we’ll get a similar exception message.

No fallback available.
org.springframework.cloud.client.circuitbreaker.NoFallbackAvailableException: No fallback available.
	at app//org.springframework.cloud.client.circuitbreaker.CircuitBreaker.lambda$run$0(CircuitBreaker.java:31)
	at app//io.vavr.control.Try.lambda$recover$6ea7267f$1(Try.java:949)
	at app//io.vavr.control.Try.of(Try.java:75)
	at app//io.vavr.control.Try.recover(Try.java:949)
	...
	at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Caused by: java.util.concurrent.TimeoutException: TimeLimiter 'UserSessionClient#validateSession(UUID)' recorded a timeout exception.
	at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:204)
	at io.github.resilience4j.timelimiter.internal.TimeLimiterImpl.lambda$decorateFutureSupplier$0(TimeLimiterImpl.java:46)
	at io.github.resilience4j.circuitbreaker.CircuitBreaker.lambda$decorateCallable$3(CircuitBreaker.java:171)
	at io.vavr.control.Try.of(Try.java:75)
	... 89 more

Good, now let’s create an integration test for this. We want this automated. We don’t want to test every timing scenario manually, it’s just unreliable.

I’ll reuse the TestServiceInstaceListSupplier from my other article. If you haven’t read it, check it out.

Let’s start with a simple test:

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testTimeLimiterWorks() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(aResponse().withBody(responseBody).withHeader(CONTENT_TYPE, APPLICATION_JSON_VALUE).withFixedDelay(7000)));
    }

}

First let’s create a Spring Boot test, starting up on random port and disabling Eureka. Also, start up a WireMock server on port 8082.

Then in the test case, let’s do the stubbing. The important part is to specify the response to be returned with a fixed delay of 7 seconds, using the withFixedDelay method.

After that, let’s call the Feign client.

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testTimeLimiterWorks() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(aResponse().withBody(responseBody).withHeader(CONTENT_TYPE, APPLICATION_JSON_VALUE).withFixedDelay(7000)));

        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        TimeoutException timeoutException = (TimeoutException) noFallbackAvailableException.getCause();
        assertThat(timeoutException).isNotNull();

    }

}

The Feign client is called and the TimeoutException is unwrapped and verified. Awesome. At least we know the call failed with a timeout.

This is almost complete but we can do better. If the configuration is set to 5 seconds for timeout, then why don’t we validate how much time the execution took, approximately.

I say approximately since there’s a lot more to a Feign client invocation than just sending the HTTP request and waiting for the response so there’s some additional overhead time, especially for the first Feign call (there is some lazy initialization to it).

Let’s wrap the call around with a StopWatch and let’s validate the time the execution took.

@SpringBootTest({"server.port:0", "eureka.client.enabled:false"})
public class CircuitBreakerTest {
    @TestConfiguration
    public static class TestConfig {
        @Bean
        public ServiceInstanceListSupplier serviceInstanceListSupplier() {
            return new TestServiceInstanceListSupplier("user-session-service", 8082);
        }
    }

    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8082))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testTimeLimiterWorks() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(aResponse().withBody(responseBody).withHeader(CONTENT_TYPE, APPLICATION_JSON_VALUE).withFixedDelay(7000)));

        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        NoFallbackAvailableException noFallbackAvailableException = catchThrowableOfType(() -> userSessionClient.validateSession(uuid), NoFallbackAvailableException.class);
        stopWatch.stop();
        TimeoutException timeoutException = (TimeoutException) noFallbackAvailableException.getCause();
        assertThat(timeoutException).isNotNull();
        assertThat(stopWatch.getTotalTimeMillis()).isLessThan(5500);
    }

}

I intentionally added a 500 ms extra to the verification since I don’t want random test failures. On my computer, it takes around 5100-5150 ms for the execution and just to be on the safe side, I go with 5500 ms. You can be more strict though.

That’s it. Not complicated at all however extremely useful.

On my GitHub, you can find a similar test case which I created for my Mastering microservice communication with Spring Cloud Feign course. Check it out if you’re interested in building a wider knowledge on Feign.

If you liked the article, don’t forget to share it with your colleagues, friends and make sure to follow me on Facebook and Twitter.

One Reply to “Spring Cloud Feign traffic cut-off with Resilience4J TimeLimiter”

Mandar says:

3 years ago

Is it possible to integrate Feign Retry with TimeLimiter ?
I spent quite a bit of time identifying if there is any way to provide retry capability in case of a temporary timeout exception from the target microservice. In such scenario, if Feign client can retry ‘n’ times prior to ultimately propagating the exception up the call stack.