Feign client Rate Limiting using Resilience4J

Today I’m gonna talk about what a Resilience4J Rate Limiter is and how you can combine it with Feign.

Let’s talk about a use case where a rate limiter comes in tremendously handy. Have you used AWS APIs? Have you used managed service APIs? Then you’ve probably seen service limits in their documentation saying “This API can be called 5 times in a second”.

If you exceed the limit, the request could get rejected, and if you keep doing this, you might even get banned from using the service.

This is where Resilience4J’s Rate Limiter can help. You can control how much throughput your application is allowed to send to a service. The Rate Limiter in Resilience4J is a generic solution and can be used for different problems. For this article I’ll use it with a Feign client to simulate calling an external API and controlling the throughput for that particular API.

Resilience4J Rate Limiter

There are 3 attributes you gotta be aware of when using a Resilience4J RateLimiter:

limitForPeriod
limitRefreshPeriod
timeoutDuration

I’ll explain all of them in a sec, but first let me show you a pic of how it works:

There are so-called cycles, which are essentially periods of time, say 1 second. Every cycle has a number of free slots to be taken (permissions); let’s say there’s only 1 free slot in each cycle.

Now, looking at the pic. Thread#1 comes in and acquires the 1 free slot in the second cycle. In the next cycle, there’s a free slot again – we’re at cycle 3. In cycle 4, Thread#1 again takes the single free slot so there are no more slots available in that cycle.

But then, Thread#2 comes in the same cycle and tries to take a free slot too but it won’t be allowed since there are no available slots in that cycle. So the Rate Limiter simply parks that thread until there’s a free slot available again – i.e. until the next cycle.

And this goes on and on and on. That’s how it controls the throughput. Now, the configuration parameters.

The limitRefreshPeriod configures the cycle period, i.e. how long a cycle is. The limitForPeriod configures how many free slots (or, in Resilience4J terminology, permissions) there are in a single cycle.

And the last parameter is timeoutDuration. It controls how long we are willing to wait for a free slot/permission. So imagine you want to limit an API to 10 requests/sec and there’s an 11th request to be sent. The timeoutDuration parameter controls how long that 11th request is allowed to wait for a free slot. If it can’t get one within that time, the call is rejected with a RequestNotPermitted exception.
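To make the three parameters a bit more concrete, here’s a rough plain-Java sketch of the cycle idea. It’s a simplified model I’m assuming for illustration, not Resilience4J’s actual implementation: the class name CycleRateLimiterSketch and its reservePermission(nowNanos) method are made up, and the current time is passed in by hand so the result doesn’t depend on the real clock.

```java
import java.util.concurrent.TimeUnit;

/**
 * Simplified model of a cycle-based rate limiter.
 * Assumption: this is an illustration only, NOT Resilience4J's implementation.
 */
final class CycleRateLimiterSketch {
    private final int limitForPeriod;      // permissions available in each cycle
    private final long limitRefreshNanos;  // length of one cycle
    private final long timeoutNanos;       // longest acceptable wait for a permission

    private long reservedCycle = 0; // cycle the most recent permission was taken from
    private int usedInCycle = 0;    // permissions already taken from that cycle

    CycleRateLimiterSketch(int limitForPeriod, long limitRefreshNanos, long timeoutNanos) {
        if (limitForPeriod < 1 || limitRefreshNanos < 1 || timeoutNanos < 0) {
            throw new IllegalArgumentException("invalid rate limiter settings");
        }
        this.limitForPeriod = limitForPeriod;
        this.limitRefreshNanos = limitRefreshNanos;
        this.timeoutNanos = timeoutNanos;
    }

    /** Returns how many nanos the caller has to wait, or -1 if the request is rejected. */
    synchronized long reservePermission(long nowNanos) {
        long nowCycle = nowNanos / limitRefreshNanos;
        long targetCycle = reservedCycle;
        int used = usedInCycle;
        if (nowCycle > targetCycle) { // a newer cycle has started: its slots are all free
            targetCycle = nowCycle;
            used = 0;
        }
        if (used >= limitForPeriod) { // that cycle is full: try the one after it
            targetCycle++;
            used = 0;
        }
        // Time until the target cycle begins (0 if we are already in it).
        long wait = Math.max(0, targetCycle * limitRefreshNanos - nowNanos);
        if (wait > timeoutNanos) {
            return -1; // waiting would exceed timeoutDuration, so reject
        }
        reservedCycle = targetCycle;
        usedInCycle = used + 1;
        return wait;
    }

    public static void main(String[] args) {
        long second = TimeUnit.SECONDS.toNanos(1);
        // limitForPeriod=1, limitRefreshPeriod=1s, timeoutDuration=1s
        CycleRateLimiterSketch limiter = new CycleRateLimiterSketch(1, second, second);
        for (int i = 1; i <= 3; i++) {
            // All three requests arrive at the same instant (time 0).
            System.out.println("request " + i + " -> " + limiter.reservePermission(0));
        }
    }
}
```

In this model, three requests arriving at the same instant get three different answers: the first goes through right away, the second has to wait one full cycle, and the third is rejected because its wait would be longer than the timeout.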

Example project

I’m gonna use a very simple example project to demonstrate how a RateLimiter works and how you can use it with Feign.

Let’s generate a project on start.spring.io with OpenFeign, Actuator, Lombok. I used Gradle with Java 11.

Open the project and create a simple Feign client:

@FeignClient(name = "user-session-service", url = "http://localhost:8081")
public interface UserSessionClient {
    @GetMapping("/user-sessions/validate")
    UserSessionValidationResponse validateSession(@RequestParam UUID sessionId);
}

This client will point to the server at localhost:8081 and will have the /user-sessions/validate endpoint.

The UserSessionValidationResponse is a simple POJO:

@Data
@NoArgsConstructor
@AllArgsConstructor
public class UserSessionValidationResponse {
    private boolean valid;
    private String sessionId;
}

And open the Application class and mark it with the @EnableFeignClients annotation.

@SpringBootApplication
@EnableFeignClients
public class RatelimiterFeignApplication {
	public static void main(String[] args) {
		SpringApplication.run(RatelimiterFeignApplication.class, args);
	}
}

That’s the basic project.

Rate Limiter in action

Next up, we need to add Resilience4J to the project. Open the build.gradle/pom.xml and add the following dependencies:

dependencies {
	// ...
	implementation 'org.springframework.boot:spring-boot-starter-aop'
	implementation 'io.github.resilience4j:resilience4j-spring-boot2:1.7.1'
	implementation 'io.github.resilience4j:resilience4j-spring:1.7.1'
	testImplementation 'com.github.tomakehurst:wiremock-jre8:2.31.0'
	// ...
}

Let’s go back to the Feign client we created and add the @RateLimiter annotation:

@FeignClient(name = "user-session-service", url = "http://localhost:8081")
public interface UserSessionClient {
    @GetMapping("/user-sessions/validate")
    @RateLimiter(name = "validateSession")
    UserSessionValidationResponse validateSession(@RequestParam UUID sessionId);
}

The name of the rate limiter will be validateSession, and that’s important because it’s the key we’ll use in the configuration.

Open the application.properties file and add the following config:

resilience4j.ratelimiter.instances.validateSession.limitForPeriod=1
resilience4j.ratelimiter.instances.validateSession.limitRefreshPeriod=1s
resilience4j.ratelimiter.instances.validateSession.timeoutDuration=1s

This will set the rate limiter to allow 1 request/sec with a 1 second timeout period.

That’s all, let’s test it with a mock server.

@SpringBootTest({"server.port:0"})
class RateLimiterTest {
    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8081))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testRateLimiterWorks() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(aResponse().withBody(responseBody).withHeader(CONTENT_TYPE, APPLICATION_JSON_VALUE).withFixedDelay(500)));
    }
}

The test case creates a new WireMock server on port 8081 and sets up the /user-sessions/validate URL to respond with an example JSON body after a 500 ms delay.

Next up, we need to simulate multiple requests occurring at the same time to see the rate limiter in action.

@SpringBootTest({"server.port:0"})
class RateLimiterTest {
    @RegisterExtension
    static WireMockExtension USER_SESSION_SERVICE = WireMockExtension.newInstance()
            .options(WireMockConfiguration.wireMockConfig().port(8081))
            .build();

    @Autowired
    private UserSessionClient userSessionClient;

    @Test
    public void testRateLimiterWorks() throws Exception {
        String responseBody = "{ \"sessionId\": \"828bc3cb-52f0-482b-8247-d3db5c87c941\", \"valid\": true}";

        String uuidString = "828bc3cb-52f0-482b-8247-d3db5c87c941";
        UUID uuid = UUID.fromString(uuidString);

        USER_SESSION_SERVICE.stubFor(get(urlPathEqualTo("/user-sessions/validate"))
                .withQueryParam("sessionId", equalTo(uuidString))
                .willReturn(aResponse().withBody(responseBody).withHeader(CONTENT_TYPE, APPLICATION_JSON_VALUE).withFixedDelay(500)));

        ExecutorService executorService = Executors.newFixedThreadPool(10);

        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            Future<?> f = executorService.submit(() -> userSessionClient.validateSession(uuid));
            futures.add(f);
        }
        futures.forEach(f -> {
            try {
                f.get();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
    }
}

This creates a thread pool of size 10, starts 5 simultaneous requests against the pre-configured API, and waits for them to finish.

Run the test and it will fail because we exceeded the 1 request/sec threshold: the calls that can’t get a permission within the 1 second timeout are rejected.

java.util.concurrent.ExecutionException: io.github.resilience4j.ratelimiter.RequestNotPermitted: RateLimiter 'validateSession' does not permit further calls
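Because f.get() rethrows the first failure, the test stops before you can see how many calls actually got through. If you’re curious about the split, one option is to tally the outcomes instead. The sketch below uses only the JDK; OutcomeTally is a hypothetical helper, and the lambda with a permit counter merely stands in for userSessionClient.validateSession(uuid) being rejected with RequestNotPermitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: counts how many concurrent calls succeeded and how many threw. */
final class OutcomeTally {

    /** Runs all calls on a thread pool and returns {succeeded, failed}. */
    static int[] run(List<Callable<Object>> calls) {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        try {
            List<Future<Object>> futures = new ArrayList<>();
            for (Callable<Object> call : calls) {
                futures.add(pool.submit(call));
            }
            int succeeded = 0;
            int failed = 0;
            for (Future<Object> f : futures) {
                try {
                    f.get();
                    succeeded++;
                } catch (ExecutionException e) {
                    failed++; // the call itself threw, e.g. because it was rejected
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return new int[] {succeeded, failed};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the rate-limited Feign call: pretend only 2 calls are permitted.
        AtomicInteger permits = new AtomicInteger(2);
        List<Callable<Object>> calls = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            calls.add(() -> {
                if (permits.getAndDecrement() <= 0) {
                    throw new IllegalStateException("not permitted");
                }
                return "ok";
            });
        }
        int[] tally = run(calls);
        System.out.println("succeeded=" + tally[0] + ", failed=" + tally[1]);
    }
}
```

In the real test you’d swap the stand-in lambda for the Feign call and assert on the two counters; the exact numbers you get there depend on the rate limiter config and on timing.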

That’s how easy it is to integrate a Resilience4J Rate Limiter with your Feign client. Let’s play around with the Rate Limiter config.

resilience4j.ratelimiter.instances.validateSession.limitForPeriod=5
resilience4j.ratelimiter.instances.validateSession.limitRefreshPeriod=1s
resilience4j.ratelimiter.instances.validateSession.timeoutDuration=0

This allows a throughput of 5 requests/sec, and with timeoutDuration set to 0 a request won’t wait for a slot to become available: it’s either permitted right away or rejected.

The test will pass now, as we only execute 5 requests in a single second. Good. If we increase the number of requests to 6, it’ll fail again:

        for (int i = 0; i < 6; i++) {
            Future<?> f = executorService.submit(() -> userSessionClient.validateSession(uuid));
            futures.add(f);
        }

Awesome.

You can further improve a rate-limited client by adding a Resilience4J Retry: if the rate limiter stays saturated for a longer period of time, the rejected requests can still be retried using different wait strategies (exponential wait time, fixed wait time, etc.).
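To give a feel for what an exponential wait time means, here’s a tiny hand-rolled sketch in plain Java. It’s not the Resilience4J Retry API (which you’d configure rather than write yourself); BackoffRetrySketch, its parameters and the injected sleeper are assumptions for illustration, and IllegalStateException stands in for RequestNotPermitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Hypothetical illustration of retrying with exponentially growing pauses. */
final class BackoffRetrySketch {

    /**
     * Calls the supplier up to maxAttempts times. After each failed attempt it asks the
     * sleeper to pause, multiplying the pause by the given factor every time.
     */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long initialWaitMillis,
                               double multiplier, LongConsumer sleeper) {
        long wait = initialWaitMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last failure
                }
                sleeper.accept(wait);              // pause before the next attempt
                wait = (long) (wait * multiplier); // grow the pause for the attempt after that
            }
        }
    }

    public static void main(String[] args) {
        List<Long> waits = new ArrayList<>(); // records pauses instead of really sleeping
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (calls[0]++ < 2) { // the first two attempts are "rejected"
                throw new IllegalStateException("rejected");
            }
            return "ok";
        }, 4, 100, 2.0, w -> waits.add(w));
        System.out.println(result + " after waits " + waits); // prints "ok after waits [100, 200]"
    }
}
```

The sleeper is injected so the demo can record the pauses instead of actually sleeping; in real code it would wrap Thread.sleep, and you’d probably retry only on the rate limiter’s rejection rather than on every RuntimeException.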

Summary

Rate limiting can be critical for your Feign clients or, for that matter, for any HTTP client that calls an external service with a service limit on the other side. It can also be really handy internally to make sure you’re not overloading a downstream service unnecessarily.

Resilience4J with Feign opens a whole lot of possibilities for service resiliency. The full example is available on GitHub.
