Does the use of Spring Webflux's WebClient in a blocking application design cause a larger use of resources than RestTemplate

I am working on several Spring Boot applications that follow the traditional thread-per-request pattern. We use Spring Boot WebFlux to obtain WebClient for the RESTful integration between the applications. Hence our application design requires that we block the publisher right after receiving a response.

Recently, we've been discussing whether we are unnecessarily spending resources by using a reactive module in our otherwise blocking application design. As I understand it, WebClient makes use of the event loop by assigning a worker thread to perform the reactive actions in the event loop. So using WebClient with .block() would park the original thread while assigning another thread to perform the HTTP request. Compared to the alternative, RestTemplate, it seems like WebClient would spend additional resources by using the event loop.

Is it correct that partially introducing spring-webflux in this way consumes additional resources without yielding any performance benefit, either single-threaded or concurrent? We do not expect to ever upgrade our current stack to be fully reactive, so the argument of gradually upgrading does not apply.

Mcknight answered 19/5, 2022 at 7:4 Comment(0)

In this presentation Rossen Stoyanchev from the Spring team explains some of these points.

WebClient will use a limited number of threads - 2 per core for a total of 12 threads on my local machine - to handle all requests and their responses in the application. So if your application receives 100 requests and makes one request to an external server for each, WebClient will handle all of those using those threads in a non-blocking / asynchronous manner.

Of course, as you mention, once you call block your original thread will block, so it would be 100 threads + 12 threads for a total of 112 threads to handle those requests. But keep in mind that these 12 threads do not grow in size as you make more requests, and that they don't do I/O heavy lifting, so it's not like WebClient is spawning threads to actually perform the requests or keeping them busy on a thread-per-request fashion.
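The thread arithmetic above can be sketched with plain JDK classes. This is a simulation under stated assumptions, not WebClient itself: the names BlockedCallersDemo and distinctIoThreads and the 100 ms delay are invented for illustration, a scheduled pool stands in for the event-loop threads, and join() stands in for block().

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BlockedCallersDemo {

    /**
     * Starts `callers` request threads that each wait for one simulated remote call.
     * The calls are completed by a fixed pool of at most `ioThreads` threads.
     * Returns how many distinct pool threads completed a response.
     */
    static int distinctIoThreads(int callers, int ioThreads) {
        // Assumption: this pool plays the role of the event-loop threads
        ScheduledExecutorService ioPool = Executors.newScheduledThreadPool(ioThreads);
        // One thread per incoming request, as in a thread-per-request server
        ExecutorService requestThreads = Executors.newFixedThreadPool(callers);
        Set<String> ioThreadNames = ConcurrentHashMap.newKeySet();
        try {
            CompletableFuture<?>[] requests = new CompletableFuture<?>[callers];
            for (int i = 0; i < callers; i++) {
                requests[i] = CompletableFuture.runAsync(() -> {
                    CompletableFuture<String> response = new CompletableFuture<>();
                    // Simulated non-blocking call: the response is completed later on the pool
                    ioPool.schedule(() -> {
                        ioThreadNames.add(Thread.currentThread().getName());
                        response.complete("ok");
                    }, 100, TimeUnit.MILLISECONDS);
                    // The request thread waits here, comparable to calling block()
                    response.join();
                }, requestThreads);
            }
            CompletableFuture.allOf(requests).join();
            return ioThreadNames.size();
        } finally {
            requestThreads.shutdown();
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        int used = distinctIoThreads(100, 12);
        System.out.println("100 waiting request threads were served by " + used + " pool threads");
    }
}
```

However many request threads are waiting, the number of pool threads never exceeds the configured size, which is the point made above about the 12 threads not growing with load.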

I'm not sure whether a thread waiting in block() behaves the same as one making a blocking call through RestTemplate. It seems to me that in the former the thread should be inactive, waiting for the NIO call to complete, while in the latter the thread should be handling I/O work, so maybe there's a difference there.

It gets interesting if you begin using the reactor goodies, for example handling requests that depend on one another, or many requests in parallel. Then WebClient definitely gets an edge as it'll perform all concurrent actions using the same 12 threads, instead of using a thread per request.

As an example, consider this application:

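import java.util.Collection;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@SpringBootApplication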
public class SO72300024 {

    private static final Logger logger = LoggerFactory.getLogger(SO72300024.class);

    public static void main(String[] args) {
        SpringApplication.run(SO72300024.class, args);
    }

    @RestController
    @RequestMapping("/blocking")
    static class BlockingController {

        @GetMapping("/{id}")
        String blockingEndpoint(@PathVariable String id) throws Exception {
            logger.info("Got request for {}", id);
            Thread.sleep(1000);
            return "This is the response for " + id;
        }

        @GetMapping("/{id}/nested")
        String nestedBlockingEndpoint(@PathVariable String id) throws Exception {
            logger.info("Got nested request for {}", id);
            Thread.sleep(1000);
            return "This is the nested response for " + id;
        }

    }

    @Bean
    ApplicationRunner run() {
        return args -> {
            Flux.just(callApi(), callApi(), callApi())
                    .flatMap(responseMono -> responseMono)
                    .collectList()
                    .block()
                    .stream()
                    .flatMap(Collection::stream)
                    .forEach(logger::info);
            logger.info("Finished");
        };
    }

    private Mono<List<String>> callApi() {
        WebClient webClient = WebClient.create("http://localhost:8080");
        logger.info("Starting");
        return Flux.range(1, 10).flatMap(i ->
                        webClient
                                .get().uri("/blocking/{id}", i)
                                .retrieve()
                                .bodyToMono(String.class)
                                .doOnNext(resp -> logger.info("Received response {} - {}", i, resp))
                                .flatMap(resp -> webClient.get().uri("/blocking/{id}/nested", i)
                                        .retrieve()
                                        .bodyToMono(String.class)
                                        .doOnNext(nestedResp -> logger.info("Received nested response {} - {}", i, nestedResp))))
                .collectList();
    }
}

If you run this app, you can see that all 30 requests are handled immediately in parallel by the same 12 (on my computer) threads. Neat! If you think you can benefit from this kind of parallelism in your logic, it's probably worth giving WebClient a shot.

If not, while I wouldn't actually worry about the "extra resource spending" given the reasons above, I don't think it would be worth adding the whole reactor/webflux dependency for this - besides the extra baggage, in day-to-day operations it should be a lot simpler to reason about and debug RestTemplate and the thread-per-request model.

Of course, as others have mentioned, you ought to run load tests to have proper metrics.

Witkowski answered 19/5, 2022 at 23:4 Comment(4)
Great answer, thank you! We are currently running WebClient, so the consideration is whether to move towards RestTemplate or perhaps the JDK's HttpClient to reduce thread consumption and drop the need for webflux. Reading your answer, I believe that the cost of changing that tool outweighs the performance gain. The point about gaining an edge when performing concurrent actions is a good one, and I'll keep that in mind in further development. - Mcknight
Not sure this is a correct comparison. The reactive API is very flexible and allows you to control the execution flow. By default, flatMap runs Queues.SMALL_BUFFER_SIZE in parallel, but you can control the concurrency or use other operators like concatMap, which processes data sequentially. You can also control Schedulers (the thread pool to run on). In any case, the advantage of WebClient is not the reactive API but the NIO client (Netty by default), where all I/O operations are asynchronous and non-blocking. You can find many benchmarks for Netty proving its performance. - Madeira
Hi Alex, thanks for your comment. While you brought up good points, I'm not sure I understand why that would make it an incorrect comparison. What you mention is nice if you're interested in dealing with parallelism, thread pools and such - otherwise I don't really see the overall benefit for a blocking app besides what we've already covered. Care to elaborate a little perhaps? Thanks! - Witkowski
@Tomaz Fernandes Updated my answer to provide more details. - Madeira

According to the official Spring documentation, RestTemplate is in maintenance mode and will probably not be supported in future versions.

As of 5.0 this class is in maintenance mode, with only minor requests for changes and bugs to be accepted going forward. Please, consider using the org.springframework.web.reactive.client.WebClient which has a more modern API and supports sync, async, and streaming scenarios

As for system resources, it really depends on your use case and I would recommend running some performance tests, but it seems that for low workloads a blocking client could perform better owing to a dedicated thread per connection. As load increases, NIO clients tend to perform better.

Update - Reactive API vs Http Client

It's important to understand the difference between the reactive API (Project Reactor) and the HTTP client. Although WebClient uses the reactive API, it doesn't add any additional concurrency until we explicitly use operators like flatMap or delay that could schedule execution on different thread pools. If we just use

webClient
  .get()
  .uri("<endpoint>")
  .retrieve()
  .bodyToMono(String.class)
  .block()

the code will be executed on the caller thread, the same as with a blocking client.

If we enable debug logging for this code, we will see that the WebClient code is executed on the caller thread, but for network operations execution is switched to a reactor-http-nio-... thread.
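The same split between a waiting caller thread and a separate thread that handles the response can be observed with the JDK's own java.net.http.HttpClient. This is a rough sketch under stated assumptions and does not use WebClient: the class CallerVsIoThread, the /hello endpoint, and the 200 ms server delay are invented for illustration, join() stands in for block(), and the local server comes from the JDK's com.sun.net.httpserver package.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class CallerVsIoThread {

    /** Returns { name of the thread that waited, name of the thread that ran the response callback }. */
    static String[] threadNames() {
        HttpServer server;
        try {
            // Port 0 lets the OS pick a free port for this throwaway local server
            server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/hello", exchange -> {
            // Artificial delay so the response is still pending when the callback is attached
            try { Thread.sleep(200); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (var out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        try {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://localhost:" + server.getAddress().getPort() + "/hello")).build();
            String callbackThread = client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                    // Runs when the response arrives, on a thread other than the waiting caller
                    .thenApply(response -> Thread.currentThread().getName())
                    // The caller waits here, comparable to block()
                    .join();
            return new String[] { Thread.currentThread().getName(), callbackThread };
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        String[] names = threadNames();
        System.out.println("caller thread: " + names[0] + ", response callback thread: " + names[1]);
    }
}
```

The thread names printed here are specific to the JDK client and will differ from the reactor-http-nio-... names that WebClient with Reactor Netty logs.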

The main difference is that internally WebClient uses an asynchronous client based on non-blocking I/O (NIO). These clients use the reactor pattern (event loop) to maintain separate thread pool(s), which allows them to handle a large number of concurrent connections.

The purpose of I/O reactors is to react to I/O events and to dispatch event notifications to individual I/O sessions. The main idea of I/O reactor pattern is to break away from the one thread per connection model imposed by the classic blocking I/O model.
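A minimal sketch of the I/O reactor idea, using java.nio directly: one thread waits on a Selector and dispatches read events for several channels. The assumptions are that in-process Pipe channels stand in for network connections, that each "session" sends exactly one small message, and that the names MiniReactor, readAll, session-N and msg-N are invented for illustration. Real clients such as Netty are far more elaborate.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class MiniReactor {

    /** Reads one message from each of `connections` pipes using a single event-loop thread. */
    static List<String> readAll(int connections) {
        try (Selector selector = Selector.open()) {
            List<Pipe> pipes = new ArrayList<>();
            for (int i = 0; i < connections; i++) {
                Pipe pipe = Pipe.open();
                pipe.source().configureBlocking(false);
                // Register interest in "readable" events; the attachment identifies the session
                pipe.source().register(selector, SelectionKey.OP_READ, "session-" + i);
                pipes.add(pipe);
            }
            // Simulate the remote peers sending data (tiny writes into each pipe)
            for (int i = 0; i < connections; i++) {
                byte[] message = ("msg-" + i).getBytes(StandardCharsets.UTF_8);
                pipes.get(i).sink().write(ByteBuffer.wrap(message));
            }
            List<String> events = new ArrayList<>();
            ByteBuffer buffer = ByteBuffer.allocate(64);
            while (events.size() < connections) {
                // A single thread waits for events on all registered channels
                selector.select();
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isReadable()) {
                        buffer.clear();
                        ((Pipe.SourceChannel) key.channel()).read(buffer);
                        buffer.flip();
                        // Dispatch the event to its session (here: just record it)
                        events.add(key.attachment() + ":" + StandardCharsets.UTF_8.decode(buffer));
                        key.cancel(); // one message per session in this sketch
                    }
                }
            }
            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
            return events;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        readAll(5).forEach(System.out::println);
    }
}
```

The order in which sessions are reported depends on the selector, so the sketch makes no promise about it.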

By default, Reactor Netty is used, but you could consider the Jetty Reactive HttpClient, Apache HttpComponents (async) or even the AWS Common Runtime (CRT) HTTP Client if you create the required adapter (not sure whether one already exists).

In general, you can see a trend across the industry towards async I/O (NIO) because it is more resource efficient for applications under high load.

In addition, to handle resources efficiently the whole flow must be async. By using block() we are implicitly reintroducing the thread-per-connection approach, which eliminates most of the benefits of NIO. At the same time, using WebClient with block() could be considered a first step in migrating to a fully reactive application.
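To illustrate what "the whole flow is async" buys, here is a small sketch with JDK CompletableFuture rather than Reactor. Everything in it is an assumption made for illustration: fetch is a hypothetical stand-in for a non-blocking remote call that completes after a delay on a single timer thread, and the paths only mimic the earlier example. Each call is chained to a dependent nested call with thenCompose, and the caller waits once at the very end.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class AsyncChainDemo {

    // Hypothetical non-blocking call: the result is completed after delayMs on the timer thread
    static CompletableFuture<String> fetch(ScheduledExecutorService timer, String path, long delayMs) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> {
            result.complete("response for " + path);
        }, delayMs, TimeUnit.MILLISECONDS);
        return result;
    }

    /** Starts `calls` two-step chains at once and waits a single time for all of them. */
    static List<String> fetchAll(int calls, long delayMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> chains = IntStream.rangeClosed(1, calls)
                    .mapToObj(i -> fetch(timer, "/blocking/" + i, delayMs)
                            // The nested call starts when the first response arrives; nobody waits in between
                            .thenCompose(first -> fetch(timer, "/blocking/" + i + "/nested", delayMs)))
                    .collect(Collectors.toList());
            // The only place where the caller thread waits
            return chains.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> responses = fetchAll(30, 100);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(responses.size() + " chained calls finished in about " + elapsedMs + " ms");
    }
}
```

Because no thread waits between the two steps, all chains overlap and the total time stays close to two delays instead of growing with the number of calls. Waiting after every individual call would tie up one thread per in-flight call again.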

Madeira answered 19/5, 2022 at 13:30 Comment(5)
Thank you for the answer! We're aware that RestTemplate has gone into maintenance mode. However, that status was set after it was initially marked as deprecated. The withdrawal of the deprecation is to me a signal that RestTemplate will probably not go away anytime soon. Doing some performance tests is a very good idea, and I should look into that before making any changes. - Mcknight
Thanks for the update Alex. AFAIK flatMap runs on the thread it was called on unless you use the publishOn or subscribeOn operators. That's part of the concurrency-agnostic paradigm Project Reactor abides by. So everything after the retrieve() call should run on a NIO thread until block() merges it back to the main thread - unless, of course, some other NIO operation is used, in which case the callback might be handled by a different NIO thread instead. - Witkowski
You are absolutely right that publishOn and subscribeOn can be used to control schedulers, but the main point here was that WebClient does not introduce new threads (except the NIO thread pool) until you use specific operators like flatMap that allow multiple flows to run in parallel. - Madeira
@Madeira regarding your statement here: "If we enable debug logging for this code, we will see that WebClient code is executed on the caller thread but for network operations execution will be switched to reactor-http-nio-... thread." Your example used .block(). If I also use a WebClient ExchangeFilterFunction to intercept the request, will the code in the filter function also run on the main thread, or will it run on a different thread? - Nipissing
The request filter will be executed on the caller thread and the response filter on the HTTP client thread reactor-http-nio-... - Madeira

Great question.

Last week we considered migrating from RestTemplate to WebClient. This week I started testing the performance of a blocking WebClient against RestTemplate and, to my surprise, RestTemplate performed better in scenarios where the response payloads were large. The difference was considerable, with RestTemplate taking less than half the time to respond and using fewer resources.

I'm still carrying out the performance tests; I have now started testing with a wider range of user counts.

The application is MVC and is using Spring 5.3.19 and Spring Boot 2.6.7.

For performance testing I'm using JMeter, and for health checks VisualVM/JConsole.

Phratry answered 19/5, 2022 at 21:30 Comment(1)
Thank you for the answer! That is very interesting. Depending on the implementation, it makes sense that the processing time of WebClient is higher due to the use of the event loop rather than performing the actions synchronously. I would not, however, have guessed that the performance difference depends on payload size. I'm interested to see further results of your tests. - Mcknight
