REST Idempotence implementation - How to rollback when a request has already been processed?
What I am trying to achieve

We have a REST API built with Spring Boot, JPA and Hibernate. The clients using the API have unreliable network access. To avoid surfacing too many errors to the end user, we made the client retry unsuccessful requests (e.g. after a timeout occurs).

As we cannot be sure that a request has not already been processed by the server when sending it again, we need to make the POST requests idempotent. That is, sending the same POST request twice must not create the same resource twice.

What I have done so far

To achieve this, here is what I did:

  • The client is sending a UUID along with the request, in a custom HTTP header.
  • When the client resends the same request, the same UUID is sent.
  • The first time the server processes the request, the response for the request is stored in a database, along with the UUID.
  • The second time the same request is received, the result is retrieved from the database and the response is made without processing the request again.
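The mechanics of that scheme can be sketched with a toy in-memory store. In the real system the cache would live in a shared database table with a unique constraint on the UUID column; all names here are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the UUID-keyed response cache described above.
class ResponseCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Returns the previously stored response, or null if this UUID is new.
    public String lookup(String requestUuid) {
        return cache.get(requestUuid);
    }

    // Stores the response for a UUID. putIfAbsent keeps the first writer's
    // value, mirroring a unique-constraint insert in the database: if a
    // response was already stored, that stored response wins and is returned.
    public String store(String requestUuid, String response) {
        String existing = cache.putIfAbsent(requestUuid, response);
        return existing != null ? existing : response;
    }
}
```

A retry with the same UUID then sees the stored response instead of triggering a second create.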

So far so good.

The issue

I have multiple instances of the server working on the same database, and requests are load balanced. As a result, any request can be processed by any instance.

With my current implementation, the following scenario can occur:

  1. The request is processed by instance 1 and takes a long time
  2. Because it takes too long, the client aborts the connection and resends the same request
  3. The 2nd request is processed by instance 2
  4. The first request finishes, and instance 1 saves the result in the database
  5. The second request finishes. When instance 2 tries to store its result, a result for the same UUID already exists in the database.

In this scenario, the request has been processed twice, which is what I want to avoid.

I thought of two possible solutions:

  1. Roll back request 2 when a result for the same request has already been stored, and send the saved response to the client.
  2. Prevent request 2 from being processed by saving the request id in the database as soon as instance 1 starts processing it. This solution wouldn't work, as the connection between the client and instance 1 is closed by the timeout, making it impossible for the client to actually receive the response produced by instance 1.
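For illustration, the "claim the request id before processing" step of solution 2 could look like this in-memory sketch; across multiple instances, a database row with a unique constraint on the request id would play the same role. Names are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of solution 2: each instance tries to claim the request id before
// doing any work, so at most one claim can win.
class RequestClaim {
    static final ConcurrentMap<String, String> claims = new ConcurrentHashMap<>();

    // True if this caller claimed the id first and may process the request;
    // false means another instance is (or was) already processing it.
    public static boolean tryClaim(String requestId) {
        return claims.putIfAbsent(requestId, "PROCESSING") == null;
    }
}
```

The atomicity of putIfAbsent (or of the unique-constraint insert it stands in for) is what prevents two instances from both winning the claim.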

Attempt on solution 1

I'm using a Filter to retrieve and store a response. My filter looks roughly like this:

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.dao.DataIntegrityViolationException;
import org.springframework.stereotype.Component;

@Component
public class IdempotentRequestFilter implements Filter {

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse,
                         FilterChain filterChain) throws IOException, ServletException {

        // Cast once so the helpers below can work with the HTTP-specific types
        HttpServletRequest request = (HttpServletRequest) servletRequest;
        HttpServletResponse response = (HttpServletResponse) servletResponse;

        String requestId = getRequestId(request);

        if (requestId != null) {
            ResponseCache existingResponse = getExistingResponse(requestId);

            if (existingResponse != null) {
                // The request was already processed: replay the stored response
                serveExistingResponse(response, existingResponse);
            }
            else {
                filterChain.doFilter(request, response);

                try {
                    saveResponse(requestId, response);
                    serve(response);
                }
                catch (DataIntegrityViolationException e) {
                    // Another instance stored a response first.
                    // Here perform rollback somehow, then replay the stored response.
                    existingResponse = getExistingResponse(requestId);
                    serveExistingResponse(response, existingResponse);
                }
            }
        }
        else {
            filterChain.doFilter(request, response);
        }
    }

    ...

My requests are then processed like this:

@Controller 
public class UserController {

    @Autowired 
    UserManager userManager; 

    @RequestMapping(value = "/user", method = RequestMethod.POST)
    @ResponseBody
    public User createUser(@RequestBody User newUser)  {
        return userManager.create(newUser);
    }
}

@Component
@Lazy
public class UserManager {

    @Autowired
    UserRepository userRepository;

    @Transactional("transactionManager")
    public User create(User user) {
        userRepository.save(user);
        return user;
    }
}

Questions

  • Can you think of any other solution to avoid the issue?
  • Is there any other solution to make POST requests idempotent (entirely different perhaps)?
  • How can I start a transaction, then commit it or roll it back, from the Filter shown above? Is that good practice?
  • When processing requests, the existing code already creates transactions by calling multiple methods annotated with @Transactional("transactionManager"). What will happen when I start or roll back a transaction from the filter?

Note: I am rather new to Spring, Hibernate and JPA, and I have a limited understanding of the mechanisms behind transactions and filters.

Lesbos answered 26/7, 2017 at 13:14 Comment(1)
I have just posted an opinionated answer, perhaps it will be of help to someone. Though seeing as this is more than 3 years old - have you found a way forward? If so, would you care to share it? Finally, the first two questions that you asked are generic software engineering questions (might even find better luck on softwareengineering.stackexchange.com) while the other two are Spring-related. As per SO's guidelines I'd consider this too broad, but since people voted on it, I allowed myself to (somewhat) answer the first two while ignoring the Java questions (not my forte). Cheers! - Bedplate

Based on

To avoid having too many errors for the end user, we made the client retry unsuccessful requests

you seem to have full control of the client code (great!) as well as the server.

It is, however, not clear whether the problem with the client's network is flakiness (the connection often randomly drops and requests are aborted) or slowness (timeouts), since you've mentioned both. So let's analyse both!

Timeouts

The first things that I'd recommend are:

  1. adjusting the connection timeout on the server so that it is not closed before the server finishes the operation;
  2. adjusting the request timeout on the client to account for the slow operation on the server and the slowness of the client's network.

However:

  • if the server operation is really slow and the maximum connection timeout (120s, is it?) is not enough;
  • or if you are also sending large requests/responses and the maximum client timeout is not enough;
  • or if you just don't want to increase the timeouts for any reason,

then the standard request-response scheme is probably not suitable.

In this case, instead of having the client wait for a response, you could perhaps send back an immediate acknowledgement ("Request received") and deliver the actual response via some TCP socket. Any following attempts would receive either a message saying that the request is still being processed, or the final response if the operation is complete (this is where the idempotence of your operation would help).
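As a rough illustration of that acknowledge-first flow (plain Java; the status map stands in for whatever shared store the server instances would use, and all names are hypothetical):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch of the acknowledge-then-deliver scheme: the first call for a
// request id returns an immediate acknowledgement, repeat calls report the
// current state, and the final result replaces the status once the work
// completes.
class AckServer {
    static final Map<String, String> status = new ConcurrentHashMap<>();

    // putIfAbsent ensures only the first submission starts the work;
    // any retry just sees the current state of the original submission.
    public static String submit(String requestId) {
        String prev = status.putIfAbsent(requestId, "IN_PROGRESS");
        return prev == null ? "ACCEPTED" : prev;
    }

    // Called by the worker when the (simulated) operation finishes.
    public static void complete(String requestId, String result) {
        status.put(requestId, "DONE: " + result);
    }
}
```

A retry after completion receives the final result rather than triggering a second run of the operation.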

Client network failures

If the client network is flaky and prone to frequent failures, the above proposed solution, where requests and responses are uncoupled, should work too!

  1. First of all, if you send back immediate acknowledgements, you'd let the client know what's going on immediately; a quick response time should also make it more likely that the client receives the response.
  2. Secondly, whenever any request is aborted due to a network failure, you could simply wait an appropriate amount of time (basically, enough time for the server to complete the operation) before trying again, as opposed to trying again right away. This way you would significantly increase the chance that the server will have finished the operation in question and you should get your response (again, this is where using idempotent requests is crucial).
  3. If you did not wish to adjust the timeouts, or in case after retrying the operation you get a response saying Request in progress, you could try listening on the socket again.
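The wait-before-retry idea in point 2 could be sketched like this, assuming a hypothetical callWithRetry helper (names and delays are illustrative):

```java
import java.util.function.Supplier;

// Sketch of delayed retries: on failure, wait long enough for the server to
// finish the in-flight operation before sending the idempotent retry,
// instead of retrying immediately.
class DelayedRetry {
    public static String callWithRetry(Supplier<String> request, int maxAttempts,
                                       long waitMillis) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.get();
            } catch (RuntimeException e) {
                last = e;
                try {
                    // Give the server time to complete before retrying
                    Thread.sleep(waitMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
            }
        }
        throw last;
    }
}
```

Because the request is idempotent, a retry that lands after the server already finished simply gets the stored response.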

Final thoughts

If using a socket is not an option, you could use polling. Polling isn't great but personally, I'd most likely still go with polling rather than rollbacks, especially if the server operations are slow - this would allow for decent pauses before retries.
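A minimal client-side polling loop might look like this sketch ("IN_PROGRESS" stands for whatever in-progress marker the server returns; all names are hypothetical):

```java
import java.util.function.Supplier;

// Sketch of polling with pauses between attempts: keep asking for the
// request's status until it is no longer in progress, or give up after a
// bounded number of polls.
class PollingClient {
    public static String pollUntilDone(Supplier<String> statusCheck, int maxPolls,
                                       long intervalMillis) {
        for (int i = 0; i < maxPolls; i++) {
            String status = statusCheck.get();
            if (!"IN_PROGRESS".equals(status)) {
                return status;  // final response (or an error status)
            }
            try {
                Thread.sleep(intervalMillis);  // decent pause before re-polling
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "INTERRUPTED";
            }
        }
        return "TIMED_OUT";
    }
}
```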

The problem with rollbacks is that they'd try to recover from failures using code, which in itself is never foolproof. What if something goes wrong while rolling back? Can you make sure that the rollback is atomic and idempotent, and will never, under any circumstances, leave the system in an undefined state? That's beside the fact that they can be non-trivial to implement and would introduce additional complexity and extra code for testing and maintenance.

In case you don't own the client code

You'll have more trouble if you don't own the client code, as the consumer of your API would be free to make lots of arbitrary calls to your servers. In this case I would definitely lock idempotent operations and return responses saying that the request is being processed instead of trying to revert anything using rollbacks. Imagine having multiple concurrent requests and rollbacks! If you were not happy with Stanislav's proposal (The queue will get longer and longer, making the whole system slower, reducing the capacity of the system to serve requests.), I believe that this scenario would be even worse.

Bedplate answered 30/12, 2020 at 17:0 Comment(1)
I had left the company by then, and I have no idea whether or how this issue was solved. However, the idea of an immediate 'In progress' acknowledgment is probably a good option. - Lesbos

The request is processed by instance 1 and takes a long time

Consider splitting the process in 2 steps.

Step 1 stores the request and Step 2 processes it. On the first request, you just store all the request data somewhere (a DB or a queue). Here you can introduce statuses, e.g. 'new', 'in progress', 'ready'. The two steps can be synchronous or asynchronous; it doesn't matter. On a second attempt to process the same request, you check whether it is already stored and what its status is. You can then respond with the status, or just wait until the status becomes 'ready'. So in the filter you just check whether the request already exists (was previously stored) and, if so, get the status and the result (if it's ready) to send in the response.
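The two-step scheme with statuses might be sketched as follows, with an in-memory map standing in for the DB or queue (all names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the store-then-process scheme: step 1 stores the request with
// status 'new'; step 2 moves it through 'in progress' to 'ready'. A retry
// just reads the stored status instead of reprocessing.
class RequestStore {
    static final Map<String, String> statusById = new ConcurrentHashMap<>();

    // Step 1: store the request; false means it was already stored, so this
    // is a retry and must not start a second round of processing.
    public static boolean storeRequest(String id) {
        return statusById.putIfAbsent(id, "new") == null;
    }

    // Step 2: advance the status as processing proceeds.
    public static void setStatus(String id, String status) {
        statusById.put(id, status);
    }

    // What the filter consults on a repeated attempt.
    public static String checkStatus(String id) {
        return statusById.getOrDefault(id, "unknown");
    }
}
```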

You can also add a custom validation annotation, e.g. @UniqueRequest, to the RequestDTO and use @Valid to check the DB (see the example). There is no need to do this in a Filter; move the logic to the Controller (it is part of validation, in fact). It's up to you how to respond in this case; just check BindingResult.

Selfreproach answered 26/7, 2017 at 13:45 Comment(1)
Your solution of storing the request with a status has one drawback: at each request attempt, you will get a new process "waiting" for another to finish. The queue will get longer and longer, making the whole system slower and reducing its capacity to serve requests. Read here. About the validation annotation: it looks like I would have to write specific rollback code for each request. The code base is large, so I would prefer to avoid that, but I will explore it. - Lesbos
