Correct concurrency handling using EF Core 2.1 with SQL Server

I am currently working on an API using ASP.NET Core Web API along with Entity Framework Core 2.1 and a SQL Server database. The API transfers money between two accounts, A and B. Because account B accepts payments, a lot of concurrent requests may hit it at the same moment. If this is not handled properly, some users may not see their payments arrive.

Having spent days trying to get the concurrency handling right, I can't figure out what the best approach is. For the sake of simplicity I created a test project that reproduces the concurrency issue.

In the test project I have two routes, request1 and request2. Each one performs a transfer to the same user: the first for an amount of 10, the second for 20. I put a Thread.Sleep(10000) in the first one, as follows:

    [HttpGet]
    [Route("request1")]
    public async Task<string> request1()
    {
        using (var transaction = _context.Database.BeginTransaction(System.Data.IsolationLevel.Serializable))
        {
            try
            {
                Wallet w = _context.Wallets.Where(ww => ww.UserId == 1).FirstOrDefault();
                Thread.Sleep(10000);
                w.Amount = w.Amount + 10;
                w.Inserts++;
                _context.Wallets.Update(w);
                _context.SaveChanges();
                transaction.Commit();
            }
            catch (Exception ex)
            {
                transaction.Rollback();
            }
        }
        return "request 1 executed";
    }

    [HttpGet]
    [Route("request2")]
    public async Task<string> request2()
    {
        using (var transaction = _context.Database.BeginTransaction(System.Data.IsolationLevel.Serializable))
        {
            try
            {
                Wallet w = _context.Wallets.Where(ww => ww.UserId == 1).FirstOrDefault();
                w.Amount = w.Amount + 20;
                w.Inserts++;
                _context.Wallets.Update(w);
                _context.SaveChanges();
                transaction.Commit();
            }
            catch (Exception ex)
            {
                transaction.Rollback();
            }
        }
        return "request 2 executed";
    }

After executing request1 and then request2 in a browser, the first transaction is rolled back due to:

InvalidOperationException: An exception has been raised that is likely due to a transient failure. Consider enabling transient error resiliency by adding 'EnableRetryOnFailure()' to the 'UseSqlServer' call.

I could also retry the transaction, but isn't there a better way, for example using locks?
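For reference, here is roughly what the retry approach suggested by the exception looks like. This is only a sketch: the context name (WalletContext) and the connection-string key are assumptions, and with EnableRetryOnFailure an explicit BeginTransaction has to run inside the execution strategy, otherwise EF Core rejects it as a user-initiated transaction.

// In Startup.ConfigureServices: opt in to the retrying execution strategy.
services.AddDbContext<WalletContext>(options =>
    options.UseSqlServer(
        Configuration.GetConnectionString("DefaultConnection"),
        sql => sql.EnableRetryOnFailure()));

// In the controller: wrap the whole unit of work so it can be retried as one block.
var strategy = _context.Database.CreateExecutionStrategy();
strategy.Execute(() =>
{
    using (var transaction = _context.Database.BeginTransaction(System.Data.IsolationLevel.Serializable))
    {
        Wallet w = _context.Wallets.Where(ww => ww.UserId == 1).FirstOrDefault();
        w.Amount += 10;
        w.Inserts++;
        _context.SaveChanges();
        transaction.Commit();
    }
});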

Serializable is the most isolated level and also the most costly. As the documentation says:

No other transactions can modify data that has been read by the current transaction until the current transaction completes.

This means no other transaction can update data that has been read by another transaction, which is working as intended here, since the update in the request2 route waits for the first transaction (request1) to commit.

The problem here is that we need to block reads by other transactions once the current transaction has read the wallet row. To solve this I need locking, so that when the first SELECT statement in request1 executes, all subsequent transactions have to wait for the first transaction to finish before they can select the correct value. Since EF Core has no support for pessimistic locking, I need to execute a SQL query directly, so when selecting the wallet I add a row lock to the selected row:

//this locks the wallet row with id 1
//and also the default transaction isolation level is enough
Wallet w = _context.Wallets.FromSql("select * from wallets with (XLOCK, ROWLOCK) where id = 1").FirstOrDefault();
Thread.Sleep(10000);
w.Amount = w.Amount + 10;
w.Inserts++;
_context.Wallets.Update(w);
_context.SaveChanges();
transaction.Commit();

This now works perfectly: even after executing multiple requests, the combined result of all the transfers is correct. In addition, I use a transactions table that records every money transfer along with its status, so that if something goes wrong I can recompute every wallet's amount from that table.
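For completeness, this is roughly how that locking read sits inside the transaction, with a parameterized query instead of a hard-coded id (a sketch based on the snippet above; the UserId column name is an assumption):

using (var transaction = _context.Database.BeginTransaction())
{
    try
    {
        // XLOCK, ROWLOCK makes any competing reader of this row wait until we commit,
        // so the amount we read is the amount we update.
        Wallet w = _context.Wallets
            .FromSql("select * from Wallets with (XLOCK, ROWLOCK) where UserId = {0}", 1)
            .FirstOrDefault();

        w.Amount += 10;
        w.Inserts++;
        _context.SaveChanges();
        transaction.Commit();
    }
    catch
    {
        transaction.Rollback();
        throw;
    }
}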

Now there are other ways of doing it like:

  • A stored procedure: but I want my logic to stay at the application level.
  • A synchronized method that handles the database logic: this way all database requests are executed on a single thread. I read a blog post advising this approach, but we may end up using multiple servers for scalability.

Maybe I'm not searching well, but I can't find good material on handling pessimistic concurrency with Entity Framework Core; even browsing GitHub, most of the code I've seen doesn't use locking.

Which brings me to my question: is this the correct way of doing it?

Cheers and thanks in advance.

Increscent asked 2/12, 2018 at 21:31. Comments (17):
You see the value 20 because the second request succeeds and the first is rolled back due to an optimistic concurrency failure. You don't do anything based on your exception; a simple option might be to just retry the transaction. However... you might want to consider an event-sourced approach instead; i.e. store the transactions that make up the Wallet state, rather than the current state, in an append-only log. Much like an accounting ledger, and pretty much every banking system, would do it. - Oftentimes
@Oftentimes thanks for your time. Indeed I had a breakpoint on the transaction.Rollback() line, but when using the lock it seems to work correctly. If I understood you correctly: I do have a table where each operation is stored (walletIdFrom, walletIdTo, amount and timestamp), but I want to use the wallet table so that when the user requests his balance I can fetch it directly instead of summing all the operation amounts. - Increscent
Have a read around the topic of event sourcing anyway; it's too big for me to write about in comments here. There are concepts such as read models and snapshots that cater to your desire not to read the whole log to get a balance (a common concern). I'm probably not exaggerating to say that almost every system doing this sort of thing, involving significant amounts of other people's money, is essentially event sourced. It's not an easy step to take, but if your system's non-functional requirements demand it, you might need to consider it. - Oftentimes
Thank you @sellotape, your help is appreciated. I am reading about it right now; this is what I intended to do with my transaction table, just in a more classic way. However, it is still really obscure to me how others do it. Is a log (transactions) table enough? In my case users need to see payments in real time and can receive a lot of payments during one day (1k-10k payments); maybe keep the logs and update the accounts table at the end of a business day? Of course I'm not expecting a straight answer, but any material that could help is welcome. Thank you again! - Increscent
Serializable isolation level is more costly, but not THAT much. After all, serialised access to the wallets affected by a transaction is exactly what you are trying to achieve here (even if you don't realise it yet :) ). - Proterozoic
@RogerWolf thanks for your comment. Do you mean that in the specific scenario I described I should just retry the transaction that fails? My only concern is that if there are a lot of transactions at the same time, users will have to wait until their transaction goes through, and in this case the transaction that got in first is the last one to be executed. Thanks again. - Increscent
@Reda, apologies, I hadn't noticed that you are already using the Serializable isolation level. Yes, this behaviour is sort of expected, although suspicious, but I don't think you will run into this kind of scenario (a long-running transaction followed by a short one). You can achieve essentially the same thing by using xlock, rowlock, holdlock when reading your wallets' rows, even under READ COMMITTED, and unless EF is jumping above its own head behind the curtains, the second transaction should wait on that lock until the first one completes. I would set up a SQL Profiler trace to see what actually hits the database. - Proterozoic
Personally, when I had this kind of task back in the day, I created a stored procedure that encapsulated the entire transaction, along with the hints, locks, isolation levels, error handling, etc. Since you can never be completely sure what EF will do with the database, you might eventually end up doing the same. - Proterozoic
@RogerWolf thanks for your comment. Yes, I considered using a stored procedure, but I just want my logic to stay in the app. The nature of this project is that users can pay with their phone, so a row lock on the customer row is fine, but for an account that accepts many payments (5k-10k, or maybe more) from multiple apps, would locks be scalable? Also I think Serializable is overkill for this task; when using locks the default isolation level is fine. I'm new to this on SQL Server and what surprises me is that this area is not well documented. - Increscent
@Reda, why, it is documented: learn.microsoft.com/en-us/sql/t-sql/statements/… for example. It's just sufficiently complicated that it doesn't fit into a comment on SO. And if you think that mass credits will create a bottleneck, you can serialise them by routing them through a queue, either your own or a Service Broker one. Just make sure that all debits are checked against the current state of the wallet :) - Proterozoic
So you are saying I should just stick with Serializable and not use locks? - Increscent
Thought process: transfer money from A to B, lock the DB table on side B during the transaction. When a concurrent call is made, something goes wrong. So why don't you place all incoming requests in a QUEUE and execute them one at a time? - Stover
@Stover Same thing: the queue would need to be at the web-server level, so if two transactions with the same destination account are processed at the same time this might be a problem. Another solution might be a queue/bus as described in microservices architectures, but I'm not going there for the moment. Thank you for your time. - Increscent
@Increscent I meant something more like: whenever a DB operation is ready to be executed (whether it's an insert, update or whatever), do not execute it immediately; instead put everything straight into the queue, and whenever the DB is done executing a transaction, pull the next item from the queue. Basically it's a funnel which prevents any concurrent transactions, and it also prevents deadlocks. So in your case, if two transactions with the same destination account come in, they both go into the queue and are dealt with sequentially: first received, first executed. - Stover
@Stover the only problem is the user experience here: the user needs instant feedback on whether his payment has been processed or not. But it's a great approach, especially with a microservices architecture. Thank you for your time :) - Increscent
@Increscent Once in the queue you could query the length of the queue and the transaction's place in it, and return a "processing" state to the user as feedback, along with a number. Besides, if you write all incoming transactions into a log file you can always retrieve them later in the event of a critical system failure. - Stover
I already have that log table. But if the number of transactions per second is big (1k and up), it's going to be a long waiting time. And I can't just tell the user it's okay, because what if they don't have enough funds to make the payment, for example? The locking level has to be in the database for this. - Increscent

My suggestion is to catch DbUpdateConcurrencyException and use entry.GetDatabaseValues() and entry.OriginalValues.SetValues(databaseValues) in your retry logic. There is no need to lock the DB.

Here is the sample from the EF Core documentation page:

using (var context = new PersonContext())
{
    // Fetch a person from database and change phone number
    var person = context.People.Single(p => p.PersonId == 1);
    person.PhoneNumber = "555-555-5555";

    // Change the person's name in the database to simulate a concurrency conflict
    context.Database.ExecuteSqlCommand(
        "UPDATE dbo.People SET FirstName = 'Jane' WHERE PersonId = 1");

    var saved = false;
    while (!saved)
    {
        try
        {
            // Attempt to save changes to the database
            context.SaveChanges();
            saved = true;
        }
        catch (DbUpdateConcurrencyException ex)
        {
            foreach (var entry in ex.Entries)
            {
                if (entry.Entity is Person)
                {
                    var proposedValues = entry.CurrentValues;
                    var databaseValues = entry.GetDatabaseValues();

                    foreach (var property in proposedValues.Properties)
                    {
                        var proposedValue = proposedValues[property];
                        var databaseValue = databaseValues[property];

                        // TODO: decide which value should be written to database
                        // proposedValues[property] = <value to be saved>;
                    }

                    // Refresh original values to bypass next concurrency check
                    entry.OriginalValues.SetValues(databaseValues);
                }
                else
                {
                    throw new NotSupportedException(
                        "Don't know how to handle concurrency conflicts for "
                        + entry.Metadata.Name);
                }
            }
        }
    }
}
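Note that DbUpdateConcurrencyException is only raised for these update conflicts when the entity has a concurrency token configured. A minimal sketch for the Wallet entity from the question, assuming a SQL Server rowversion column is acceptable (only the RowVersion property is new; the rest comes from the question's code):

using System.ComponentModel.DataAnnotations;

public class Wallet
{
    public int Id { get; set; }
    public int UserId { get; set; }
    public decimal Amount { get; set; }
    public int Inserts { get; set; }

    // Mapped to a SQL Server rowversion column; EF Core adds it to the WHERE
    // clause of every UPDATE, so a concurrent change makes SaveChanges throw
    // DbUpdateConcurrencyException instead of silently overwriting the row.
    [Timestamp]
    public byte[] RowVersion { get; set; }
}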
Llovera answered 10/12, 2018 at 10:23. Comments (1):
Thank you for your answer. You are describing the optimistic concurrency approach, which is not suitable for my case. If a concurrency exception happens with your code, I would have to roll back both transactions, which is bad because both end users would receive an error saying that their transaction failed, and the transfer is a 2-second operation... I'm sure if you think about it you'll see that it's not a suitable solution. Thanks for your time though. - Increscent

You can use a distributed locking mechanism, with Redis for example. Also, you can lock by userId, so it does not block the method for other users.
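A rough sketch of what this could look like with RedLock.net (the Redis connection string, lock key format, expiry and surrounding async method are assumptions, not part of the original answer):

using System;
using System.Collections.Generic;
using RedLockNet.SERedis;
using RedLockNet.SERedis.Configuration;
using StackExchange.Redis;

// Create the factory once at startup and reuse it.
var multiplexers = new List<RedLockMultiplexer>
{
    ConnectionMultiplexer.Connect("localhost:6379")
};
var redlockFactory = RedLockFactory.Create(multiplexers);

// Lock per destination wallet, so transfers to different users don't block each other.
var resource = $"wallet-lock:{userId}";
var expiry = TimeSpan.FromSeconds(30);

using (var redLock = await redlockFactory.CreateLockAsync(resource, expiry))
{
    if (redLock.IsAcquired)
    {
        // read the wallet, add the amount, SaveChanges, commit
    }
    else
    {
        // the lock could not be acquired in time; fail or retry the request
    }
}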

Twocycle answered 29/8, 2022 at 6:29. Comments (4):
Please explain a bit more. - Entertainer
The basic idea is that you lock a scope, which guarantees that only one transaction runs at a time: c-sharpcorner.com/UploadFile/1d42da/thread-locking-in-C-Sharp. But if you need to scale your system you must use a distributed lock; this one uses Redis: github.com/samcook/RedLock.net - Twocycle
Please add that to the answer. That said, thread locking is often a bad idea and absolutely unnecessary in the question's case. - Entertainer
Why is that a bad idea? They are not fixing the problem, just throwing exceptions when there is a concurrency problem. Also, you can lock by userId, so it does not lock the method for others. - Twocycle

Why don't you handle the concurrency problem in the code? Why does it need to be in the DB layer?

You can have a method that updates a given wallet by a given amount and use a simple lock there, like this:

private readonly object walletLock = new object();

public void UpdateWalletAmount(int userId, int amount)
{
    lock (walletLock)
    {
         Wallet w = _context.Wallets.Where(ww => ww.UserId == userId).FirstOrDefault();
         w.Amount = w.Amount + amount;
         w.Inserts++;
         _context.Wallets.Update(w);
         _context.SaveChanges();
    }
}

So your methods will look like this:

[HttpGet]
[Route("request1")]
public async Task<string> request1()
{
    try
    {
        UpdateWalletAmount(1, 10);
    }
    catch (Exception ex)
    {
        // log error
    }
    return "request 1 executed";
}

[HttpGet]
[Route("request2")]
public async Task<string> request2()
{
    try
    {
        UpdateWalletAmount(1, 20);
    }
    catch (Exception ex)
    {
        // log error
    }
    return "request 2 executed";
}

You don't even need to use a transaction in this context.

Melanson answered 6/12, 2018 at 13:41. Comments (2):
When you scale you'll probably need more web servers, so I think you can see why I don't want to do that. - Increscent
It's clearly stated in the question that this is not the "correct" way of handling concurrency; this will mislead a lot of people in the future... - Increscent
