Hangfire causing locks in SQL Server
Asked Answered
D

3

12

We are using Hangfire 1.7.2 within our ASP.NET Web project with SQL Server 2016. We have around 150 sites on our server, with each site using Hangfire 1.7.2. We noticed that when we upgraded these sites to use Hangfire, the DB server collapsed. Checking the DB logs, we found out there were multiple locking queries. We have identified one RPC Event “sys.sp_getapplock;1” In the all blocking sessions. It seems like Hangfire is locking our DB rendering whole DB unusable. We noticed almost 670+ locking queries because of Hangfire.

This could possibly be due to these properties we setup:

   SlidingInvisibilityTimeout = TimeSpan.FromMinutes(30),
   QueuePollInterval = TimeSpan.FromHours(5)

Each site has around 20 background jobs, a few of them run every minute, whereas others every hour, every 6 hours and some once a day.

I have searched the documentation but could not find anything which could explain these two properties or how to set them to avoid DB locks.

Looking for some help on this.

EDIT: The following queries are executed at every second:

exec sp_executesql N'select count(*) from [HangFire].[Set] with (readcommittedlock, forceseek) where [Key] = @key',N'@key nvarchar(4000)',@key=N'retries'

select distinct(Queue) from [HangFire].JobQueue with (nolock)

exec sp_executesql N'select count(*) from [HangFire].[Set] with (readcommittedlock, forceseek) where [Key] = @key',N'@key nvarchar(4000)',@key=N'retries'

irrespective of various combinations of timespan values we set. Here is the code of GetHangfirServers we are using:

  public static IEnumerable<IDisposable> GetHangfireServers()
    {
        // Reference for GlobalConfiguration.Configuration: http://docs.hangfire.io/en/latest/getting-started/index.html
        // Reference for UseSqlServerStorage: http://docs.hangfire.io/en/latest/configuration/using-sql-server.html#configuring-the-polling-interval
        GlobalConfiguration.Configuration
            .SetDataCompatibilityLevel(CompatibilityLevel.Version_170)

            .UseSimpleAssemblyNameTypeSerializer()
            .UseRecommendedSerializerSettings()
            .UseSqlServerStorage(ConfigurationManager.ConnectionStrings["abc"]
                .ConnectionString, new SqlServerStorageOptions
            {
                CommandBatchMaxTimeout = TimeSpan.FromMinutes(5),
                SlidingInvisibilityTimeout = TimeSpan.FromMinutes(30),
                QueuePollInterval = TimeSpan.FromHours(5), // Hangfire will poll after 5 hrs to check failed jobs.
                UseRecommendedIsolationLevel = true,
                UsePageLocksOnDequeue = true,
                DisableGlobalLocks = true
            });

        // Reference: https://docs.hangfire.io/en/latest/background-processing/configuring-degree-of-parallelism.html
        var options = new BackgroundJobServerOptions
        {
            WorkerCount = 5
        };

        var server = new BackgroundJobServer(options);

        yield return server;
    }

The worker count is set just to 5.

There are just 4 jobs and even those are completed (SELECT * FROM [HangFire].[State]): enter image description here

Do you have any idea why the Hangfire is hitting so many queries at each second?

Dodd answered 27/5, 2019 at 7:46 Comment(5)
It is not a problem of Hangfire jobs, probably it your db structure that got locked on queries, try to execute actions without hangfire Jobs, your will see that lock still happens try to resolve your locking in db structure , it is not Hangfire problem.Jaynejaynell
Also check your Hangfire Job Dashboard, Are jobs executing successfully??Jaynejaynell
Are the sp_getapplock queries issued by your application queries or hangfire?Froissart
sp_getapplock issue is because of HangFireDodd
QueuePollInterval is the time interval Hnagfire will hit the DB. But what is "SlidingInvisibilityTimeout"?Brietta
P
11

We faced this issue in one of our projects. The hangfire dashboard is pretty read heavy and it polls the hangfire db very frequently to refresh job status.

Best solution that worked for us was to have a dedicated hangfire database. That way you will isolate the application queries from hangfire queries and your application queries won't be affected by the hangfire server and dashboard queries.

Pammy answered 3/6, 2019 at 8:46 Comment(2)
The other good idea is to change DashboardOptions.StatsPollingInterval to a value like 600000 (10 minutes). That slows down how often the hangfire dashboard will automatically refresh (you can always manually refresh).Eppes
We have a dedicated hangfire database but today I encountered this issue in our production environment; as well as all jobs failing with "DistributedLockTimeoutException", it was also affecting the behaviour of our other services that don't reference the Hangfire database. My guess is that as well as locking resources in the Hangfire database, Hangfire was also locking resources in the main database we use (which our Hangfire jobs read and write to).Henden
D
2

There is a newer configuration option called SlidingInvisibilityTimeout when configuring SqlServerStorage that causes these database locks as part of newer fetching non-transactional message fetching algorithm. It is meant for long running jobs that may cause backups of transactional logs to error out (as there is a database transaction that is still active as part of the long running job).

.UseSqlServerStorage(
    "connection_string", 
    new SqlServerStorageOptions { SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5) });

Our DBA did not like the database locks, so I just removed this SlidingInvisibilityTimeout option to use the old transactional based message fetching algorithm since I didn't have any long running jobs in my queue.

Whether you enable this option or not is dependent on your situation. You may want to consider moving your queue database outside of your application database if it isn't already and enable the SlidingInvisibilityTimeout option. If your DBA can't live with the locks even if the queue is a separate database, then maybe you could refactor your tasks into many more smaller tasks that are shorter lived. Just some ideas.

https://www.hangfire.io/blog/2017/06/16/hangfire-1.6.14.html

Douty answered 27/8, 2019 at 14:45 Comment(0)
E
0

SqlServerStorage runs Install.sql which takes an exclusive schema lock on the Hangfire-schema.

DECLARE @SchemaLockResult INT;
EXEC @SchemaLockResult = sp_getapplock @Resource = '$(HangFireSchema):SchemaLock', 
@LockMode = 'Exclusive'

From the Hangfire documentation:

"SQL Server objects are installed automatically from the SqlServerStorage constructor by executing statements described in the Install.sql file (which is located under the tools folder in the NuGet package). Which contains the migration script, so new versions of Hangfire with schema changes can be installed seamlessly, without your intervention."

If you don't want to run this script everytime you could set SqlServerStorageOptions.PrepareSchemaIfNecessary to false.

var options = new SqlServerStorageOptions
{
    PrepareSchemaIfNecessary = false
};

var sqlServerStorage = new SqlServerStorage(connectionstring, options);

Instead run the Install.sql manually by using this line:

SqlServerObjectsInstaller.Install(connection);
Exasperation answered 3/12, 2019 at 16:1 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.