My Azure role pulls items to process from a database: it holds an instance of System.Data.SqlClient.SqlConnection, periodically creates a SqlCommand instance, and executes an SQL query.
Once in a while (usually once every several days), running a query throws a SqlException with this message:
The service has encountered an error processing your request. Please try again. Error code 40143. A severe error occurred on the current command. The results, if any, should be discarded.
I've seen this many times, and my code now catches it, calls Dispose() on the SqlConnection instance, then reopens the connection and retries the query. The retry typically fails with another SqlException:
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
This looks very much like the SQL Azure server not responding or being unavailable for whatever reason.
Currently my code doesn't catch this second exception, so it propagates out of RoleEntryPoint.Run() and the role is restarted. A restart typically takes about ten minutes, and once it completes the problem is gone for a day or so.
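The handling described above amounts to roughly the following. This is a simplified sketch rather than the actual code: the QueryRunner class and its members are hypothetical names, and it assumes the query returns a single scalar value.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical helper illustrating the single-retry handling described above.
public sealed class QueryRunner : IDisposable
{
    // Assumed to come from the role's configuration.
    private readonly string connectionString;
    private SqlConnection connection;

    public QueryRunner(string connectionString)
    {
        this.connectionString = connectionString;
        this.connection = Open();
    }

    private SqlConnection Open()
    {
        var conn = new SqlConnection(connectionString);
        conn.Open();
        return conn;
    }

    private object Run(string sql)
    {
        using (var command = new SqlCommand(sql, connection))
        {
            return command.ExecuteScalar();
        }
    }

    public object ExecuteScalar(string sql)
    {
        try
        {
            return Run(sql);
        }
        catch (SqlException)
        {
            // First failure (e.g. error code 40143): dispose the connection and reopen it.
            connection.Dispose();
            connection = Open();

            // Single retry. A second SqlException (e.g. "Timeout expired") is not
            // caught here, so it propagates to the caller and, eventually, out of
            // RoleEntryPoint.Run().
            return Run(sql);
        }
    }

    public void Dispose()
    {
        connection.Dispose();
    }
}
```

The point of the sketch is that there is exactly one retry with no delay, so a server that stays unavailable for more than a moment brings the role down.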
I don't like my role restarting: it takes a while, and my service's functionality is hindered in the meantime. I'd like to do something smarter.
What would be a strategy to address this problem? Should I retry the query several times, and if so, how many times and at what interval? Should I do something else? When do I give up and let the role just restart?