How does SSIS manage closing connections? Can I force it?
Asked Answered
B

1

6

tl;dr version

I get errors when using OLE DB (SNC10.0) connection managers after a few nights of running, could connections not be properly timing out? Switching to ADO.NET Connection Managers and sources seems to fix it, why?

I apologize for the generic title but there are too many details to state in a single line.

Technology:

In all cases the database server, both source and destination are SQL Server 2008 R2

The setup:

I have a set of SSIS packages that run one after another in the middle of the night. There are 7 of them currently. They all perform a similar set of tasks: they first connect to a source database and copy the data to a staging database. Then they do various transformations within the staging database. Finally the process connects to a target database and fills it with data.

I set up all the connections as OLE DB connections (SQL Native Client 10.0) so that I can use them with Lookup components and other OLE-specific components.

The problem:

We have been experiencing issues repeatedly with our automated runs of the SSIS package. Generally I will test it running manually from my station, it will run fine; then we'll save the SSIS package into the SQL Server and schedule it and it will run fine. A few nights later, we will get an issue such as:

SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft OLE DB Provider for SQL Server" Hresult: 0x80004005 Description: "Protocol error in TDS stream".

or

SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "Unknown token received from SQL Server".

When searched for online, both of these point toward connection issues and specifically network connectivity problems.

Work-around:

I found that a quick, if not always simple, solution to these issues is to replace the source nodes with ADO NET Sources rather than OLE DB Sources. This is acceptable within my Data Flow tasks for some cases, but in the cases where I need to use a Lookup component, or some other such tool that only works with the OLE sources, this is not a good enough solution if I will still encounter these issues.

Question:

I know there are tons of differences between ADO.NET and OLE DB connections, but one primary thing I noticed is that the OLE DB connection manager has two timeouts, both defaulted to the value of '0' which generally means disabled (no timeout). The ADO.NET connection manager has a single timeout and it's set to the value of '15' (15 seconds).

How do these two connection managers handle the time-outs and closing connections? With a value of 0 in an OLE DB connection manager timeout, will that connection never be closed unless something is done on the SQL Server? Could this be part of my issue, with so many data-flow tasks opening OLE DB connections and then not being closed? Is there anything I can do in an SSIS package to forcefully close these connections?

****EDIT****

Here is a screenshot of the data flow task in question. I have altered some of the names to protect the innocent, etc.

enter image description here

The task as pictured here will run completely fine and works 100% of the time. If I change that ADO.NET Source to an OLE DB Source I get the errors mentioned in the post. I have, in some other cases, went through and eliminated Lookups by expanding the source query. In this task I have not.

Bagatelle answered 5/12, 2012 at 22:32 Comment(5)
We like details here ;P Just to get the basics out of the way: how are your systems doing patch-wise? Are you manually creating OLE DB connections from Script Tasks/Components? Assuming the package doesn't fail, how long does it typically run? Are you using MSDTC to handle transactions? Does it consistently fail on a specific package? How about consistent failure on a specific task?Galvanic
Thank you for the questions. Okay, patch-wise I will verify but I believe they are up to date. I am manually creating OLE DB connection managers down in the bottom pane, then inside data flow tasks creating OLE DB Data Source component instances and pointing them to the connection managers. The package that is the only one failing currently runs for ~30 minutes. I am not using MSDTC and not actually doing anything transactional; currently we restore on failure. It consistently fails on a particular data-flow task, which is the largest from the perspective of # of rows (~3 million rows)Bagatelle
Of note, the package runs for 30 minutes, accessing two different databases on two servers, runs through 7 different 'entities' of data, the top level objects, it is transferring only 300-500 rows, and then the bottom-most object in the graph is the afore-mentioned ~3 million row table.Bagatelle
Could you pop a screenshot of the dataflow task that is having the failures?Galvanic
Certainly, thanks @GalvanicBagatelle
B
3

We found out what the source of all our issues were and the reason the description of problems didn't match the description of the environment and there were not enough clues to solve.

Finally everything blew up and we found out that "nothing has changed on the network or servers" was not the case.

There was a backup occurring in the middle of our jobs. That backup was using volume shadow copy and was backing up both the production databases as well as tempdb. Because of disk IO issues/locks, the tempdb had grown to half a terrabyte due to unwritten changes and then that was further attempting to be shadow-copied.

Turning off the backup/shadow copy on tempdb and the production database caused the jobs to go through immediately. Queries that were taking > 30 minutes are now < 1 minute.

Thank you guys for sticking with me and thinking through it.

Bagatelle answered 19/12, 2012 at 16:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.