SSIS - Dynamically loop over multiple databases
Asked Answered
J

1

9

I have to consolidate data from from 1000+ databases having the same structure/tables in one unique DB. DBs may be added and removed potentially on a daily basis so I need to retrieve the list of DBs dynamically and run the dynamically generated SQL query to extract data on each of them.

I designed the Data Flow with a query from a variable that is working fine if executed with a static value:

enter image description here

With a SQL task I get the list of instances, I loop over the them and with a nested Foreach Loop/SQL task I retrieve the database names and create the dynamic SQL with the following statement (DB name is anonymized):

SELECT 'select ''' + name + ''' as DatabaseName, ID from ' + name + '.[dbo].[Orders] as querytext FROM sys.databases WHERE name LIKE ( 'XXX%_%' );

This part is also working fine:

enter image description here

How can I use the result of the SQL task "Execute SQL Task - Get query text" as query to be executed in the Source "OLE DB Source 1" (part of "Data Flow Task 3")?

I tried mapping an Object variable "User::SqlCommandFromSQLTask" in the result set of the SQL task, then set it up as ADO object source variable and with a Script task convert it to string and pass the value to the variable SqlStringFromSQLTask3 (used as source in "OLE DB Source 1") but I get the error Violation of PRIMARY KEY constraint, like if the data flow is always running with a static value I set up as default:

enter image description here

enter image description here

enter image description here

While, if I remove the value from the variable panel, I get the error "Command text was not set for the command object.", even changing the property DelayValidation of the Data Flow to false.

Any help is much appreciated.

Jodhpur answered 30/1, 2018 at 15:38 Comment(14)
Are all the databases on the same Instance? Also, are you able to share an example of the Dynamic SQL you are generating? I also assume that all the tables have the same DDL (if not, you're in for a world of "pain").Cryoscopy
Don't use script tasks to shred objects into strings. Configure the ForEach task instead. Provide one variable per column, using correct data types. Inside the loop, the task will populate with variable(s) from the current row. If source databases are all the same you avoid dynamic SQL. Use the loop to loop over the db and/or server names.Sphygmomanometer
What is the SQL command that you are using in the Execute SQL Task?Rumania
Also what is the Code that you are using in the Script Task?Rumania
Also how did you configured variable mappings in for each loop container?Rumania
@Larnu: All the DBs are on the same instance. I started from a simple Dynamic SQL that returns the following for example: "select count(*) as row_count from [xxx].[dbo].[Orders]" (db name anonymized indeed, xxx is the variable part). Confirm that the starting point is that all the tables across the databases have the same name and structure.Jodhpur
@destination-data: will try also that approach, but seems more rigid and time consuming, since I will have 30+ fields to extract once the base loop is working.Jodhpur
@Hadi: I use "SELECT 'select count(*) as row_count from ' + name + '.[dbo].[Orders]' as querytext FROM sys.databases WHERE name LIKE ( 'xxx%' );" as command for Execute SQL Task. Script task is "Dts.Variables("User::SqlStringFromSQLTask").Value = Convert.ToString(Dts.Variables("User::SqlCommandFromSQLTask").Value)", where User::SqlCommandFromSQLTask is ReadOnlyVariable and User::SqlStringFromSQLTask ReadWrite.Jodhpur
@Jodhpur please add these details to your question it will be more clearRumania
We really need proper details, and for them to go in the question, not on the comments. Please edit your question with them. I also, therefore, don't understand what your script task is achieving here. Why is it doing a COUNT(*) and why is it a script task?Cryoscopy
@Larnu: apologize I could not follow up earlier. I consolidated all the replies to comments in the initial post, hope it can shed more light on the issue.Jodhpur
Thanks, however, you haven't answered as to why you're using a script task, what it is meant to achieve, and why it's doing a COUNT(*)Cryoscopy
I used script task because it's the only method I know to generate dynamic sql queries based on DB name :-) I will replace the count(*) part since I understand it may raise confusion.Jodhpur
@Larnu: I updated the post with better structured details, hope that the goal and the approach taken is more clear now.Jodhpur
B
1

When I have used SSIS to connect to multiple SQL Server boxes, I have stored those SQL Server connection strings in a table in a central database. Then I use a query of that table as the input to the foreach loop data flow task. If we ever have to change a sql server connection string, which does happen, we just update that table with the newest value.

Boner answered 23/6, 2018 at 16:33 Comment(1)
Thanks, that also makes sense, We also don't add new servers frequently. For the moment I stick with the solution of dynamic SQL which is the only way I found to make it work.Jodhpur

© 2022 - 2024 — McMap. All rights reserved.