SSIS - Query a database based on result of query from another database
Problem:
I am using SSIS in VS 2013. I need to get a list of IDs from one database and, with that list of IDs, query another database, i.e.
SELECT ... FROM MySecondDB WHERE ID IN ({list of IDs from MyFirstDB})
Original Post: https://stackoverflow.com/questions/43746258/query-a-database-based-on-result-of-query-from-another-database/43988356#43988356
Solution:
There are three methods to achieve this:
1st method - Using Lookup Transformation
First you have to add a Lookup Transformation, as @TheEsisia answered, but there are more requirements:
- In the Lookup you have to write the query that returns the ID list (e.g. SELECT ID FROM MyFirstDB WHERE ...)
- You have to select at least one column from the lookup table
- On its own this will not filter rows; it only adds values from the lookup table to each row
To filter rows as WHERE ID IN ({list of IDs from MyFirstDB}) would, you have to do some work on the Lookup error output (the no-match case). There are two ways:
- Set Error handling to Ignore Row, so the added columns (from the lookup) are NULL for rows without a match, then add a Conditional Split that removes the rows where the value is NULL. Assuming that you have chosen col1 as the lookup column, use an expression similar to ISNULL([col1]) == False
- Or set Error handling to Redirect Row, so all rows without a match are sent to the error output, which can be left unused, so the data is filtered
The disadvantage of this method is that all the data is loaded first and only filtered during execution.
Also, when working over a network, the filtering is done on the local machine after all the data has been loaded into memory (with the 2nd method it is done on the server).
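To make that cost concrete, here is a minimal VB.NET sketch of what the Lookup plus filter amounts to logically: an in-memory check of every incoming row against the ID set. It is not SSIS code. The module name, the FilterByIds helper and the sample rows are hypothetical, chosen only for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module LookupFilterSketch
    ' Hypothetical helper: keeps only the rows whose ID appears in the reference set.
    Function FilterByIds(ByVal rows As IEnumerable(Of KeyValuePair(Of Integer, String)), _
                         ByVal referenceIds As HashSet(Of Integer)) As List(Of KeyValuePair(Of Integer, String))
        Dim kept As New List(Of KeyValuePair(Of Integer, String))
        For Each row As KeyValuePair(Of Integer, String) In rows
            ' Every row is visited, matching or not; this is the cost of method 1.
            If referenceIds.Contains(row.Key) Then
                kept.Add(row)
            End If
        Next
        Return kept
    End Function

    Sub Main()
        ' Made-up sample data standing in for the IDs from MyFirstDB
        ' and the rows read from MySecondDB.
        Dim referenceIds As New HashSet(Of Integer)(New Integer() {1, 3})
        Dim rows As New List(Of KeyValuePair(Of Integer, String))
        rows.Add(New KeyValuePair(Of Integer, String)(1, "a"))
        rows.Add(New KeyValuePair(Of Integer, String)(2, "b"))
        rows.Add(New KeyValuePair(Of Integer, String)(3, "c"))

        For Each row As KeyValuePair(Of Integer, String) In FilterByIds(rows, referenceIds)
            Console.WriteLine(row.Key & ": " & row.Value)
        Next
    End Sub
End Module
```

With the sample data only the rows with IDs 1 and 3 survive, but all three rows had to be read before the filter could run. The other methods avoid this by pushing the ID list into the query itself.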
2nd method - Using Script Task
To avoid loading all the data, you can work around it with a Script Task (this answer is written in VB.NET).
Assuming that the connection manager name is TestAdo, that "Select [ID] FROM dbo.MyTable" is the query that returns the list of IDs, and that User::MyVariableList is the variable in which you want to store the resulting query:
Note: This code reads the connection from the connection manager, and User::MyVariableList must be listed in the Script Task's ReadWriteVariables.
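Public Sub Main()
    Dim lst As New Collections.Generic.List(Of String)
    ' TestAdo must be an ADO.NET connection manager using the SqlClient provider,
    ' otherwise the cast to SqlConnection fails
    Dim myADONETConnection As SqlClient.SqlConnection
    myADONETConnection = _
        DirectCast(Dts.Connections("TestAdo").AcquireConnection(Dts.Transaction), _
        SqlClient.SqlConnection)
    If myADONETConnection.State = ConnectionState.Closed Then
        myADONETConnection.Open()
    End If
    ' Read every ID into the list; Using disposes the command and the reader
    Using myADONETCommand As New SqlClient.SqlCommand("Select [ID] FROM dbo.MyTable", myADONETConnection)
        Using dr As SqlClient.SqlDataReader = myADONETCommand.ExecuteReader()
            While dr.Read()
                lst.Add(dr(0).ToString())
            End While
        End Using
    End Using
    ' Hand the connection back to the connection manager
    Dts.Connections("TestAdo").ReleaseConnection(myADONETConnection)
    ' Store the whole query in the variable (assumes numeric IDs and at least one row)
    Dts.Variables.Item("User::MyVariableList").Value = "SELECT ... FROM ... WHERE ID IN(" & String.Join(",", lst.ToArray()) & ")"
    Dts.TaskResult = ScriptResults.Success
End Sub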
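One edge case in the string concatenation is worth guarding against: if the first query returns no rows, the generated text ends in IN(), which is a syntax error in T-SQL. The following is a minimal sketch of a guard, not part of the original answer. BuildInQuery is a hypothetical helper, it assumes the IDs are integers, and it falls back to a predicate that matches nothing.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization

Module InClauseSketch
    ' Hypothetical helper: appends a WHERE clause for the given integer IDs
    ' to a base query that has no WHERE clause of its own.
    Function BuildInQuery(ByVal baseQuery As String, ByVal ids As IEnumerable(Of Integer)) As String
        Dim parts As New List(Of String)
        For Each id As Integer In ids
            ' Invariant culture keeps the digits free of locale-specific formatting
            parts.Add(id.ToString(CultureInfo.InvariantCulture))
        Next

        If parts.Count = 0 Then
            ' An empty IN () list is invalid SQL, so return a query that yields no rows
            Return baseQuery & " WHERE 1 = 0"
        End If

        Return baseQuery & " WHERE ID IN (" & String.Join(",", parts.ToArray()) & ")"
    End Function

    Sub Main()
        Console.WriteLine(BuildInQuery("SELECT * FROM MySecondDB", New Integer() {1, 2, 3}))
        Console.WriteLine(BuildInQuery("SELECT * FROM MySecondDB", New Integer() {}))
    End Sub
End Module
```

In the Script Task, the return value of such a helper would be assigned to User::MyVariableList in place of the inline concatenation. Typing the IDs as integers also keeps arbitrary text out of the generated SQL; string IDs would need quoting and escaping, which this sketch does not cover.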
The User::MyVariableList variable should then be used as the source (SQL command from variable).
3rd method - Using Execute SQL Task
Similar to the second method, but this builds the IN clause using an Execute SQL Task and then uses the whole query as the OLE DB Source:
- Add an Execute SQL Task before the DataFlow Task
- Set the ResultSet property to Single row
- Select User::MyVariableList as the Result Set variable
- Use the following SQL command
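DECLARE @str AS VARCHAR(4000)
SET @str = ''
-- Append each ID followed by a comma
SELECT @str = @str + CAST([ID] AS VARCHAR(255)) + ',' FROM dbo.MyTable
-- Drop the trailing comma and wrap the list in the final query
SET @str = 'SELECT * FROM MySecondDB WHERE ID IN (' + SUBSTRING(@str, 1, LEN(@str) - 1) + ')'
SELECT @str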
Make sure that you have set the DataFlow Task Delay Validation property to True.


