SSIS - Query a database based on result of query from another database
Problem:
I am using SSIS in VS 2013. I need to get a list of IDs from 1 database, and with that list of IDs, I want to query another database, ie
Solution:
Original Post: https://stackoverflow.com/questions/43746258/query-a-database-based-on-result-of-query-from-another-database/43988356#43988356
I am using SSIS in VS 2013. I need to get a list of IDs from 1 database, and with that list of IDs, I want to query another database, ie
SELECT ... from MySecondDB WHERE ID IN ({list of IDs from MyFirstDB})
Solution:
There is 3 Methods to achieve this:
1st method - Using Lookup Transformation
First you have to add a
Lookup Transformation
like @TheEsisia answered but there are more requirements:- In the Lookup you Have to write the query that contains the ID list (ex:
SELECT ID From MyFirstDB WHERE ...
) - At least you have to select one column from the lookup table
- These will not filter rows , but this will add values from the second table
To filter rows
WHERE ID IN ({list of IDs from MyFirstDB})
you have to do some work in the look up error output Error case
there are 2 ways:- set Error handling to
Ignore Row
so the added columns (from lookup) values will be null , so you have to add aConditional split
that filter rows having values equal NULL.
Assuming that you have chosen
col1
as lookup column so you have to use a similar expressionISNULL([col1]) == False
- Or you can set Error handling to
Redirect Row
, so all rows will be sent to the error output row, which may not be used, so data will be filtered
The disadvantage of this method is that all data is loaded and filtered during execution.
Also if working on network filtering is done on local machine (2nd method on server) after all data is loaded is memory.
2nd method - Using Script Task
To avoid loading all data, you can do a workaround, You can achieve this using a Script Task: (answer writen in VB.NET)
Assuming that the connection manager name is
TestAdo
and "Select [ID] FROM dbo.MyTable"
is the query to get the list of id's , and User::MyVariableList
is the variable you want to store the list of id's
Note: This code will read the connection from the connection manager
Public Sub Main()
Dim lst As New Collections.Generic.List(Of String)
Dim myADONETConnection As SqlClient.SqlConnection
myADONETConnection = _
DirectCast(Dts.Connections("TestAdo").AcquireConnection(Dts.Transaction), _
SqlClient.SqlConnection)
If myADONETConnection.State = ConnectionState.Closed Then
myADONETConnection.Open()
End If
Dim myADONETCommand As New SqlClient.SqlCommand("Select [ID] FROM dbo.MyTable", myADONETConnection)
Dim dr As SqlClient.SqlDataReader
dr = myADONETCommand.ExecuteReader
While dr.Read
lst.Add(dr(0).ToString)
End While
Dts.Variables.Item("User::MyVariableList").Value = "SELECT ... FROM ... WHERE ID IN(" & String.Join(",", lst) & ")"
Dts.TaskResult = ScriptResults.Success
End Sub
And the
User::MyVariableList
should be used as source (Sql command in a variable)3rd method - Using Execute Sql Task
Similar to the second method but this will build the IN clause using an
Execute SQL Task
then using the whole query as OLEDB Source
,- Just add an Execute SQL Task before the DataFlow Task
- Set
ResultSet
property tosingle
- Select
User::MyVariableList
as Result Set - Use the following SQL command
DECLARE @str AS VARCHAR(4000) SET @str = '' SELECT @str = @str + CAST([ID] AS VARCHAR(255)) FROM dbo.MyTable SET @str = 'SELECT * FROM MySecondDB WHERE ID IN (' + SUBSTRING(@str,1,LEN(@str) - 1) + ')' SELECT @str
Make sure that you have set the
DataFlow Task
Delay Validation
property to True
Original Post: https://stackoverflow.com/questions/43746258/query-a-database-based-on-result-of-query-from-another-database/43988356#43988356
Comments
Post a Comment