![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
I've used DTS for simple tasks for years, largely importing and translating data from one database to another. However, I've never really scratched the surface of what DTS can do. Now I have some need to do a bit more, and I'm looking for a little advice on where to start looking, and what methods you might suggest. Here's the situation. I have a number of local (but identical structure) databases that I need to syncronize (one way - remote to local) with a remote database. I don't have direct network access to the SQL Server, but to a front end web server. I've written a program to do this already, using C#, but am not really happy with the results. I'd like to convert this to a DTS solution. My current solution uses a Web Service to retrieve the data from the remote server (over the internet). As an intermediate step, I copy this data to a local scratch database. This is rather straight forward, since I don't need to massage the data at all. I can do a direct copy of data that has been updated since the last poll, right down to using the same primary keys. Here's where the situation gets more complicated. Each local database is independant, and may have new records added or removed outside of the syncronization process, so I can't just copy the rows using the remote databases keys. The keys will be different. I have to somehow map the remote keys to local keys for updates. I also have to generate new keys for inserts (not a problem, there's stored procedures to do that. It doesn't use identity fields). In my scratch database I keep a mapping of all records I've previously imported and where they map in each database. Then I do a translation, compare each field for changes, and update those fields that have changed, or insert new records. A complication is that I can't just update each field, regardless of whether it's changed or not because there are some triggers that should only be called when a field has actually changed. I can't rewrite these triggers because they're part of a third party solution. My first thought was that I could setup triggers on my scratch database so that when rows are updated, they automatically call a DTS package that updates the other databases. Then, I can still run my program that updates the scratch database, but leave the rest up to SQL Server. Because of my relative unfamiliarity with more complex DTS programming, I'm at a loss to figure out the best way to design this process. Questions that arise are: Can you programmatically tell a DTS package which databases to update? Yes Can you programmatically pass a list of databases to a DTS package and get it to update all of them in the same manner? Yes An alternative solution might be to use a table in my scratch database to list the databases the DTS package should update. Can you update the connection information from within the DTS package based on a field in a table? Yes Can you call a DTS package from a trigger? Yes I assume this is the only way to get SQL Server to execute cross-database queries. Is there a better way to do this? (not sure what exaclty you mean) Any suggestions on how to approach this problem? Any comments on my approach? Thanks. |
#3
| |||
| |||
|
|
Erik, This kind of synchronization is never fun nor is it ever straight forward. First, it may help to know that you can disable triggers temporarily do your work and reenable them. ALTER TABLE tablename DISABLE TRIGGER ALL -- do sychronization work here ALTER TABLE tablename ENABLE TRIGGER ALL This would prevent your sychronization work from triggering your triggers (I apologize for the rudundancy ![]() In regards to mapping id's I wonder if you could just have a global lookup table in a separate database or in all the local databases: RemoteLocalIdLookup Table Columns ---------------- LocalDatabase RemoteKey LocalKey Obviously you will have to fill this table upon creation with the existing matches. But it would be a clean solution moving forward. If the local record isn't part of the synchronization process it shouldn't be in the table. (I realize your situation may not be this easy, just a thought that came to mind) I'm not sure why exactly you want to use dts for this solution...Can you explain why you've decided to use dts? How often does the synchronization occur (how many times per day)? You may want to set up a sql job to copy over the remote database (the job can call your webservice if you plan to keep that code). Then use T-SQL to update the local database tables from the remote/scratch database tables. When you ask about cross database queries are you refering to quering 2 tables from different databases? If so that can be done by denoting the database name with 2 decimals. The following query would return all of the data from database1..table1 that are not in database2..table1. Select a.* From database1..table1 a Left Join database2..table1 b On (a.tableid = b.tableid) Where b.tableid is null I've given yes or no answers to your questions below, and I'd be happy to discuss this further and provide more details on your questions if you choose to use DTS as you're solution. Jason Erik Funkenbusch wrote: I've used DTS for simple tasks for years, largely importing and translating data from one database to another. However, I've never really scratched the surface of what DTS can do. Now I have some need to do a bit more, and I'm looking for a little advice on where to start looking, and what methods you might suggest. Here's the situation. I have a number of local (but identical structure) databases that I need to syncronize (one way - remote to local) with a remote database. I don't have direct network access to the SQL Server, but to a front end web server. I've written a program to do this already, using C#, but am not really happy with the results. I'd like to convert this to a DTS solution. My current solution uses a Web Service to retrieve the data from the remote server (over the internet). As an intermediate step, I copy this data to a local scratch database. This is rather straight forward, since I don't need to massage the data at all. I can do a direct copy of data that has been updated since the last poll, right down to using the same primary keys. Here's where the situation gets more complicated. Each local database is independant, and may have new records added or removed outside of the syncronization process, so I can't just copy the rows using the remote databases keys. The keys will be different. I have to somehow map the remote keys to local keys for updates. I also have to generate new keys for inserts (not a problem, there's stored procedures to do that. It doesn't use identity fields). In my scratch database I keep a mapping of all records I've previously imported and where they map in each database. Then I do a translation, compare each field for changes, and update those fields that have changed, or insert new records. A complication is that I can't just update each field, regardless of whether it's changed or not because there are some triggers that should only be called when a field has actually changed. I can't rewrite these triggers because they're part of a third party solution. My first thought was that I could setup triggers on my scratch database so that when rows are updated, they automatically call a DTS package that updates the other databases. Then, I can still run my program that updates the scratch database, but leave the rest up to SQL Server. Because of my relative unfamiliarity with more complex DTS programming, I'm at a loss to figure out the best way to design this process. Questions that arise are: Can you programmatically tell a DTS package which databases to update? Yes Can you programmatically pass a list of databases to a DTS package and get it to update all of them in the same manner? Yes An alternative solution might be to use a table in my scratch database to list the databases the DTS package should update. Can you update the connection information from within the DTS package based on a field in a table? Yes Can you call a DTS package from a trigger? Yes I assume this is the only way to get SQL Server to execute cross-database queries. Is there a better way to do this? (not sure what exaclty you mean) Any suggestions on how to approach this problem? Any comments on my approach? Thanks. |
#4
| |||
| |||
|
|
Jason, Thanks for your responses. I wasn't aware you could do cross database queries and joins. That will really simplify my solution (right now i'm repeating the process for each database). I had assumed that the only way to do cross-database queries was to use something like DTS, i'm glad i was wrong. Yes, I do already keep a table in my scratch database that maps the records, it's exactly as you suggest. I have a few more questions. When you say I can call my web service from a SQL Job... how? My web service returns an ADO.NET dataset. I can't quite seem to imagine how I could get a T-SQL job to parse this and create recordsets I could use to populate my scratch database with. I know I could use SQLIIS (or whatever it's called) to do queries against the remote database, but it's very expensive to do this and I want to return all data in one resultset that I can then work with locally. I'm also hesitant to use such technology without knowing the security implications. I understand web services pretty well. Or are you suggesting that I call an executable from a SQL Job that copies my remote database and populates my local scratch db? Syncronization is frequent, as often as every minute. I'd like to figure out a way to make this event driven, so that it only updates when there's new data, but I think I have to have at least some kind of polling. You've given me a lot to chew on, Thanks. On 9 Oct 2006 15:30:27 -0700, JasonPirtle (AT) gmail (DOT) com wrote: Erik, This kind of synchronization is never fun nor is it ever straight forward. First, it may help to know that you can disable triggers temporarily do your work and reenable them. ALTER TABLE tablename DISABLE TRIGGER ALL -- do sychronization work here ALTER TABLE tablename ENABLE TRIGGER ALL This would prevent your sychronization work from triggering your triggers (I apologize for the rudundancy ![]() In regards to mapping id's I wonder if you could just have a global lookup table in a separate database or in all the local databases: RemoteLocalIdLookup Table Columns ---------------- LocalDatabase RemoteKey LocalKey Obviously you will have to fill this table upon creation with the existing matches. But it would be a clean solution moving forward. If the local record isn't part of the synchronization process it shouldn't be in the table. (I realize your situation may not be this easy, just a thought that came to mind) I'm not sure why exactly you want to use dts for this solution...Can you explain why you've decided to use dts? How often does the synchronization occur (how many times per day)? You may want to set up a sql job to copy over the remote database (the job can call your webservice if you plan to keep that code). Then use T-SQL to update the local database tables from the remote/scratch database tables. When you ask about cross database queries are you refering to quering 2 tables from different databases? If so that can be done by denoting the database name with 2 decimals. The following query would return all of the data from database1..table1 that are not in database2..table1. Select a.* From database1..table1 a Left Join database2..table1 b On (a.tableid = b.tableid) Where b.tableid is null I've given yes or no answers to your questions below, and I'd be happy to discuss this further and provide more details on your questions if you choose to use DTS as you're solution. Jason Erik Funkenbusch wrote: I've used DTS for simple tasks for years, largely importing and translating data from one database to another. However, I've never really scratched the surface of what DTS can do. Now I have some need to do a bit more, and I'm looking for a little advice on where to start looking, and what methods you might suggest. Here's the situation. I have a number of local (but identical structure) databases that I need to syncronize (one way - remote to local) with a remote database. I don't have direct network access to the SQL Server, but to a front end web server. I've written a program to do this already, using C#, but am not really happy with the results. I'd like to convert this to a DTS solution. My current solution uses a Web Service to retrieve the data from the remote server (over the internet). As an intermediate step, I copy this data to a local scratch database. This is rather straight forward, since I don't need to massage the data at all. I can do a direct copy of data that has been updated since the last poll, right down to using the same primary keys. Here's where the situation gets more complicated. Each local database is independant, and may have new records added or removed outside of the syncronization process, so I can't just copy the rows using the remote databases keys. The keys will be different. I have to somehow map the remote keys to local keys for updates. I also have to generate new keys for inserts (not a problem, there's stored procedures to do that. It doesn't use identity fields). In my scratch database I keep a mapping of all records I've previously imported and where they map in each database. Then I do a translation, compare each field for changes, and update those fields that have changed, or insert new records. A complication is that I can't just update each field, regardless of whether it's changed or not because there are some triggers that should only be called when a field has actually changed. I can't rewrite these triggers because they're part of a third party solution. My first thought was that I could setup triggers on my scratch database so that when rows are updated, they automatically call a DTS package that updates the other databases. Then, I can still run my program that updates the scratch database, but leave the rest up to SQL Server. Because of my relative unfamiliarity with more complex DTS programming, I'm at a loss to figure out the best way to design this process. Questions that arise are: Can you programmatically tell a DTS package which databases to update? Yes Can you programmatically pass a list of databases to a DTS package and get it to update all of them in the same manner? Yes An alternative solution might be to use a table in my scratch database to list the databases the DTS package should update. Can you update the connection information from within the DTS package based on a field in a table? Yes Can you call a DTS package from a trigger? Yes I assume this is the only way to get SQL Server to execute cross-database queries. Is there a better way to do this? (not sure what exaclty you mean) Any suggestions on how to approach this problem? Any comments on my approach? Thanks. |
![]() |
| Thread Tools | |
| Display Modes | |
| |