I've got some big flat files that need to be loaded into a relational database. They need heavy scrubbing first: removing duplicates, generating keys, and transforming data values. It looks like I can process some of the data in parallel to speed things up a bit. For the scrubbing, should I use DTS parallel processing that calls stored procs to do the dirty work, or write a multi-threaded C# app that reads the data into datasets at one end, does the scrubbing, and writes the clean data out at the other end? I'm not sure about the C# approach: given the size of the files, reading the data into memory might prove too costly, and I've never done data scrubbing in C# before. Does anyone have thoughts on which approach I should take, or any experience to share?
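To make the memory concern concrete, here is a rough sketch of what a streaming version of the second approach might look like, as opposed to loading whole files into datasets. It is written in Python purely for illustration and is not a tested design for my actual data. The column names (`name`, `email`), the choice of `email` as the duplicate key, and the helpers `normalize`, `scrub_rows`, and `scrub_stream` are all hypothetical. Rows are read, cleaned, and written one at a time, so the only thing that grows in memory is a set of fixed-size hashes of the keys already seen.

```python
import csv
import hashlib


def normalize(row):
    """Hypothetical value transformation: trim whitespace and fix casing."""
    return {
        "name": row["name"].strip().title(),
        "email": row["email"].strip().lower(),
    }


def scrub_rows(rows, key_fields, transform):
    """Yield cleaned rows one at a time, skipping duplicates and adding a surrogate key.

    Only a 32-byte digest per distinct key is kept in memory, not the rows themselves.
    """
    seen = set()
    next_id = 1
    for row in rows:
        row = transform(row)
        # Build the duplicate-detection key from the cleaned values.
        natural_key = "\x1f".join(row[field] for field in key_fields)
        digest = hashlib.sha256(natural_key.encode("utf-8")).digest()
        if digest in seen:
            continue  # duplicate of a row already written
        seen.add(digest)
        yield {"id": next_id, **row}
        next_id += 1


def scrub_stream(src, dst, key_fields=("email",)):
    """Read CSV text from src, write scrubbed CSV to dst, return the number of rows kept."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=["id", "name", "email"], lineterminator="\n")
    writer.writeheader()
    kept = 0
    for clean in scrub_rows(reader, key_fields, normalize):
        writer.writerow(clean)
        kept += 1
    return kept
```

With real files, `src` and `dst` would be file objects opened with `newline=""`. Note that this sketch is single-threaded; running several of these in parallel over different parts of the data would need a shared or partitioned view of the keys already seen, which is part of what I'm unsure about.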