dbTalk Databases Forums  

SSIS - check for duplicate file in folder??

microsoft.public.sqlserver.dts


  #1  
unc27932@yahoo.com
SSIS - check for duplicate file in folder?? - 03-06-2006, 05:30 PM

I've got two folders: Folder1 contains files that I want to process, and
Folder2 contains files that I've processed in the past. Before I process
a file in Folder1, I want to check whether a file with the same name is
already in Folder2. I've got it working, but it's painfully slow (I've
got about 325 files in Folder2). I'm using nested Foreach Loop
containers (the outer container loops over Folder1, the inner container
over Folder2). Each of them has a variable mapped to the filename; then,
in a Script Task, I compare the filenames and move the file if it's
already been processed. It takes 3-5 minutes to run through the 300+
filenames on this network drive. What would be a better way to do
this?


  #2  
Paul Smith
Re: SSIS - check for duplicate file in folder?? - 03-07-2006, 01:33 AM

I just had to do one of these for a migration, and it runs quickly with
2000+ files. This is the approach I took:

For Each File
    Script Task that checks for existence in the target and sets a
    variable to 1 or 0
    File System Task that does the copy; the constraint is
    success && Variable = 1

Paul
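
For readers who want to see the shape of that loop outside SSIS, here is a minimal Python sketch of the same idea: one existence check per file, a 1/0 flag, and a copy that runs only when the flag is 1. It is an illustration, not Paul's script. The function name and both folder arguments are hypothetical, and the sketch assumes that 1 means "not in the target yet", which the post does not spell out.

```python
import os
import shutil


def copy_unprocessed(source_dir, target_dir):
    """Copy files from source_dir that are not yet present in target_dir.

    Returns the names that were copied. Both directory arguments are
    hypothetical stand-ins for the two folders in the thread.
    """
    copied = []
    for name in sorted(os.listdir(source_dir)):
        source_path = os.path.join(source_dir, name)
        if not os.path.isfile(source_path):
            continue  # skip subfolders
        target_path = os.path.join(target_dir, name)
        # "Script task" step: one existence check per file, stored as 1 or 0.
        # Assumption: 1 means the file still needs to be copied.
        needs_copy = 0 if os.path.exists(target_path) else 1
        # "Constraint" step: the copy only runs when the flag is 1.
        if needs_copy == 1:
            shutil.copy2(source_path, target_path)
            copied.append(name)
    return copied
```

The point of the design is that each source file costs a single lookup in the target folder, instead of a full pass over every file already there.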

  #3  
alanwo@gmail.com
Re: SSIS - check for duplicate file in folder?? - 03-07-2006, 01:33 AM

Have you tried NoClone? It uncovers duplicate files by true
byte-by-byte comparison, so it compares contents and not just file names.
http://noclone.net
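
To make the difference between a name match and a content match concrete, here is a small Python sketch using the standard library's `filecmp` module. It is not how NoClone works internally, which this thread does not describe; the function name and folder arguments are hypothetical. Note that the original question only asks about matching file names, so this is a stricter check than was requested.

```python
import filecmp
import os


def find_content_duplicates(incoming_dir, processed_dir):
    """Return names in incoming_dir whose bytes match a same-named file in processed_dir."""
    duplicates = []
    for name in sorted(os.listdir(incoming_dir)):
        incoming_path = os.path.join(incoming_dir, name)
        processed_path = os.path.join(processed_dir, name)
        # Only compare when both sides have a regular file with this name.
        if not (os.path.isfile(incoming_path) and os.path.isfile(processed_path)):
            continue
        # shallow=False compares file contents instead of trusting
        # size and modification time alone.
        if filecmp.cmp(incoming_path, processed_path, shallow=False):
            duplicates.append(name)
    return duplicates
```

Reading every file in full is much more expensive than a name lookup, especially on a network drive, so this only makes sense when same-named files may differ.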


  #4  
unc27932@yahoo.com
Re: SSIS - check for duplicate file in folder?? - 03-07-2006, 07:11 AM

Paul - what script code did you use for the "script task that checks
for existence in target"? I'm way more familiar with SQL than VBScript
or other scripting languages.



  #5  
Paul Smith
Re: SSIS - check for duplicate file in folder?? - 03-08-2006, 01:22 AM

I used the Script Task. Send me an email and I will give you the code.

Paul

  #6  
unc27932@yahoo.com
Re: SSIS - check for duplicate file in folder?? - 03-08-2006, 07:34 AM

I think I figured it out, using the FileSystemObject:

    Dim fso As Object
    Dim Filename As String
    Dim FullFilePath As String

    'Get the filename from the variable
    '(takes the last 13 characters, so every filename must be 13 long)
    Filename = Right(Dts.Variables("FileName").Value.ToString(), 13)
    'ProcessedPath must end with a backslash for this to form a valid path
    FullFilePath = Dts.Variables("ProcessedPath").Value.ToString() & Filename

    'Create the FSO (late-bound COM object)
    fso = CreateObject("Scripting.FileSystemObject")

    'Check whether the file already exists in the processed folder
    If fso.FileExists(FullFilePath) Then
        Dts.TaskResult = Dts.Results.Success
    Else
        Dts.TaskResult = Dts.Results.Failure
    End If
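
Two details of the script above are worth a hedged illustration: it takes the last 13 characters as the filename, which only holds while every name has that length, and it asks the network drive about each file separately. The Python sketch below shows one possible alternative, not a drop-in replacement for the Script Task: list the processed folder once, keep the names in a set, and take the whole file name from each path. The function name and arguments are hypothetical, and lowercasing the names is an assumption that the folders live on a case-insensitive Windows share.

```python
import os


def split_by_processed(incoming_paths, processed_dir):
    """Split incoming file paths into (already_processed, new) lists by file name."""
    # One directory listing instead of one lookup per incoming file.
    # Assumption: names are compared case-insensitively, as on Windows.
    processed_names = {name.lower() for name in os.listdir(processed_dir)}
    already_processed, new = [], []
    for path in incoming_paths:
        # basename keeps the whole file name, whatever its length.
        name = os.path.basename(path)
        if name.lower() in processed_names:
            already_processed.append(path)
        else:
            new.append(path)
    return already_processed, new
```

With a few hundred files, the set lookup itself is effectively free; whatever time is left is the single directory listing over the network.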


