dbTalk Databases Forums  

Regarding flat file read

microsoft.public.sqlserver.dts microsoft.public.sqlserver.dts


Discuss Regarding flat file read in the microsoft.public.sqlserver.dts forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Sreenath
 
Posts: n/a

Default Regarding flat file read - 11-17-2004 , 08:19 PM






hi,
I have flat file with atleast 500,000 rows with fixed fields. But, the rows
were divided into 5 different types of rows. First 3 columns are same, 4th
column tells us what type of row it is, in all rows and each row length is
around 80. The remaining columns are different based on the row type.

My question is, Is it better to dump everything into one temp table using
bulk insert task and then write a stored procedure to separate the data into
different tables based on the 4th column ?or Write 5 different datapump tasks
and for each task based on the column type insert into a table.
Looks like the first idea is good but I am worrying about the cursor memory
as it has to hold 500,000 rows of data and each row contains 5 fields.

Thanks in Advance...
Sreenath

Reply With Quote
  #2  
Old   
Allan Mitchell
 
Posts: n/a

Default Re: Regarding flat file read - 11-18-2004 , 12:31 AM






Personal preference would be

BULK INSERT the file as is into a working table
Use n * Datapumps to extract the relevant parts of the data.

--

Allan Mitchell MCSE,MCDBA, (Microsoft SQL Server MVP)
www.SQLDTS.com - The site for all your DTS needs.
www.Konesans.com


"Sreenath" <Sreenath (AT) discussions (DOT) microsoft.com> wrote

Quote:
hi,
I have flat file with atleast 500,000 rows with fixed fields. But, the rows
were divided into 5 different types of rows. First 3 columns are same, 4th
column tells us what type of row it is, in all rows and each row length is
around 80. The remaining columns are different based on the row type.

My question is, Is it better to dump everything into one temp table using
bulk insert task and then write a stored procedure to separate the data into
different tables based on the 4th column ?or Write 5 different datapump tasks
and for each task based on the column type insert into a table.
Looks like the first idea is good but I am worrying about the cursor memory
as it has to hold 500,000 rows of data and each row contains 5 fields.

Thanks in Advance...
Sreenath




Reply With Quote
  #3  
Old   
Blair
 
Posts: n/a

Default Re: Regarding flat file read - 11-18-2004 , 10:41 AM



Your Bulk Insert is a good option.

Another method that'd be able to do it strictly in file-to-file way,
would be 2-steps:
1. Space-pad the entire file to the max(length) of the longest record
type. max(length) would be hardcoded thru disconnected edit. This'd
be done thru a data pump task.
2. Run a data pump task that then filters based on the 4th column.

Doing step 1 prevents rows from being "cannibalized" when the row is
shorter than what DTS was expecting.

-blair


"Sreenath" <Sreenath (AT) discussions (DOT) microsoft.com> wrote

Quote:
hi,
I have flat file with atleast 500,000 rows with fixed fields. But, the rows
were divided into 5 different types of rows. First 3 columns are same, 4th
column tells us what type of row it is, in all rows and each row length is
around 80. The remaining columns are different based on the row type.

My question is, Is it better to dump everything into one temp table using
bulk insert task and then write a stored procedure to separate the data into
different tables based on the 4th column ?or Write 5 different datapump tasks
and for each task based on the column type insert into a table.
Looks like the first idea is good but I am worrying about the cursor memory
as it has to hold 500,000 rows of data and each row contains 5 fields.

Thanks in Advance...
Sreenath

Reply With Quote
  #4  
Old   
Jack Peacock
 
Posts: n/a

Default Re: Regarding flat file read - 11-18-2004 , 06:53 PM



"Sreenath" <Sreenath (AT) discussions (DOT) microsoft.com> wrote

Quote:
hi,
I have flat file with atleast 500,000 rows with fixed fields. But, the
rows
were divided into 5 different types of rows. First 3 columns are same, 4th
column tells us what type of row it is, in all rows and each row length is
around 80. The remaining columns are different based on the row type.

One consideration would be the access time to read the source file. If this
is coming from a relatively slow server over a network then you might be
better off to copy the source into one local table (or file) first, then
split with 3 ActiveX tasks to extract records for each table. Measure how
long it takes to read through the source to get a feel for the thruput rate.
Jack Peacock




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.