dbTalk Databases Forums  

Importing text data containing fixed width fields

microsoft.public.sqlserver.dts microsoft.public.sqlserver.dts


Discuss Importing text data containing fixed width fields in the microsoft.public.sqlserver.dts forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Amit
 
Posts: n/a

Default Importing text data containing fixed width fields - 12-23-2004 , 09:30 AM






Hello,
I'm trying to import a text file with fixed width fields. I've noticed the
following problem regarding this:

Let's say I have four fields in my text file. When field4 is blank, i.e.
there is a {CR}{LF} after field3 the import process does not mark field4 as
null. Instead it goes to the next line and picks up characters from field1
on that line and fills it in as the value for field4 of the previous line.
It then ignores the rest of the second line and moves to the third line. As
a result, I don't get the second record and my first record is corrupt. This
happens even though I've set the Row Delimiter to {CR}{LF}. Has anyone
encountered this before? Please let me know this wasn't clear and you need
more information.
Thanks,
amit



Reply With Quote
  #2  
Old   
Wm. Scott Miller
 
Posts: n/a

Default Re: Importing text data containing fixed width fields - 12-23-2004 , 12:20 PM






Amit:

Your file is not fixed width if any line is not of the same length as every
other line. "Fixed width" means exactly that. You need to modify your
output of the file so it is truly fixed width. If you can't, you can get
around this, but you will have a little work cut out for you. Have the file
load as one large field (don't tell DTS that is is fixed width and don't let
it cut it up for you). The entire line will now come to you as a single
field value. Cut it up in ActiveX script (use string functions) and set
your fields from the results. If the line is too short, you will have to
detect this and set all remaining fields accordingly.

Of course the easiest solution is to fix your outputting routine to generate
a fixed width file to begin with.

Scott

"Amit" <amitigDELETE (AT) hotmail (DOT) com> wrote

Quote:
Hello,
I'm trying to import a text file with fixed width fields. I've noticed the
following problem regarding this:

Let's say I have four fields in my text file. When field4 is blank, i.e.
there is a {CR}{LF} after field3 the import process does not mark field4
as
null. Instead it goes to the next line and picks up characters from field1
on that line and fills it in as the value for field4 of the previous line.
It then ignores the rest of the second line and moves to the third line.
As
a result, I don't get the second record and my first record is corrupt.
This
happens even though I've set the Row Delimiter to {CR}{LF}. Has anyone
encountered this before? Please let me know this wasn't clear and you need
more information.
Thanks,
amit





Reply With Quote
  #3  
Old   
Amit
 
Posts: n/a

Default Re: Importing text data containing fixed width fields - 12-23-2004 , 05:02 PM



Scott,
That's exactly what we ended up doing. The output from the mainframe is now
such that each line is truly fixed width. Blank values are filled with
spaces.

But it makes me think why the row delimiter option is even there - if it
expects each line to be exactly x characters wide, it already knows where
the new row begins.
Thanks for your reply,
Amit


"Wm. Scott Miller" <Scott.Miller (AT) spamkillerwvinsurance (DOT) gov> wrote

Quote:
Amit:

Your file is not fixed width if any line is not of the same length as
every
other line. "Fixed width" means exactly that. You need to modify your
output of the file so it is truly fixed width. If you can't, you can get
around this, but you will have a little work cut out for you. Have the
file
load as one large field (don't tell DTS that is is fixed width and don't
let
it cut it up for you). The entire line will now come to you as a single
field value. Cut it up in ActiveX script (use string functions) and set
your fields from the results. If the line is too short, you will have to
detect this and set all remaining fields accordingly.

Of course the easiest solution is to fix your outputting routine to
generate
a fixed width file to begin with.

Scott

"Amit" <amitigDELETE (AT) hotmail (DOT) com> wrote in message
news:esVuZRQ6EHA.4040 (AT) TK2MSFTNGP14 (DOT) phx.gbl...
Hello,
I'm trying to import a text file with fixed width fields. I've noticed
the
following problem regarding this:

Let's say I have four fields in my text file. When field4 is blank, i.e.
there is a {CR}{LF} after field3 the import process does not mark field4
as
null. Instead it goes to the next line and picks up characters from
field1
on that line and fills it in as the value for field4 of the previous
line.
It then ignores the rest of the second line and moves to the third line.
As
a result, I don't get the second record and my first record is corrupt.
This
happens even though I've set the Row Delimiter to {CR}{LF}. Has anyone
encountered this before? Please let me know this wasn't clear and you
need
more information.
Thanks,
amit







Reply With Quote
  #4  
Old   
Wm. Scott Miller
 
Posts: n/a

Default Re: Importing text data containing fixed width fields - 12-27-2004 , 09:21 AM



Amit:

There are two different implementations of fixed width that I've seen. The
first is where the routine skips row delimiters (or includes them as
characters in the row) until it has enough characters to make the row the
right length. Once the right length is accomplished it searches for a row
delimiter (this is what SQL DTS uses). The second routine will fill in the
row to the right length upon encountering a row delimiter (which is what you
expected). Unfortunately, if you have a malformed fixed width file either
method can make life difficult. E.G. What happens to both routines if the
first record in the file is malformed? The routine can end up with an
incorrect row width. Additionally, the routine you are using can give
unexpected results if you use special characters (character with ASCII code
0--null).

One thing you must also keep in mind is that some manufacturers (because of
the original definition of a fixed with file and as a carry over from the
days when disk space was expensive and efficient usage of disk space was
mandatory) keep the fixed width files *without* row delimiters at all (it
saves two characters per row and when you have millions of rows, that adds
up). We encountered files like this when we made our conversion from the
mainframe to our internal system. Matter of fact, our routine consisted of:
download the file from the mainframe, EBCDIC to ASCII conversion, inserting
row delimiters at X characters (at the same time replace null characters
with spaces), upload to new database, run conversion routines to new schema.

When working with fixed width files, remember that row delimiters were added
to the fixed width file format later and that the original format had no
such row delimiter and length was what decided where one record ended and
the next one began because disk space was at a premium when this file format
was first used.

Scott

"Amit" <amitigDELETE (AT) hotmail (DOT) com> wrote

Quote:
Scott,
That's exactly what we ended up doing. The output from the mainframe is
now
such that each line is truly fixed width. Blank values are filled with
spaces.

But it makes me think why the row delimiter option is even there - if it
expects each line to be exactly x characters wide, it already knows where
the new row begins.
Thanks for your reply,
Amit


"Wm. Scott Miller" <Scott.Miller (AT) spamkillerwvinsurance (DOT) gov> wrote in
message
news:eeMJmwR6EHA.1524 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Amit:

Your file is not fixed width if any line is not of the same length as
every
other line. "Fixed width" means exactly that. You need to modify your
output of the file so it is truly fixed width. If you can't, you can get
around this, but you will have a little work cut out for you. Have the
file
load as one large field (don't tell DTS that is is fixed width and don't
let
it cut it up for you). The entire line will now come to you as a single
field value. Cut it up in ActiveX script (use string functions) and set
your fields from the results. If the line is too short, you will have to
detect this and set all remaining fields accordingly.

Of course the easiest solution is to fix your outputting routine to
generate
a fixed width file to begin with.

Scott

"Amit" <amitigDELETE (AT) hotmail (DOT) com> wrote in message
news:esVuZRQ6EHA.4040 (AT) TK2MSFTNGP14 (DOT) phx.gbl...
Hello,
I'm trying to import a text file with fixed width fields. I've noticed
the
following problem regarding this:

Let's say I have four fields in my text file. When field4 is blank,
i.e.
there is a {CR}{LF} after field3 the import process does not mark
field4
as
null. Instead it goes to the next line and picks up characters from
field1
on that line and fills it in as the value for field4 of the previous
line.
It then ignores the rest of the second line and moves to the third
line.
As
a result, I don't get the second record and my first record is corrupt.
This
happens even though I've set the Row Delimiter to {CR}{LF}. Has anyone
encountered this before? Please let me know this wasn't clear and you
need
more information.
Thanks,
amit









Reply With Quote
  #5  
Old   
Amit
 
Posts: n/a

Default Re: Importing text data containing fixed width fields - 12-27-2004 , 10:54 AM



Ah, that makes sense. Thanks for your reply,
amit


"Wm. Scott Miller" <Scott.Miller (AT) spamkillerwvinsurance (DOT) gov> wrote

Quote:
Amit:

There are two different implementations of fixed width that I've seen.
The
first is where the routine skips row delimiters (or includes them as
characters in the row) until it has enough characters to make the row the
right length. Once the right length is accomplished it searches for a row
delimiter (this is what SQL DTS uses). The second routine will fill in
the
row to the right length upon encountering a row delimiter (which is what
you
expected). Unfortunately, if you have a malformed fixed width file either
method can make life difficult. E.G. What happens to both routines if the
first record in the file is malformed? The routine can end up with an
incorrect row width. Additionally, the routine you are using can give
unexpected results if you use special characters (character with ASCII
code
0--null).

One thing you must also keep in mind is that some manufacturers (because
of
the original definition of a fixed with file and as a carry over from the
days when disk space was expensive and efficient usage of disk space was
mandatory) keep the fixed width files *without* row delimiters at all (it
saves two characters per row and when you have millions of rows, that adds
up). We encountered files like this when we made our conversion from the
mainframe to our internal system. Matter of fact, our routine consisted
of:
download the file from the mainframe, EBCDIC to ASCII conversion,
inserting
row delimiters at X characters (at the same time replace null characters
with spaces), upload to new database, run conversion routines to new
schema.

When working with fixed width files, remember that row delimiters were
added
to the fixed width file format later and that the original format had no
such row delimiter and length was what decided where one record ended and
the next one began because disk space was at a premium when this file
format
was first used.

Scott

"Amit" <amitigDELETE (AT) hotmail (DOT) com> wrote in message
news:%23JTxtNU6EHA.2552 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Scott,
That's exactly what we ended up doing. The output from the mainframe is
now
such that each line is truly fixed width. Blank values are filled with
spaces.

But it makes me think why the row delimiter option is even there - if it
expects each line to be exactly x characters wide, it already knows
where
the new row begins.
Thanks for your reply,
Amit


"Wm. Scott Miller" <Scott.Miller (AT) spamkillerwvinsurance (DOT) gov> wrote in
message
news:eeMJmwR6EHA.1524 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
Amit:

Your file is not fixed width if any line is not of the same length as
every
other line. "Fixed width" means exactly that. You need to modify your
output of the file so it is truly fixed width. If you can't, you can
get
around this, but you will have a little work cut out for you. Have the
file
load as one large field (don't tell DTS that is is fixed width and
don't
let
it cut it up for you). The entire line will now come to you as a
single
field value. Cut it up in ActiveX script (use string functions) and
set
your fields from the results. If the line is too short, you will have
to
detect this and set all remaining fields accordingly.

Of course the easiest solution is to fix your outputting routine to
generate
a fixed width file to begin with.

Scott

"Amit" <amitigDELETE (AT) hotmail (DOT) com> wrote in message
news:esVuZRQ6EHA.4040 (AT) TK2MSFTNGP14 (DOT) phx.gbl...
Hello,
I'm trying to import a text file with fixed width fields. I've
noticed
the
following problem regarding this:

Let's say I have four fields in my text file. When field4 is blank,
i.e.
there is a {CR}{LF} after field3 the import process does not mark
field4
as
null. Instead it goes to the next line and picks up characters from
field1
on that line and fills it in as the value for field4 of the previous
line.
It then ignores the rest of the second line and moves to the third
line.
As
a result, I don't get the second record and my first record is
corrupt.
This
happens even though I've set the Row Delimiter to {CR}{LF}. Has
anyone
encountered this before? Please let me know this wasn't clear and you
need
more information.
Thanks,
amit











Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.