dbTalk Databases Forums  

Issues while storing a long dash in oracle

comp.databases.oracle.misc comp.databases.oracle.misc


Discuss Issues while storing a long dash in oracle in the comp.databases.oracle.misc forum.



Reply
 
Thread Tools Display Modes
  #21  
Old   
Thomas Kellerer
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 02:35 AM






Sandy80, 11.03.2008 09:24:
Quote:
But my problem is that even if that character is processed and stored
correctly in the database it would still be stored as a 2 byte
character. In that case how do I make the following command, which
works on the basis of no. of bytes, work correctly:

nvl(rpad(UPPER(substr(nvl( substr(column1,1,instr(column1,'(')-1),column1),
1,24)),24),' ')



I don't see why the statement shouldn't work. substr and instr work on *characters* not bytes:

http://download-uk.oracle.com/docs/c...162.htm#i87066
"SUBSTR calculates lengths using characters as defined by the input character set"

http://download-uk.oracle.com/docs/c...068.htm#i77598
"INSTR calculates strings using characters as defined by the input character set"

So why do you think your statement doesn't work?

Thomas



Reply With Quote
  #22  
Old   
Sandy80
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 03:11 AM






Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".

Any ideas?


Reply With Quote
  #23  
Old   
Sandy80
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 03:11 AM



Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".

Any ideas?


Reply With Quote
  #24  
Old   
Sandy80
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 03:11 AM



Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".

Any ideas?


Reply With Quote
  #25  
Old   
Sandy80
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 03:11 AM



Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".

Any ideas?


Reply With Quote
  #26  
Old   
Laurenz Albe
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 04:02 AM



Sandy80 <svarshneymail (AT) gmail (DOT) com> wrote:
Quote:
nvl(rpad(UPPER(substr(nvl( substr(column1,1,instr(column1,'(')-1),column1),
1,24)),24),' ')

Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".
It works fine on a 10g database here.

Most likely the encoding of the data does not match the client character
set.

You must find out the encoding of your text data.
Ask the people who produced the data, or examine the file and guess.

You could for example use "od -c". Here are two samples:

0000000 F i n a n c e 342 200 224 F u l f
0000020 i l l m e n t F a c t o r y \n
0000040

This is in UTF-8 encoding. The em-dash (UNICODE 0x2014) is encoded as three
bytes in UTF-8. They are represented in octal in the "od" output.

0000000 F i n a n c e 227 F u l f i l
0000020 l m e n t F a c t o r y
0000035

This is WIN-1252 encoding. The em-dash is a single byte here.

Next step is to set the client character set correctly.

For the first case, NLS_LANG should be AMERICAN_AMERICA.AL32UTF8, and
for the second case it should be AMERICAN_AMERICA.WE8MSWIN1252
(the AMERICAN_AMERICA is not important - choose what you want).

If data and client character set match, SUBSTR will work and sqlldr should
too.

Yours,
Laurenz Albe


Reply With Quote
  #27  
Old   
Laurenz Albe
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 04:02 AM



Sandy80 <svarshneymail (AT) gmail (DOT) com> wrote:
Quote:
nvl(rpad(UPPER(substr(nvl( substr(column1,1,instr(column1,'(')-1),column1),
1,24)),24),' ')

Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".
It works fine on a 10g database here.

Most likely the encoding of the data does not match the client character
set.

You must find out the encoding of your text data.
Ask the people who produced the data, or examine the file and guess.

You could for example use "od -c". Here are two samples:

0000000 F i n a n c e 342 200 224 F u l f
0000020 i l l m e n t F a c t o r y \n
0000040

This is in UTF-8 encoding. The em-dash (UNICODE 0x2014) is encoded as three
bytes in UTF-8. They are represented in octal in the "od" output.

0000000 F i n a n c e 227 F u l f i l
0000020 l m e n t F a c t o r y
0000035

This is WIN-1252 encoding. The em-dash is a single byte here.

Next step is to set the client character set correctly.

For the first case, NLS_LANG should be AMERICAN_AMERICA.AL32UTF8, and
for the second case it should be AMERICAN_AMERICA.WE8MSWIN1252
(the AMERICAN_AMERICA is not important - choose what you want).

If data and client character set match, SUBSTR will work and sqlldr should
too.

Yours,
Laurenz Albe


Reply With Quote
  #28  
Old   
Laurenz Albe
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 04:02 AM



Sandy80 <svarshneymail (AT) gmail (DOT) com> wrote:
Quote:
nvl(rpad(UPPER(substr(nvl( substr(column1,1,instr(column1,'(')-1),column1),
1,24)),24),' ')

Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".
It works fine on a 10g database here.

Most likely the encoding of the data does not match the client character
set.

You must find out the encoding of your text data.
Ask the people who produced the data, or examine the file and guess.

You could for example use "od -c". Here are two samples:

0000000 F i n a n c e 342 200 224 F u l f
0000020 i l l m e n t F a c t o r y \n
0000040

This is in UTF-8 encoding. The em-dash (UNICODE 0x2014) is encoded as three
bytes in UTF-8. They are represented in octal in the "od" output.

0000000 F i n a n c e 227 F u l f i l
0000020 l m e n t F a c t o r y
0000035

This is WIN-1252 encoding. The em-dash is a single byte here.

Next step is to set the client character set correctly.

For the first case, NLS_LANG should be AMERICAN_AMERICA.AL32UTF8, and
for the second case it should be AMERICAN_AMERICA.WE8MSWIN1252
(the AMERICAN_AMERICA is not important - choose what you want).

If data and client character set match, SUBSTR will work and sqlldr should
too.

Yours,
Laurenz Albe


Reply With Quote
  #29  
Old   
Laurenz Albe
 
Posts: n/a

Default Re: Issues while storing a long dash in oracle - 03-11-2008 , 04:02 AM



Sandy80 <svarshneymail (AT) gmail (DOT) com> wrote:
Quote:
nvl(rpad(UPPER(substr(nvl( substr(column1,1,instr(column1,'(')-1),column1),
1,24)),24),' ')

Not sure as to why is it not working but when I run this command on
the data "Finance ¿ Fulfillment Factory" the result returned is:

"FINANCE ¿ FULFILLMENT F" whereas the correct result that I am looking
for is "FINANCE ¿ FULFILLMENT FA".
It works fine on a 10g database here.

Most likely the encoding of the data does not match the client character
set.

You must find out the encoding of your text data.
Ask the people who produced the data, or examine the file and guess.

You could for example use "od -c". Here are two samples:

0000000 F i n a n c e 342 200 224 F u l f
0000020 i l l m e n t F a c t o r y \n
0000040

This is in UTF-8 encoding. The em-dash (UNICODE 0x2014) is encoded as three
bytes in UTF-8. They are represented in octal in the "od" output.

0000000 F i n a n c e 227 F u l f i l
0000020 l m e n t F a c t o r y
0000035

This is WIN-1252 encoding. The em-dash is a single byte here.

Next step is to set the client character set correctly.

For the first case, NLS_LANG should be AMERICAN_AMERICA.AL32UTF8, and
for the second case it should be AMERICAN_AMERICA.WE8MSWIN1252
(the AMERICAN_AMERICA is not important - choose what you want).

If data and client character set match, SUBSTR will work and sqlldr should
too.

Yours,
Laurenz Albe


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.