
Unicode, UTF8, and Latin9

comp.databases.postgresql


genkuro@gmail.com
 

Unicode, UTF8, and Latin9 - 04-30-2007, 10:51 AM






A Tomcat frontend is logging the following error. The backend is
Postgres 8.1.
< ERROR: character 0xc582 of encoding "UTF8" has no equivalent in
"LATIN9" >

I understand this message to say that the database supports LATIN9
(probably a parameter specified at creation time) and some user put in
a UTF-8 character which failed to translate.

Can I define the database to support /unicode/ ... that is, the entire
character space? Would that be UTF-16?

I understand I have to scrap the current database to do that. I have
to export data. Get rid of the old database. Create a new database
of the same name and proper encoding. And import data. Is this
correct?

Thanks,
Brian


jpd
 

Re: Unicode, UTF8, and Latin9 - 04-30-2007, 11:03 AM






On 2007-04-30, genkuro (AT) gmail (DOT) com <genkuro (AT) gmail (DOT) com> wrote:
Quote:
Can I define the database to support /unicode/ ... that is, the entire
character space? Would that be UTF-16?

UTF-8 and UTF-16 are both encodings that support the entire Unicode
character space. UTF-8 uses one to four (originally one to six) bytes
per Unicode code point, and UTF-16 uses either two or four bytes for
each code point. Many implementations of UTF-16 only support the
two-byte encodings (effectively UCS-2) and so don't support the more
exotic code points outside the Basic Multilingual Plane.
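
To make the byte counts concrete, here is a small sketch in Python (not part of the original post; the sample characters and the helper name encoded_lengths are my own choices for illustration). It only shows how many bytes each encoding needs per code point. It says nothing about how PostgreSQL stores data internally.

```python
# Compare how many bytes UTF-8 and UTF-16 need for a few code points.
# The sample characters are arbitrary picks for illustration.
samples = {
    "A": "U+0041 LATIN CAPITAL LETTER A",
    "\u0142": "U+0142 LATIN SMALL LETTER L WITH STROKE",
    "\u20ac": "U+20AC EURO SIGN",
    "\U0001D11E": "U+1D11E MUSICAL SYMBOL G CLEF",
}


def encoded_lengths(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    # "utf-16-be" is used so that no byte order mark is prepended,
    # which would otherwise add two bytes to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


if __name__ == "__main__":
    for ch, name in samples.items():
        utf8_len, utf16_len = encoded_lengths(ch)
        print(f"{name}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

The last sample lies outside the Basic Multilingual Plane, so UTF-16 needs a surrogate pair (four bytes) for it. That is exactly the case a two-byte-only implementation cannot represent.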

You can ask the database which encoding it uses to store data, and you
can set it up to use UTF-8. This may be different from the encoding used
for client connections. You'd best first investigate what encodings are
used for what, exactly.


--
j p d (at) d s b (dot) t u d e l f t (dot) n l .
This message was originally posted on Usenet in plain text.
Any other representation, additions, or changes do not have my
consent and may be a violation of international copyright law.


Laurenz Albe
 

Re: Unicode, UTF8, and Latin9 - 05-02-2007, 03:03 AM



genkuro (AT) gmail (DOT) com wrote:
Quote:
A Tomcat frontend is logging the following error. The backend is
Postgres 8.1.
ERROR: character 0xc582 of encoding "UTF8" has no equivalent in
"LATIN9"

I understand this message to say that the database supports LATIN9
(probably a parameter specified at creation time) and some user put in
a UTF-8 character which failed to translate.

It could mean two things:
First, that your database is UTF8, your client uses
LATIN9, and you are trying to retrieve a character (Polish lower case l
with stroke; the unpronounceable) from the database that does not exist in
LATIN9 and consequently cannot be converted.

Second, it could be that your database is LATIN9 and your client is
configured to use UTF8, and you are trying to store a Polish lower case l
with stroke in the database.

To find out which of the two is the case, run the following two
commands:
SHOW server_encoding;
SHOW client_encoding;

In the first case, the solution is simple:
Change the client encoding to something that contains the Polish letter
in question, e.g. LATIN2 or WIN1250.

In the second case, if you need to store this character, you'll have to
create a new database with an encoding that contains the character.
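
For anyone who wants to check the character itself, the following is a rough sketch using Python's built-in codecs as a stand-in for PostgreSQL's conversion tables (an assumption: the two are separate implementations, although both follow the same ISO 8859 and Windows code page definitions). The helper name can_encode is hypothetical. It decodes the bytes 0xc582 from the error message and tests which single-byte encodings contain the result.

```python
import unicodedata

# The two bytes reported in the error message, interpreted as UTF-8.
raw = bytes.fromhex("c582")
ch = raw.decode("utf-8")


def can_encode(char, codec):
    """Return True if `char` has a representation in `codec`."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    # Python codec names: iso8859_15 = Latin-9, iso8859_2 = Latin-2,
    # cp1250 = Windows-1250.
    for codec in ("iso8859_15", "iso8859_2", "cp1250"):
        print(codec, "can encode it:", can_encode(ch, codec))
```

In Python's tables the character is missing from Latin-9 and present in Latin-2 and Windows-1250, which matches the advice above about changing the client encoding.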

Quote:
Can I define the database to support /unicode/ ... that is, the entire
character space? Would that be UTF-16?

As has been explained, UTF-8 can encode every Unicode character, so UTF8
is the encoding to choose. PostgreSQL does not offer UTF-16 as a database
encoding.

Quote:
I understand I have to scrap the current database to do that. I have
to export data. Get rid of the old database. Create a new database
of the same name and proper encoding. And import data. Is this
correct?

Yes, in the second case mentioned above.
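
The export / drop / recreate / import cycle described in the question could be sketched roughly as follows. This is only an outline under assumptions: the database name "mydb", the dump file "mydb.sql" and the helper migration_commands are hypothetical, and the function merely builds the command lines for the standard PostgreSQL client tools (pg_dump, dropdb, createdb, psql) without running anything. Connection options such as host and user are left out.

```python
def migration_commands(dbname, dump_path):
    """Return the command lines for a dump / drop / recreate / reload cycle.

    Nothing is executed here; the caller decides whether to run each step.
    """
    return [
        # 1. Export the existing database as a plain SQL script.
        ["pg_dump", "-f", dump_path, dbname],
        # 2. Remove the old database (destructive: verify the dump first).
        ["dropdb", dbname],
        # 3. Recreate it under the same name with UTF8 as the encoding.
        ["createdb", "-E", "UTF8", dbname],
        # 4. Load the dump into the new database.
        ["psql", "-d", dbname, "-f", dump_path],
    ]


if __name__ == "__main__":
    # "mydb" and "mydb.sql" are placeholder names for illustration.
    for step in migration_commands("mydb", "mydb.sql"):
        print(" ".join(step))
```

To actually run the steps, each list could be passed to subprocess.run(step, check=True) so that a failing step stops the sequence before the next one starts. Keep a verified copy of the dump before the dropdb step. On newer PostgreSQL releases createdb may additionally need -T template0 when the requested encoding differs from that of the default template.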

Yours,
Laurenz Albe

