[BUGS] 7.3.2 incorrectly counts characters for unicode varchar field

Matthew Cooper
 
[BUGS] 7.3.2 incorrectly counts characters for unicode varchar field - 09-13-2003, 12:56 PM

============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================

Your name : Matthew Cooper
Your email address : matty (at) cloverworxs (dot) com

System Configuration
---------------------
Architecture (example: Intel Pentium) : Intel Pentium
Operating System (example: Linux 2.0.26 ELF) : Redhat 8.0 / 9.0
PostgreSQL version (example: PostgreSQL-7.2.2): PostgreSQL-7.2.2 / 7.3.2
Compiler used (example: gcc 2.95.2) : none

Please enter a FULL description of your problem:
------------------------------------------------
I have a database with UNICODE encoding set. In it is a table with a
varchar(10) column. If I insert 10 western characters into it, it is OK. If
I insert 10 Chinese characters, 7.3.2 says:
postgresql value too long for type character varying 10
With 7.2.2 it works fine.

Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------
createdb -E UNICODE mydb
Then in psql...
create table mgc (c1 varchar(10));
insert into mgc values('0123456789');
This all works fine.
Now I put the following command into a UTF-8 encoded file (say my.sql);
the literal is 10 Chinese characters.
(I don't know if this command will be readable once emailed, so you may have
to re-create it by pasting 10 Chinese characters into your
favourite UTF-8 compatible editor.)
insert into mgc values ('分钟练习分钟练习练习');
I then run psql -f my.sql and get the error for 7.3.2 but it works for
7.2.2.

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
I am guessing it is incorrectly counting the bytes and not the characters.
Presumably a workaround is to double the length of the field.
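
The guess above turns on the difference between characters and bytes in UTF-8. A minimal sketch of that arithmetic in Python, outside the database; the string is an assumed stand-in for the ten-character Chinese literal in the report, and the variable names are illustrative only:

```python
# Assumed stand-in for the ten-character literal in the report.
text = "分钟练习分钟练习练习"

# Number of characters (Unicode code points) in the string.
char_count = len(text)

# Number of bytes once the string is encoded as UTF-8.
byte_count = len(text.encode("utf-8"))

print(char_count, byte_count)
```

Each of these characters takes three bytes in UTF-8, so if bytes rather than characters were being counted, this string would need room for 30, and doubling the field length to 20 would still be too short.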


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org

Tom Lane
 
Re: [BUGS] 7.3.2 incorrectly counts characters for unicode varchar field - 09-15-2003, 10:11 AM

"Matthew Cooper" <matty (AT) cloverworxs (DOT) com> writes:
> Attached is the UTF-8 encoded sql file in case it got messed up in the mail
> transfer.

Ah, no doubt it did.

This works fine for me, using either 7.3.4 or CVS tip. Are you sure
that the system knows your client-side encoding is supposed to be UTF8?

uc=# show client_encoding ;
client_encoding
-----------------
UNICODE
(1 row)

uc=# create table mgc(f1 varchar(10));
CREATE TABLE
uc=# \i mgc.sql
INSERT 328444 1
uc=# select * from mgc;
f1
----------------------
分钟练习分钟练习练习
(1 row)

uc=# select length(f1) from mgc;
length
--------
10
(1 row)
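
Why the client-side encoding matters can be sketched outside the database. The snippet below assumes, purely for illustration, that the UTF-8 bytes of the file are treated as a single-byte encoding such as LATIN1; the thread does not establish what the reporter's client encoding actually was. The string is an assumed stand-in for the literal in the report, and Python's "latin-1" codec stands in for LATIN1:

```python
# Assumed stand-in for the ten-character literal in the report.
text = "分钟练习分钟练习练习"

# The bytes as they would sit in a UTF-8 encoded my.sql file.
utf8_bytes = text.encode("utf-8")

# Hypothetical mislabeling: the same bytes read as a single-byte
# encoding, so every byte becomes one character.
misread = utf8_bytes.decode("latin-1")

print(len(text), len(misread))
```

Under that assumption the 10-character literal would arrive as 30 characters, which would not fit a varchar(10) even when the length check counts characters correctly.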


regards, tom lane


