[BUGS] Correct Unicode sorting depends on how initdb was run

  #1  
Nils Philippsen
[BUGS] Correct Unicode sorting depends on how initdb was run - 08-11-2003, 01:42 AM

Hi there,

Recently I stumbled over a very strange problem: I had two very similar
setups (RHL9 with latest updates, pgsql-7.3.2, identical parameters in
"show all", databases with encoding=UNICODE, loaded from the same database
dump), yet on one of them accented characters were sorted incorrectly.

After hours of fiddling I found out that the erroneous one was initdb'ed
with locale set to en_US, while the one sorting correctly was initdb'ed
with locale set to en_US.UTF-8. I pg_dumpall'ed the wrong one, redid the
initdb with locale set to en_US.UTF-8 and loaded the dumped databases;
after that the sorting order was correct.
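
A minimal sketch of why the same strings can come back in a different order depending on the locale in force. It is not PostgreSQL code: it uses Python's standard locale module, and the helper name sort_under_locale and the sample words are made up for illustration. It compares the "C" locale with en_US.UTF-8 rather than the exact en_US / en_US.UTF-8 pair above, and assumes a locale may not be installed, in which case the helper returns None.

```python
import locale


def sort_under_locale(words, locale_name):
    """Sort words by the collation rules of locale_name.

    Returns None if that locale is not available on this machine.
    (Hypothetical helper, for illustration only.)
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm turns each string into a key that compares by locale rules
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore previous collation


words = ["zebra", "école", "eagle"]
print("C          :", sort_under_locale(words, "C"))
print("en_US.UTF-8:", sort_under_locale(words, "en_US.UTF-8"))
```

Under "C" the comparison is by code point, so "école" lands after "zebra". A UTF-8-aware locale such as en_US.UTF-8 would normally be expected to place it next to "eagle" instead, which is the kind of difference described above.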

Is this expected behaviour (I do not think so)?

Nils
--
Nils Philippsen / Red Hat / nphilipp (AT) redhat (DOT) com
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." -- B. Franklin, 1759
PGP fingerprint: C4A8 9474 5C4C ADE3 2B8F 656D 47D8 9B65 6951 3011

  #2  
Peter Eisentraut
Re: [BUGS] Correct Unicode sorting depends on how initdb was run - 08-11-2003, 04:04 AM

Nils Philippsen writes:

> Is this expected behaviour

Yes.

--
Peter Eisentraut peter_e (AT) gmx (DOT) net

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org


  #3  
Nils Philippsen
Re: [BUGS] Correct Unicode sorting depends on how initdb was run - 08-11-2003, 06:30 AM

On Mon, 2003-08-11 at 10:49, Peter Eisentraut wrote:
> Nils Philippsen writes:
>
> > Is this expected behaviour
>
> Yes.

Hmm. I ask myself whether this is desired behaviour, too.

Given that this isn't obviously documented (at least I didn't find it),
I'd expect sort order to be dependent on server_encoding or
client_encoding, but not on a locale setting that was present at
initialisation of the database structures (and which isn't changeable
except by dump&reload).

Nils
  #4  
Peter Eisentraut
Re: [BUGS] Correct Unicode sorting depends on how initdb was run - 08-11-2003, 06:53 AM

Nils Philippsen writes:

> On Mon, 2003-08-11 at 10:49, Peter Eisentraut wrote:
> > Nils Philippsen writes:
> >
> > > Is this expected behaviour
> >
> > Yes.
>
> Hmm. I ask myself whether this is desired behaviour, too.

No, but it will take a lot of work to fix this, such as implementing our
own locale library.

--
Peter Eisentraut peter_e (AT) gmx (DOT) net

  #5  
Tom Lane
Re: [BUGS] Correct Unicode sorting depends on how initdb was run - 08-11-2003, 10:09 AM

Peter Eisentraut <peter_e (AT) gmx (DOT) net> writes:
> Nils Philippsen writes:
> > Hmm. I ask myself whether this is desired behaviour, too.
>
> No, but it will take a lot of work to fix this, such as implementing our
> own locale library.

We should, however, look into using C99-spec <wctype.h> routines where
available --- the existing logic that depends on <ctype.h> stuff cannot
work with multibyte encodings. I am not sure if this has any
user-visible effects beyond upper()/lower().
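
To illustrate the <ctype.h> limitation, here is a rough Python analogy; it is an assumption made for illustration, not the backend's C code. bytes.upper() stands in for a byte-at-a-time toupper() loop, which only recognises single-byte letters, and str.upper() stands in for a character-aware routine in the spirit of <wctype.h>. The helper names are hypothetical.

```python
def bytewise_upper(text, encoding="utf-8"):
    """Uppercase one byte at a time, roughly how a <ctype.h>-style loop
    sees an encoded string. (Hypothetical helper.)"""
    raw = text.encode(encoding)
    # bytes.upper() maps only ASCII a-z; bytes >= 0x80, i.e. the pieces of
    # a multibyte UTF-8 character, pass through unchanged.
    return raw.upper().decode(encoding)


def charwise_upper(text):
    """Uppercase whole characters, in the spirit of <wctype.h>."""
    return text.upper()


word = "école"
print(bytewise_upper(word))  # éCOLE  (the two-byte "é" is left alone)
print(charwise_upper(word))  # ÉCOLE
```

The byte-wise version never sees "é" as a letter because in UTF-8 it is the two bytes 0xC3 0xA9, neither of which is a letter on its own.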

regards, tom lane
