#3
There seems to be a big problem with Unicode for which a solution might already exist. Somebody had the following problem on another mailing list. My suggestion is at the bottom of this message, but if another solution already exists I'd like to hear about it.

The problem is that special characters aren't treated right under Unicode. Here are a few examples:

1. "UPPER('é')" doesn't work.

2. "ORDER BY mycolumn" gives a wrong sort order. Uppercase ASCII characters come first, then lowercase ASCII, then accented characters... This really isn't what a human would like to see.
#4
> 2. "ORDER BY mycolumn" gives a wrong sort order. Uppercase ASCII characters come first, then lowercase ASCII, then accented characters... This really isn't what a human would like to see.

This is driven by locale. What LC_COLLATE value was the database created with? (If you don't know, pg_controldata should give that to you.) It sounds like the locale is the "C" locale, which means sorting by byte value, or perhaps the locale is one that isn't for the correct encoding.
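To make the "sort by byte value" point concrete, here is a small Python sketch (not PostgreSQL code) that sorts a few strings by their UTF-8 bytes. The word list is made up for illustration, and the assumption is that the database stores the text as UTF-8; the result matches the order described above.

```python
# Hypothetical sample data; any mix of upper, lower and accented words will do.
words = ["zebra", "Apple", "\u00e9t\u00e9", "apple", "Zebra", "eau"]

# Comparing the raw UTF-8 bytes approximates what a byte-value ("C" locale)
# collation does: no language rules, just numeric byte comparison.
byte_order = sorted(words, key=lambda w: w.encode("utf-8"))

print(byte_order)
# ['Apple', 'Zebra', 'apple', 'eau', 'zebra', 'été']
```

Uppercase ASCII (0x41 to 0x5A) sorts before lowercase ASCII (0x61 to 0x7A), and "é" is encoded as the bytes 0xC3 0xA9, which are larger than any ASCII byte, so accented words land at the end.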
#5
>> 2. "ORDER BY mycolumn" gives a wrong sort order. Uppercase ASCII characters come first, then lowercase ASCII, then accented characters... This really isn't what a human would like to see.

> This is driven by locale. What LC_COLLATE value was the database created with? (If you don't know, pg_controldata should give that to you.) It sounds like the locale is the "C" locale, which means sorting by byte value, or perhaps the locale is one that isn't for the correct encoding.

I've found this: http://www.postgresql.org/docs/7.4/i...et.html#LOCALE

"locale -a" isn't recognized on OS X. How else can I find the possible locales?

And how can I do an initdb so that sorting on Unicode will work for French, Greek, Japanese, etc. users of a single database?
#6
IIRC, right now upper and lower only work correctly in single-byte encodings. I think this problem will go away once full SQL collation and character set behavior is done.
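As a rough illustration of why a byte-at-a-time case conversion breaks on multibyte encodings, the Python sketch below compares the single-byte (Latin-1) and multibyte (UTF-8) encodings of "é". This only shows the general mechanism; it is an assumption that PostgreSQL's upper() fails for a similar reason, and the code does not reproduce the server's implementation.

```python
text = "\u00e9"  # the character é

latin1 = text.encode("latin-1")  # one byte: 0xE9
utf8 = text.encode("utf-8")      # two bytes: 0xC3 0xA9

print(len(latin1), len(utf8))    # 1 2

# bytes.upper() converts ASCII letters only and works byte by byte,
# so the two-byte UTF-8 sequence passes through unchanged.
bytewise_upper = utf8.upper()
print(bytewise_upper == utf8)    # True

# A conversion that works on whole characters gets the expected result.
unicode_upper = text.upper()
print(unicode_upper)             # É
```

In a single-byte encoding each byte is a whole character, so a per-byte lookup table can map 0xE9 to 0xC9. In UTF-8 neither 0xC3 nor 0xA9 is a character on its own, so a per-byte routine has nothing sensible to map.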
#7
> And how can I do an initdb so that sorting on Unicode will work for French, Greek, Japanese, etc. users of a single database?

AFAIK, you can't really at this time. With an appropriately crafted locale, you could probably get reasonably close, but I've never actually tried creating one, so I don't know what's involved. And if two languages had different rules for the same two characters, you couldn't support both.
#8
The error thrown is:

pg_dump: message type 0x44 arrived from server while idle
pg_dump: dumpClasses(): SQL command failed
pg_dump: Error message from server: server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
pg_dump: The command was: FETCH 100 FROM _pg_dump_cursor
#9
> "locale -a" isn't recognized on OS X. How else can I find the possible locales?

> And how can I do an initdb so that sorting on Unicode will work for French, Greek, Japanese, etc. users of a single database?
#10
>> And how can I do an initdb so that sorting on Unicode will work for French, Greek, Japanese, etc. users of a single database?

> AFAIK, you can't really at this time. With an appropriately crafted locale, you could probably get reasonably close, but I've never actually tried creating one, so I don't know what's involved. And if two languages had different rules for the same two characters, you couldn't support both.

Thanks Stephan! I've found my list of locales. It's a pity only one language can be used at a time, but as you say there are conflicting rules anyway.

The docs say there is a speed penalty for using locales. Does anyone have any idea how severe this is? I'm wondering whether I should use the translate() function after all because of this. It would solve multilingual issues to a certain level, and there wouldn't be a speed penalty since the indexes would be built on the translate() function too.
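The translate() idea can be sketched in Python, whose str.translate plays a similar role to the SQL function: fold accented characters to their base letters and sort on the folded value. The character mapping and the word list below are assumptions chosen for illustration and cover only a handful of French letters; the lower() call is an extra step added here so that case does not affect the order.

```python
# Hypothetical, deliberately incomplete accent-folding table.
FOLD = str.maketrans("\u00e9\u00e8\u00ea\u00e0\u00e7", "eeeac")  # é è ê à ç

def sort_key(word: str) -> str:
    """Key used for ordering: lowercase first, then fold accents."""
    return word.lower().translate(FOLD)

words = ["zebra", "Apple", "\u00e9t\u00e9", "apple", "eau"]
folded_order = sorted(words, key=sort_key)

print(folded_order)
# ['Apple', 'apple', 'eau', 'été', 'zebra']
```

The sort key here corresponds to the expression an index would be built on in the database. As noted in the thread, this only gets part of the way: one folding table cannot express language-specific rules that conflict with each other.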