[BUGS] BUG #2261: ILIKE seems to be buggy on koi8 input

Evgeny Gridasov
 
[BUGS] BUG #2261: ILIKE seems to be buggy on koi8 input - 02-15-2006, 10:53 AM







The following bug has been logged online:

Bug reference: 2261
Logged by: Evgeny Gridasov
Email address: eugrid (AT) fpm (DOT) kubsu.ru
PostgreSQL version: 8.1.2
Operating system: Debian Linux
Description: ILIKE seems to be buggy on koi8 input
Details:

My terminal is RU_ru.KOI8-R,
template1's encoding is UTF8.
ILIKE seems to be buggy when comparing Russian strings,
while UPPER/LOWER work OK.

template1=# \encoding koi8;

Try to get the uppercase of some Russian letters:
template1=# select upper('фыва');
upper
-------
ФЫВА
(1 row)

result is OK!

next, try to compare uppercase and lowercase using
ILIKE:
template1=# select true where 'фыва' ilike 'ФЫВА';
bool
------
(0 rows)

OOPS! Nothing happened. But why?

Try the same, but with Latin letters:

template1=# select true where 'asdf' ilike 'ASDF';
bool
------
t
(1 row)

Try to compare lowercase with lowercase (Russian):

template1=# select true where 'фыва' ilike 'фыва';
bool
------
t
(1 row)

it works.
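
A possible workaround sketch, not something proposed in the thread: since upper() handles these letters on this server, folding both sides explicitly and using plain LIKE avoids relying on ILIKE's own case handling. This assumes lower() folds Cyrillic here the same way upper() does, which the report states but does not show.

```sql
-- Fold both operands explicitly, then compare with case-sensitive LIKE.
-- Assumption: lower() converts 'ФЫВА' to 'фыва' under the server's current locale.
select true where lower('фыва') like lower('ФЫВА');
```

lower() leaves the LIKE wildcards (% and _) untouched, so a pattern keeps its meaning after folding.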

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Tom Lane
 
Re: [BUGS] BUG #2261: ILIKE seems to be buggy on koi8 input - 02-15-2006, 11:44 AM






"Evgeny Gridasov" <eugrid (AT) fpm (DOT) kubsu.ru> writes:
Quote:
my terminal is RU_ru.KOI8-R,
template1's encoding is UTF8.
ILIKE seems to be buggy when comparing russian strings,
while UPPER/LOWER works OK.

I'll bet that the database's locale setting is expecting some encoding
other than UTF8 :-(. You need to have compatible locale and encoding
settings inside the database. You didn't say exactly what the database
LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
match UTF8.

regards, tom lane
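
To check what the server is actually using, the relevant settings can be read from psql. A minimal sketch, assuming an 8.1-era server where lc_collate and lc_ctype are exposed as read-only settings:

```sql
-- Encoding the database stores text in (UTF8 in the report).
show server_encoding;
-- Encoding psql sends and receives; changed with \encoding (koi8 in the report).
show client_encoding;
-- Locale categories fixed at initdb time; their encoding should match server_encoding.
show lc_collate;
show lc_ctype;
```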

Tom Lane
 
Re: [BUGS] BUG #2261: ILIKE seems to be buggy on koi8 input - 02-20-2006, 04:06 PM



Evgeny Gridasov <eugrid (AT) fpm (DOT) kubsu.ru> writes:
Quote:
postgresql server starts with environment:
LC_COLLATE=en_US.UTF-8
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8

Well, that setting shouldn't translate much except A-Z/a-z. If you want
Cyrillic upper/lower case conversions you need the database's LC_CTYPE to be
ru_RU.something.

regards, tom lane

Evgeny Gridasov
 
Re: [BUGS] BUG #2261: ILIKE seems to be buggy on koi8 input - 02-21-2006, 06:47 AM



postgresql server starts with environment:

LC_COLLATE=en_US.UTF-8
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8

I've tried different LC_COLLATE/LC_ALL/LANG settings,
but that did not help.

I've also tried changing my psql input to Unicode Russian, but that did not help either.

'show all' says lc_collate and the other lc_* settings are en_US.UTF-8.
initdb was run with this locale.
It cannot be changed by setting it in postgresql.conf (is it fixed when the db is created?)
Should I reinit the database to get this working, or what?
If I should reinit the db, what locale should I choose?

BTW, the ~* operator does not work with upper/lower case Russian letters either,
while upper()/lower() still work OK.
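
The three observations can be checked in one query, which makes it easy to rerun after changing a setting. This is only a diagnostic sketch; the column aliases are made up for illustration.

```sql
-- Each column is true when that case-insensitive path handles Cyrillic.
select lower('ФЫВА') = 'фыва' as lower_ok,
       'фыва' ilike 'ФЫВА'    as ilike_ok,
       'фыва' ~* 'ФЫВА'       as regex_ok;
```

Going by the report, lower_ok should come back true and the other two false on the affected server.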

On Wed, 15 Feb 2006 12:44:18 -0500
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> wrote:

Quote:
"Evgeny Gridasov" <eugrid (AT) fpm (DOT) kubsu.ru> writes:
my terminal is RU_ru.KOI8-R,
template1's encoding is UTF8.
ILIKE seems to be buggy when comparing russian strings,
while UPPER/LOWER works OK.

I'll bet that the database's locale setting is expecting some encoding
other than UTF8 :-(. You need to have compatible locale and encoding
settings inside the database. You didn't say exactly what the database
LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
match UTF8.

regards, tom lane

--
Evgeny Gridasov
Software Engineer
I-Free, Russia

Peter Eisentraut
 
Re: [BUGS] BUG #2261: ILIKE seems to be buggy on koi8 input - 02-21-2006, 06:55 AM



Evgeny Gridasov wrote:
Quote:
It cannot be modified setting it in postgresql.conf (creation db
constant?) Should I reinit database to get this working or what?

Yes.

Quote:
If I should reinit db, what locale should I choose?

Something like ru_RU.utf8.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
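
A note for readers on newer servers: on 8.1 the locale is fixed for the whole cluster at initdb, which is why a reinit is needed here. From PostgreSQL 8.4 on, the locale can instead be chosen per database. A sketch, where the database name ru_test is hypothetical and ru_RU.utf8 is assumed to be installed on the operating system:

```sql
-- PostgreSQL 8.4 or later only; this does not work on the 8.1.2 server in this report.
-- template0 is required when the locale differs from the template database's.
create database ru_test
    template template0
    encoding 'UTF8'
    lc_collate 'ru_RU.utf8'
    lc_ctype 'ru_RU.utf8';
```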
