[BUGS] unicode strings are not sorted alphabetically
08-05-2004, 10:40 PM
This applies to non-English strings (Russian, in my case). I stumbled
upon it on version 7.4.3 (PostgreSQL 7.4.3 on
i386-portbld-freebsd5.2.1, compiled by GCC cc (GCC) 3.3.3 [FreeBSD]
20031106).
The attached files were created with:
pg_dump -U lib lib > database_dump
psql lib lib -c "select * from authors order by name" > result_of_query
The sort order seems to be incorrect. Alphabetically, the rows should
come out in this order of ids:
2
1
3
5
4
6
7
8
9
13
10
11
12
16
14
17
18
15
19
20
21
22
23
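The expected order can also be worked out outside the database. The Python sketch below is only an illustration, not anything PostgreSQL does internally: it ranks each letter by its position in the Russian alphabet and sorts the surnames from the attached dump by that rank. The names `AUTHORS`, `russian_key` and `ids_in_alphabetical_order` are made up for this example, first names are left out, and the surname for id 23 is taken to be Энтони.

```python
# Surnames from the authors table, keyed by id (first names omitted).
# Assumption: the surname for id 23 is read as "Энтони".
AUTHORS = {
    1: "Андерсон", 2: "Азимов", 3: "Асприн", 4: "Булгаков",
    5: "Брэдбери", 6: "Гамильтон", 7: "Гаррисон", 8: "Даррелл",
    9: "Дойл", 10: "Кинг", 11: "Кларк", 12: "Лукьяненко",
    13: "Желязны", 14: "Пол", 15: "Твен", 16: "Пирс",
    17: "Саймак", 18: "Силверберг", 19: "Фостер", 20: "Фрай",
    21: "Херберт", 22: "Честертон", 23: "Энтони",
}

# The 33 letters of the Russian alphabet in dictionary order.
RUSSIAN_ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
RANK = {letter: i for i, letter in enumerate(RUSSIAN_ALPHABET)}


def russian_key(text):
    """Sort key: each character's position in the Russian alphabet.

    Characters outside the alphabet (spaces, commas) get rank -1,
    so they sort before every letter.
    """
    return [RANK.get(ch, -1) for ch in text.lower()]


def ids_in_alphabetical_order(authors):
    """Return the ids ordered by surname under russian_key."""
    ordered = sorted(authors.items(), key=lambda item: russian_key(item[1]))
    return [author_id for author_id, _ in ordered]


if __name__ == "__main__":
    print(ids_in_alphabetical_order(AUTHORS))
```

The key compares letter by letter only, which is enough for this data set; a real collation also deals with case, punctuation weights and ties.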
There is a related question as well. The Russian alphabet differs from
the Ukrainian alphabet, so the two languages need different sort orders.
But the order provided by the Unicode charmap is not right for either of
them. This probably applies to any Cyrillic charset.
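As a rough illustration of the charmap point, the Python sketch below sorts the letters of each alphabet by raw code point and lists the ones that fall outside the contiguous run U+0410..U+042F. The constant and function names are hypothetical, and the alphabets are the modern 33-letter Russian and Ukrainian ones, which is an assumption about what "alphabetical" should mean here.

```python
# Modern alphabets in dictionary order. Letters outside U+0410..U+042F
# are written as escapes so they cannot be confused with look-alikes.
RUSSIAN = "АБВГДЕ\u0401ЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"  # \u0401 is Ё
UKRAINIAN = (
    "АБВГ\u0490ДЕ\u0404ЖЗИ\u0406\u0407ЙКЛМНОПРСТУФХЦЧШЩЬЮЯ"
)  # \u0490 is Ґ, \u0404 is Є, \u0406 is І, \u0407 is Ї


def code_point_order(alphabet):
    """Return the letters of `alphabet` sorted by Unicode code point."""
    return "".join(sorted(alphabet))


def outside_main_run(alphabet):
    """List letters that lie outside the contiguous run U+0410..U+042F."""
    return [
        f"{ch} U+{ord(ch):04X}"
        for ch in alphabet
        if not "\u0410" <= ch <= "\u042F"
    ]


if __name__ == "__main__":
    for name, alphabet in (("Russian", RUSSIAN), ("Ukrainian", UKRAINIAN)):
        print(name, "alphabet:  ", alphabet)
        print(name, "code points:", code_point_order(alphabet))
        print(name, "misplaced:  ", outside_main_run(alphabet))
```

For Russian only Ё (U+0401) is reported, because the other capital letters sit in alphabetical order inside U+0410..U+042F. For Ukrainian the letters Ґ, Є, І and Ї are reported: code point order puts Є, І, Ї before А and Ґ after Я.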
--
[WBR], Arcade. [SAT Astronomy/Think to survive!]
Attachment: database_dump
--
-- PostgreSQL database dump
--
SET client_encoding = 'UNICODE';
SET check_function_bodies = false;
SET SESSION AUTHORIZATION 'pgsql';
--
-- TOC entry 4 (OID 2200)
-- Name: public; Type: ACL; Schema: -; Owner: pgsql
--
REVOKE ALL ON SCHEMA public FROM PUBLIC;
GRANT ALL ON SCHEMA public TO PUBLIC;
SET SESSION AUTHORIZATION 'lib';
SET search_path = public, pg_catalog;
--
-- TOC entry 5 (OID 69028)
-- Name: authors; Type: TABLE; Schema: public; Owner: lib
--
CREATE TABLE authors (
    id serial NOT NULL,
    name character varying(256)
);
--
-- Data for TOC entry 9 (OID 69028)
-- Name: authors; Type: TABLE DATA; Schema: public; Owner: lib
--
COPY authors (id, name) FROM stdin;
1 Андерсон, Пол
2 Азимов, Айзек
3 Асприн, Роберт
4 Булгаков, Михаил
5 Брэдбери, Рей
6 Гамильтон, Эдмонд
7 Гаррисон, Гарри
8 Даррелл, Джеральд
9 Дойл, Артур Конан
10 Кинг, Стивен
11 Кларк, Артур
12 Лукьяненко, Сергей
13 Желязны, Роджер
14 Пол, Фредерик
15 Твен, Марк
16 Пирс, Энтони
17 Саймак, Клиффорд Дональд
18 Силверберг, Роберт
19 Фостер, Алан Дин
20 Фрай, Макс
21 Херберт, Фрэнк
22 Честертон, Гилберт Кийт
23 Энтони, Марк
\.
--
-- TOC entry 7 (OID 69031)
-- Name: authors_id; Type: INDEX; Schema: public; Owner: lib
--
CREATE INDEX authors_id ON authors USING btree (id);
--
-- TOC entry 8 (OID 69032)
-- Name: authors_name; Type: INDEX; Schema: public; Owner: lib
--
CREATE INDEX authors_name ON authors USING btree (name);
--
-- TOC entry 6 (OID 69026)
-- Name: authors_id_seq; Type: SEQUENCE SET; Schema: public; Owner: lib
--
SELECT pg_catalog.setval('authors_id_seq', 23, true);
SET SESSION AUTHORIZATION 'pgsql';
--
-- TOC entry 3 (OID 2200)
-- Name: SCHEMA public; Type: COMMENT; Schema: -; Owner: pgsql
--
COMMENT ON SCHEMA public IS 'Standard public schema';
Attachment: result_of_query
 id |           name
----+--------------------------
 10 | Кинг, Стивен
 11 | Кларк, Артур
 16 | Пирс, Энтони
 14 | Пол, Фредерик
 23 | Энтони, Марк
 19 | Фостер, Алан Дин
 20 | Фрай, Макс
 22 | Честертон, Гилберт Кийт
 13 | Желязны, Роджер
  6 | Гамильтон, Эдмонд
  7 | Гаррисон, Гарри
 12 | Лукьяненко, Сергей
 17 | Саймак, Клиффорд Дональд
 18 | Силверберг, Роберт
 15 | Твен, Марк
 21 | Херберт, Фрэнк
  1 | Андерсон, Пол
  2 | Азимов, Айзек
  3 | Асприн, Роберт
  5 | Брэдбери, Рей
  4 | Булгаков, Михаил
  8 | Даррелл, Джеральд
  9 | Дойл, Артур Конан
(23 rows)