Problems importing Unicode

comp.databases.postgresql.general


matthias@cmklein.de
 
Problems importing Unicode - 11-16-2004, 06:58 PM

I have batch files with entries such as

INSERT INTO country VALUES (248,'ALA','AX','Åland Islands');
INSERT INTO country VALUES (384,'CIV','CI','Côte d\'Ivoire');

I tried to execute them using "psql \i filename.sql"

Unfortunately, I keep getting an error message:
"ERROR: invalid byte sequence for encoding "UNICODE": 0xc56c"

How can that be possible?
My database is set to encoding "UNICODE" and so are the batch files.

Why does that not work?

Thanks

Matt


---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org


Tatsuo Ishii
 
Re: Problems importing Unicode - 11-16-2004, 07:25 PM

Quote:
I have batch files with entries such as

INSERT INTO country VALUES (248,'ALA','AX','Åland Islands');
INSERT INTO country VALUES (384,'CIV','CI','Côte d\'Ivoire');

I tried to execute them using "psql \i filename.sql"

Unfortunately, I keep getting an error message:
"ERROR: invalid byte sequence for encoding "UNICODE": 0xc56c"

How can that be possible?
My database is set to encoding "UNICODE" and so are the batch files.

Why does that not work?

I bet your batch file is not encoded in UNICODE (UTF-8).
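
One way to check this is to look at the two bytes quoted in the error message. The following is a minimal Python sketch, not part of the original thread; the helper name `is_valid_utf8` is hypothetical, and reading the bytes as Latin-1 is an assumption, since the thread never confirms the file's actual encoding.

```python
# The two bytes reported by PostgreSQL in the error message: 0xc5 0x6c.
reported = bytes([0xC5, 0x6C])

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# In UTF-8, 0xC5 opens a two-byte sequence and must be followed by a
# continuation byte in the range 0x80-0xBF; 0x6C ("l") is not one.
print(is_valid_utf8(reported))           # False

# Read as Latin-1 (an assumption), the same bytes are the start of "Åland".
print(reported.decode("latin-1"))        # Ål

# A genuinely UTF-8 file would store "Å" as the two bytes 0xC3 0x85.
print("\u00c5".encode("utf-8").hex())    # c385
```

If the file really were UTF-8, the byte 0xC5 could never be followed directly by 0x6C, which is consistent with the guess that the file uses a single-byte encoding.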
--
Tatsuo Ishii

matthias@cmklein.de
 
Re: Problems importing Unicode - 11-16-2004, 10:43 PM

Well, they were generated by MySQL, and I can open them with, e.g.,
Windows Notepad. But I don't know if they are actually encoded in
UNICODE.
Since I can open the file with Notepad and read the statements, I assume
it is not UNICODE. They look just like in the email below.

The problem is apparently the characters Å and ô, and I would really like
to know how to import those files into PostgreSQL 8.0.0.

Is there a switch I can use to do a codepage / encoding translation?

Why are MS Access or even MySQL able to read those files without trouble
but PostgreSQL reports an error?

Thanks

Matt



--- Original Message ---
Date: 17.11.2004 02:25
From: Tatsuo Ishii <t-ishii (AT) sra (DOT) co.jp>
To: matthias (AT) cmklein (DOT) de
Subject: Re: [GENERAL] Problems importing Unicode

Quote:
I have batch files with entries such as

INSERT INTO country VALUES (248,'ALA','AX','Åland Islands');
INSERT INTO country VALUES (384,'CIV','CI','Côte d\'Ivoire');

I tried to execute them using "psql \i filename.sql"

Unfortunately, I keep getting an error message:
"ERROR: invalid byte sequence for encoding "UNICODE": 0xc56c"

How can that be possible?
My database is set to encoding "UNICODE" and so are the batch files.

Why does that not work?

I bet your batch file is not encoded in UNICODE (UTF-8).
--
Tatsuo Ishii


Richard Huxton
 
Re: Problems importing Unicode - 11-17-2004, 03:20 AM

matthias (AT) cmklein (DOT) de wrote:
Quote:
Well, they were generated by MySQL, and I can open them with, e.g.,
Windows Notepad. But I don't know if they are actually encoded in
UNICODE.
Since I can open the file with Notepad and read the statements, I assume
it is not UNICODE. They look just like in the email below.

Probably some WINxxx encoding. I've seen something similar with data
from MS Access.

Quote:
The problem is apparently the characters Å and ô, and I would really like
to know how to import those files into PostgreSQL 8.0.0.

Is there a switch I can use to do a codepage / encoding translation?

Why are MS Access or even MySQL able to read those files without trouble
but PostgreSQL reports an error?

Because they're using the same WIN locale details. What you might want
to try is to set your client encoding at the top of the batch file and
see if PostgreSQL can't convert it for you.

SET CLIENT_ENCODING = WIN1250;

There's a list of encodings PG can convert for you in the manual (see
the chapter "Automatic Character Set Conversion Between Server and
Client" in the Localization section).
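
Conceptually, setting the client encoding asks the server to decode the incoming bytes from that encoding and store them in the database encoding. The same transcoding can be sketched offline in Python. This is an illustration only, not part of the original thread: the function name `transcode` and the sample bytes are hypothetical, and which encoding the file really uses is an assumption. Note that the byte 0xC5 is "Å" in Latin-1 and Windows-1252 but "Ĺ" in Windows-1250, so the encoding you name has to match the one the file was actually written in.

```python
def transcode(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

# Hypothetical sample: one string value as a Latin-1 file would store it.
raw = b"'\xc5land Islands'"

# Decoded with the matching encoding, the text comes out as intended.
print(transcode(raw, "latin-1").decode("utf-8"))   # 'Åland Islands'

# Decoded with a non-matching code page, the import succeeds but the
# character is silently wrong.
print(transcode(raw, "cp1250").decode("utf-8"))    # 'Ĺland Islands'
```

To convert a whole file this way, read it in binary mode, pass the bytes through `transcode`, and write the result to a new file.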

--
Richard Huxton
Archonet Ltd

Magnus Hagander
 
Re: Problems importing Unicode - 11-17-2004, 05:53 AM

Quote:
Well, they were generated by MySQL, and I can open them with,
e.g., Windows Notepad. But I don't know if they are
actually encoded in UNICODE.
Since I can open the file with Notepad and read the
statements, I assume it is not UNICODE. They look just like
in the email below.

Windows Notepad handles Unicode just fine, both UTF-16 (labeled Unicode
in Notepad) and UTF-8 (labeled UTF-8).
To test, open the file in Notepad, then do "File->Save As". The
"Encoding" dropdown box will default to whatever Notepad detected when
it opened the file. If it's UTF-16 and you need UTF-8, just change the
encoding and save under a different name.
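
The same check can be approximated without Notepad by looking for a byte order mark and then attempting a strict UTF-8 decode. This is a rough Python sketch, not part of the original thread; the function name `guess_encoding` is hypothetical, and the heuristic cannot tell one single-byte encoding (Latin-1, WIN1250, WIN1252) from another.

```python
import codecs

def guess_encoding(data: bytes) -> str:
    """Heuristically classify data as UTF-8, UTF-16, or something else."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown (not UTF-8)"

sample = "C\u00f4te"  # "Côte"
print(guess_encoding(sample.encode("utf-8")))     # utf-8
print(guess_encoding(sample.encode("utf-16")))    # utf-16
print(guess_encoding(sample.encode("latin-1")))   # unknown (not UTF-8)
```

A file classified as "utf-16" can then be rewritten with `data.decode("utf-16").encode("utf-8")`, which mirrors the "Save As" step described above.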

//Magnus

