Cyrillic characters

comp.databases.mysql

#1
R C Nesbit
Cyrillic characters - 03-30-2011, 12:11 PM






Saved to VarChar field in MySQL 5.1.31 doesn't work.

e.g. ?????????

Any clues as to how I can do this?

--
RobP

#2
Axel Schwenke
Re: Cyrillic characters - 03-30-2011, 12:36 PM






R C Nesbit <spam (AT) ukrm (DOT) net> wrote:
Quote:
> Saved to VarChar field in MySQL 5.1.31 doesn't work.
> e.g. ?????????

Excellent problem description!

Quote:
> Any clues as to how I can do this?

First you should RTFM:

http://dev.mysql.com/doc/refman/5.1/en/charset.html

Then you can start to check things:

- what CHARACTER SET is defined for the VARCHAR field?
- what connection character set is the client using when it inserts the
data?
- what is really stored in the table? (the HEX() function helps with
this)
- what connection character set is the client using when it selects the
data?
- what character set does the console use that the client writes the
string to?

99% of all "characters come out as ???" problems are caused by clients
incorrectly specifying the connection charset. For example, if the client
announces that it expects latin1, there is no way to convert a Cyrillic
character stored in utf8 or koi8r to latin1, so it comes out as ?.
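
The effect described above can be reproduced outside MySQL. The following is a minimal sketch that uses Python's codecs as a stand-in for the server's charset conversion (an assumption for illustration, not MySQL's own code); the sample word is hypothetical. It also shows what the HEX() check from the list would reveal: a run of 3F bytes means the characters were already lost when the data was stored.

```python
# Illustrative only: Python codecs stand in for MySQL's charset conversion.
text = "Привет"  # hypothetical sample value

# Converting to latin1 replaces every character latin1 cannot represent
# with a literal question mark.
as_latin1 = text.encode("latin-1", errors="replace")
print(as_latin1)  # b'??????'

# Roughly what HEX() would show in each case:
lost_hex = as_latin1.hex().upper()             # only 3F bytes: data already lost
intact_hex = text.encode("utf-8").hex().upper()  # two bytes per Cyrillic letter
print(lost_hex)
print(intact_hex)
```

If the stored bytes are intact and only the output shows question marks, the loss happens on the way out, which points at the connection or console charset rather than at the table.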


XL

#3
J.O. Aho
Re: Cyrillic characters - 03-30-2011, 12:36 PM



R C Nesbit wrote:
Quote:
> Saved to VarChar field in MySQL 5.1.31 doesn't work.
>
> e.g. ?????????
>
> Any clues as to how I can do this?

Check that the table is using a character set which supports the character
set that your Cyrillic data is in?

--

//Aho

#4
R C Nesbit
Re: Cyrillic characters - 03-30-2011, 03:04 PM



Axel Schwenke spoke:
Quote:
> R C Nesbit <spam (AT) ukrm (DOT) net> wrote:
>> Saved to VarChar field in MySQL 5.1.31 doesn't work.
>> e.g. ?????????
>
> Excellent problem description!

Well now you might see my problem!

When I cut n pasted the original into my newsreader it looked fine!

Now it comes out as ?????????

????????? - *that* was just cut-n-pasted from Notepad, and looks like
bog-standard Russian stuff - I bet it comes out after posting as another
?????????

Quote:
>> Any clues as to how I can do this?
>
> First you should RTFM:
>
> http://dev.mysql.com/doc/refman/5.1/en/charset.html
>
> Then you can start to check things:

And herein lies the problem.

I am trying to import transactions from eBay.
99% of stuff is UK, 1% is 'other' European languages, so French and
Spanish accented characters fail, as does Cyrillic.

#5
R C Nesbit
Re: Cyrillic characters - 03-30-2011, 03:04 PM



J.O. Aho spoke:
Quote:
>> Saved to VarChar field in MySQL 5.1.31 doesn't work.
>>
>> e.g. ?????????
>>
>> Any clues as to how I can do this?
>
> Check that the table is using a character set which supports the character
> set that your Cyrillic data is in?

See other post - 99% of the time it is standard ASCII (utf-8)

#6
J.O. Aho
Re: Cyrillic characters - 03-30-2011, 03:45 PM



R C Nesbit wrote:
Quote:
> J.O. Aho spoke:
>>> Saved to VarChar field in MySQL 5.1.31 doesn't work.
>>>
>>> e.g. ?????????
>>>
>>> Any clues as to how I can do this?
>>
>> Check that the table is using a character set which supports the character
>> set that your Cyrillic data is in?
>
> See other post - 99% of the time it is standard ASCII (utf-8)

Standard ASCII != UTF-8.

You usually get question marks if you type something in, say, ISO-8859-1 and
then try to display it as if it were UTF-8; the other way around, you get
loads of strange characters.
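
Both directions of that mismatch can be seen in a small sketch. It uses Python's codecs purely for illustration, and the sample word is an assumption. Note that Python shows the replacement character U+FFFD where many terminals and browsers would draw a question mark.

```python
word = "café"  # hypothetical sample with one accented character

# ISO-8859-1 bytes displayed as UTF-8: the lone 0xE9 byte is not valid
# UTF-8, so a placeholder is shown in its place.
latin1_bytes = word.encode("latin-1")
shown_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")
print(shown_as_utf8)    # 'caf' followed by U+FFFD

# UTF-8 bytes displayed as ISO-8859-1: each of the two bytes of "é"
# becomes a separate, unrelated character.
utf8_bytes = word.encode("utf-8")
shown_as_latin1 = utf8_bytes.decode("latin-1")
print(shown_as_latin1)  # 'cafÃ©'
```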

Your problem is that you assume the text is one charset (A), you have a
database table that uses another charset (B) and it may get even worse if you
display it in yet another charset (C).

1. Be sure you know the original charset
2. Convert the text to the charset of the database table before you store the data
3. Convert the text to the display charset after you have fetched the data
from the database.

To make things simpler, just make sure that all three use the same charset;
then you will have fewer problems.
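
The three steps above can be sketched as a manual conversion. This is only an illustration under stated assumptions: the helper name transcode is hypothetical, and the source data is assumed to arrive as koi8-r while the table and the display both use utf-8.

```python
def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the known source charset, re-encode in the target.

    Raises UnicodeDecodeError or UnicodeEncodeError instead of silently
    producing question marks when a character cannot be converted.
    """
    return raw.decode(source).encode(target)


# Step 1: know the original charset (assumed here to be koi8-r).
incoming = "Москва".encode("koi8_r")

# Step 2: convert to the charset of the table before storing (assumed utf-8).
stored = transcode(incoming, "koi8_r", "utf-8")

# Step 3: convert to the display charset after fetching. With everything
# on utf-8 this step is a no-op, which is the point of using one charset.
displayed = transcode(stored, "utf-8", "utf-8")
print(displayed.decode("utf-8"))
```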

--

//Aho

#7
Axel Schwenke
Re: Cyrillic characters - 03-31-2011, 02:38 AM



R C Nesbit <spam (AT) ukrm (DOT) net> wrote:
Quote:
> Axel Schwenke spoke:
>> R C Nesbit <spam (AT) ukrm (DOT) net> wrote:
>>> Saved to VarChar field in MySQL 5.1.31 doesn't work.
>>> e.g. ?????????
>>
>> Excellent problem description!
>
> Well now you might see my problem!
> When I cut n pasted the original into my newsreader it looked fine!
> Now it comes out as ?????????

Your original post shows this header:

Content-Type: text/plain; charset=iso-8859-1

It is quite obvious that latin-1 encoded text cannot contain
Cyrillic characters.

Quote:
> Trying to import transactions from eBay.
> 99% of stuff is UK, 1% is 'other' European languages, so French and
> Spanish accented characters fail, as does Cyrillic.

The web solved this problem a decade ago by introducing a universal
encoding. They call it "Unicode" (surprise, surprise!)
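
A short sketch of why a Unicode encoding suits the mixed data described in the quote. The sample strings and the helper name fits are assumptions made for illustration; Python's codecs are used only to test which charsets can represent which text.

```python
# Hypothetical sample values in the languages mentioned in the thread.
samples = {
    "English": "Invoice",
    "French": "Reçu",
    "Spanish": "Envío",
    "Russian": "Счёт",
}


def fits(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# ascii handles only the English sample, latin-1 adds French and Spanish,
# and utf-8 handles all four.
for charset in ("ascii", "latin-1", "utf-8"):
    print(charset, {lang: fits(value, charset) for lang, value in samples.items()})
```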


XL

#8
Álvaro G. Vicario
Re: Cyrillic characters - 03-31-2011, 02:39 AM



On 30/03/2011 22:04, R C Nesbit wrote:
Quote:
> Trying to import transactions from eBay.
> 99% of stuff is UK, 1% is 'other' European languages, so French and
> Spanish accented characters fail, as does Cyrillic.

Doesn't eBay use UTF-8?


--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My "satin-finish" humour site: http://www.demogracia.com
--

#9
Álvaro G. Vicario
Re: Cyrillic characters - 03-31-2011, 02:43 AM



On 30/03/2011 22:04, R C Nesbit wrote:
Quote:
> J.O. Aho spoke:
>>> Saved to VarChar field in MySQL 5.1.31 doesn't work.
>>>
>>> e.g. ?????????
>>>
>>> Any clues as to how I can do this?
>>
>> Check that the table is using a character set which supports the character
>> set that your Cyrillic data is in?
>
> See other post - 99% of the time it is standard ASCII (utf-8)

Standard ASCII can encode 128 characters (only 95 of which are
printable). UTF-8 can encode all 1,112,064 valid Unicode code
points. There's a difference!

http://en.wikipedia.org/wiki/ASCII
http://en.wikipedia.org/wiki/UTF-8
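
The figures above can be checked with a few lines of arithmetic. This is a minimal sketch; the variable names are chosen here for illustration.

```python
# ASCII: code points 0..127, of which 32..126 (space through "~") print.
ascii_total = 128
ascii_printable = sum(1 for cp in range(ascii_total) if chr(cp).isprintable())

# UTF-8: every code point from 0 to 0x10FFFF except the surrogate
# range 0xD800..0xDFFF, which is reserved for UTF-16.
surrogates = 0xDFFF - 0xD800 + 1
utf8_encodable = 0x110000 - surrogates

print(ascii_total, ascii_printable, utf8_encodable)
```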



#10
Axel Schwenke
Re: Cyrillic characters - 03-31-2011, 02:50 AM



"J.O. Aho" <user (AT) example (DOT) net> wrote:
Quote:
> Your problem is that you assume the text is one charset (A), you have a
> database table that uses another charset (B) and it may get even worse if you
> display it in yet another charset (C).

All this could work, provided the characters used are available
in A, B and C.

Quote:
> 1. Be sure you know the original charset
> 2. Convert the text to the charset of the database table before you store the data

No need for that with MySQL (4.1 and later).

Quote:
> 3. Convert the text to the display charset after you have fetched the data
> from the database.

No need for that with MySQL (4.1 and later).

With MySQL you just have to *declare* what character set you use
to send data and what character set you want to get results in.
Then MySQL will automagically recode all strings back and forth.

For example, the field in the table could be declared ucs2. A client could
insert data in koi8r encoding and MySQL will automatically convert
that to ucs2. Another client might insert data from the ASCII range
and use latin1 to send that data. MySQL would convert latin1->ucs2.

Still another client could select the data but ask to get utf8
(e.g. to include it in a web page). Then MySQL would convert ucs2
to utf8 automatically. The (obvious) requirement is that
clients correctly announce which encoding they use.
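
The round trip described above can be simulated without a server. This is a rough sketch under stated assumptions: the function names server_store and server_fetch are hypothetical, Python's codecs stand in for MySQL's conversion, and utf-16-be stands in for ucs2 (the two agree for characters in the Basic Multilingual Plane). A real client would declare its charset to the server instead, for instance with a SET NAMES statement.

```python
COLUMN_CHARSET = "utf-16-be"  # stand-in for a column declared as ucs2


def server_store(client_bytes: bytes, announced: str) -> bytes:
    """Convert from the charset the client announced to the column charset."""
    return client_bytes.decode(announced).encode(COLUMN_CHARSET)


def server_fetch(stored: bytes, requested: str) -> bytes:
    """Convert from the column charset to the one the client asked for.

    Characters missing from the requested charset come back as "?".
    """
    return stored.decode(COLUMN_CHARSET).encode(requested, errors="replace")


# One client sends koi8r and says so; another sends ASCII-range data as latin1.
row_ru = server_store("Привет".encode("koi8_r"), announced="koi8_r")
row_en = server_store(b"hello", announced="latin-1")

# A third client asks for utf8, e.g. for a web page.
page_ru = server_fetch(row_ru, requested="utf-8").decode("utf-8")
page_en = server_fetch(row_en, requested="utf-8").decode("utf-8")
print(page_ru, page_en)

# A client that expects latin1 gets question marks for the Cyrillic row.
as_latin1 = server_fetch(row_ru, requested="latin-1")

# A client that sends koi8r bytes but announces latin1 stores garbage.
mis_announced = server_store("Привет".encode("koi8_r"), announced="latin-1")
```

The last line shows the failure mode the requirement guards against: the conversion still succeeds, but it converts the wrong characters.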


XL
