dbTalk Databases Forums  

Character set, collation question

comp.databases.mysql comp.databases.mysql


Discuss Character set, collation question in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Joe Hesse
 
Posts: n/a

Default Character set, collation question - 06-09-2011 , 08:00 PM






Some of the tables in my database have different character sets and/or
collation. The database is working fine but I am worried about some
subtle problems occurring.

I tried:
ALTER DATABASE mydb CHARACTER SET latin1 COLLATE latin1_swedish_ci;
as this, according to the docs, is the default character set and
collation.

After doing this I noticed that the change was not applied to all my
tables. I checked this by importing the model into MySQL Workbench.

Any comments about this would be appreciated.

Reply With Quote
  #2  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Character set, collation question - 06-09-2011 , 08:20 PM






On 6/9/2011 9:00 PM, Joe Hesse wrote:
Quote:
Some of the tables in my database have different character sets and/or
collation. The database is working fine but I am worried about some
subtle problems occurring.

I tried:
ALTER DATABASE mydb CHARACTER SET latin1 COLLATE latin1_swedish_ci;
as this, according to the docs, is the default character set and
collation.

After doing this I noticed that the change was not applied to all my
tables. I checked this by importing the model into MySQL Workbench.

Any comments about this would be appreciated.
All that does is set the default collation and charset for the database.
These are used when new tables are created without their own charset
and/or collation. See
http://dev.mysql.com/doc/refman/5.5/...-database.html

If you want to change the charset and collation in a table, you must
change it for each column in the table. See
http://dev.mysql.com/doc/refman/5.5/...et-column.html

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #3  
Old   
Joe Hesse
 
Posts: n/a

Default Re: Character set, collation question - 06-09-2011 , 09:43 PM



On Thu, 09 Jun 2011 21:20:43 -0400, Jerry Stuckle wrote:

Quote:
On 6/9/2011 9:00 PM, Joe Hesse wrote:
Some of the tables in my database have different character sets and/or
collation. The database is working fine but I am worried about some
subtle problems occurring.

I tried:
ALTER DATABASE mydb CHARACTER SET latin1 COLLATE latin1_swedish_ci; as
this, according to the docs, is the default character set and
collation.

After doing this I noticed that the change was not applied to all my
tables. I checked this by importing the model into MySQL Workbench.

Any comments about this would be appreciated.

All that does is set the default collation and charset for the database.
These are used when new tables are created without their own charset
and/or collation. See
http://dev.mysql.com/doc/refman/5.5/...-database.html

If you want to change the charset and collation in a table, you must
change it for each column in the table. See
http://dev.mysql.com/doc/refman/5.5/...et-column.html
I was thinking that I could change every table in my database to the
desired charset and collation by creating a dump file and then edit the
CREATE TABLE statements, after the column list. The editing could be
done with sed. It seems to me that this would change the charset and
collation for every table without dealing with individual columns. I am
still new to MySQL and am nervous about subtle problems.
Thank you,
Joe

Reply With Quote
  #4  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Character set, collation question - 06-09-2011 , 09:46 PM



On 6/9/2011 10:43 PM, Joe Hesse wrote:
Quote:
On Thu, 09 Jun 2011 21:20:43 -0400, Jerry Stuckle wrote:

On 6/9/2011 9:00 PM, Joe Hesse wrote:
Some of the tables in my database have different character sets and/or
collation. The database is working fine but I am worried about some
subtle problems occurring.

I tried:
ALTER DATABASE mydb CHARACTER SET latin1 COLLATE latin1_swedish_ci; as
this, according to the docs, is the default character set and
collation.

After doing this I noticed that the change was not applied to all my
tables. I checked this by importing the model into MySQL Workbench.

Any comments about this would be appreciated.

All that does is set the default collation and charset for the database.
These are used when new tables are created without their own charset
and/or collation. See
http://dev.mysql.com/doc/refman/5.5/...-database.html

If you want to change the charset and collation in a table, you must
change it for each column in the table. See
http://dev.mysql.com/doc/refman/5.5/...et-column.html

I was thinking that I could change every table in my database to the
desired charset and collation by creating a dump file and then edit the
CREATE TABLE statements, after the column list. The editing could be
done with sed. It seems to me that this would change the charset and
collation for every table without dealing with individual columns. I am
still new to MySQL and am nervous about subtle problems.
Thank you,
Joe


Yes, you could do that. However, make a copy of your dump before doing it.

I don't like doing it that way unless necessary, though. You can run
into all kinds of problems, like foreign key constraints not being met,
etc. It often matters what order you restore the database, and it's
generally not the order it was dumped.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #5  
Old   
Axel Schwenke
 
Posts: n/a

Default Re: Character set, collation question - 06-10-2011 , 01:15 AM



Joe Hesse <JoeHesse (AT) gmail (DOT) com> wrote:

Quote:
Some of the tables in my database have different character sets and/or
collation. The database is working fine but I am worried about some
subtle problems occurring.
What "subtle problems" would that be?

Quote:
ALTER DATABASE mydb CHARACTER SET latin1 COLLATE latin1_swedish_ci;
as this, according to the docs, is the default character set and
collation.

After doing this I noticed that the change was not applied to all my
tables.
You certainly don't want this statement to change the character set
and collation of all fields in all tables in that database. And indeed
it does not do it, but only set the default for this database.

You can alter one (or many) fields in a table with an ALTER TABLE
statement. RTFM!

If you change the collation of a column, then all indexes on that
column have to be rebuilt! If you change the charset, then also the
values in the rows will be converted from old charset to the new one.
This operation may fail - depending on your data. I.e. if you convert
from utf8 to latin1 then there might be unconvertible characters in
your data ...

Please make sure to read and understand the manual chapter about the
connection character set and collation.


XL

Reply With Quote
  #6  
Old   
Joe Hesse
 
Posts: n/a

Default Re: Character set, collation question - 06-10-2011 , 08:08 AM



On Thu, 09 Jun 2011 22:46:33 -0400, Jerry Stuckle wrote:

Quote:
On 6/9/2011 10:43 PM, Joe Hesse wrote:
On Thu, 09 Jun 2011 21:20:43 -0400, Jerry Stuckle wrote:

On 6/9/2011 9:00 PM, Joe Hesse wrote:
Some of the tables in my database have different character sets
and/or collation. The database is working fine but I am worried
about some subtle problems occurring.

I tried:
ALTER DATABASE mydb CHARACTER SET latin1 COLLATE latin1_swedish_ci;
as this, according to the docs, is the default character set and
collation.

After doing this I noticed that the change was not applied to all my
tables. I checked this by importing the model into MySQL Workbench.

Any comments about this would be appreciated.

All that does is set the default collation and charset for the
database.
These are used when new tables are created without their own
charset
and/or collation. See
http://dev.mysql.com/doc/refman/5.5/...-database.html

If you want to change the charset and collation in a table, you must
change it for each column in the table. See
http://dev.mysql.com/doc/refman/5.5/...et-column.html

I was thinking that I could change every table in my database to the
desired charset and collation by creating a dump file and then edit the
CREATE TABLE statements, after the column list. The editing could be
done with sed. It seems to me that this would change the charset and
collation for every table without dealing with individual columns. I
am still new to MySQL and am nervous about subtle problems. Thank you,
Joe



Yes, you could do that. However, make a copy of your dump before doing
it.

I don't like doing it that way unless necessary, though. You can run
into all kinds of problems, like foreign key constraints not being met,
etc. It often matters what order you restore the database, and it's
generally not the order it was dumped.
I have read the fine documentation on character sets and collation. It
seems theoretical and I'm still not sure if I will wreck something.

All the characters in my tables are, for want of a better term,
"American", no foreign characters. There are a foreign keys but they are
INT.

I plan on doing, with backups of course:

ALTER TABLE MyTable CHARACTER SET=utf8, COLLATE=utf8_swedish_ci ;

for each of the tables in my database. From what I have read,
"CHARACTER SET = utf8 , COLLATE =utf8_swedish_ci"
is the best choice.

If you see a potential problem in what I am doing, please let me know.

Thank you everyone for sharing your technical knowledge.

Reply With Quote
  #7  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Character set, collation question - 06-10-2011 , 11:37 AM



On 6/10/2011 9:08 AM, Joe Hesse wrote:
Quote:
On Thu, 09 Jun 2011 22:46:33 -0400, Jerry Stuckle wrote:

On 6/9/2011 10:43 PM, Joe Hesse wrote:
On Thu, 09 Jun 2011 21:20:43 -0400, Jerry Stuckle wrote:

On 6/9/2011 9:00 PM, Joe Hesse wrote:
Some of the tables in my database have different character sets
and/or collation. The database is working fine but I am worried
about some subtle problems occurring.

I tried:
ALTER DATABASE mydb CHARACTER SET latin1 COLLATE latin1_swedish_ci;
as this, according to the docs, is the default character set and
collation.

After doing this I noticed that the change was not applied to all my
tables. I checked this by importing the model into MySQL Workbench.

Any comments about this would be appreciated.

All that does is set the default collation and charset for the
database.
These are used when new tables are created without their own
charset
and/or collation. See
http://dev.mysql.com/doc/refman/5.5/...-database.html

If you want to change the charset and collation in a table, you must
change it for each column in the table. See
http://dev.mysql.com/doc/refman/5.5/...et-column.html

I was thinking that I could change every table in my database to the
desired charset and collation by creating a dump file and then edit the
CREATE TABLE statements, after the column list. The editing could be
done with sed. It seems to me that this would change the charset and
collation for every table without dealing with individual columns. I
am still new to MySQL and am nervous about subtle problems. Thank you,
Joe



Yes, you could do that. However, make a copy of your dump before doing
it.

I don't like doing it that way unless necessary, though. You can run
into all kinds of problems, like foreign key constraints not being met,
etc. It often matters what order you restore the database, and it's
generally not the order it was dumped.

I have read the fine documentation on character sets and collation. It
seems theoretical and I'm still not sure if I will wreck something.

All the characters in my tables are, for want of a better term,
"American", no foreign characters. There are a foreign keys but they are
INT.

I plan on doing, with backups of course:

ALTER TABLE MyTable CHARACTER SET=utf8, COLLATE=utf8_swedish_ci ;

for each of the tables in my database. From what I have read,
"CHARACTER SET = utf8 , COLLATE =utf8_swedish_ci"
is the best choice.

If you see a potential problem in what I am doing, please let me know.

Thank you everyone for sharing your technical knowledge.




That should work, as long as you're actually using an "American" charset.

As for foreign keys - the problem isn't if they are integers or not
(although that also can cause a problem if you need to change the
charset of a foreign key). The problem comes in when you restore the
data - if table B has a foreign key referencing Table A, you must ensure
that table A has been populated first (if you're using INNOdb, that is).
But backups don't generally take this into account.

There are alternatives, but that's beyond the scope of your question so
I'll stop here.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #8  
Old   
Axel Schwenke
 
Posts: n/a

Default Re: Character set, collation question - 06-10-2011 , 12:49 PM



Joe Hesse <JoeHesse (AT) gmail (DOT) com> wrote:

Quote:
All the characters in my tables are, for want of a better term,
"American", no foreign characters.
That would be the "ascii" character set.

Quote:
I plan on doing, with backups of course:

ALTER TABLE MyTable CHARACTER SET=utf8, COLLATE=utf8_swedish_ci ;
Why? Why do you want to sort and compare by Swedish rules?

Quote:
for each of the tables in my database. From what I have read,
"CHARACTER SET = utf8 , COLLATE =utf8_swedish_ci"
is the best choice.
It's not. Certainly not when you plan to store only characters
from the ASCII range.

utf8 in MySQL can be 1-3 bytes per char. Storage limits like max
row length (64K) and max index length (1000 bytes) are in bytes,
not characters. When you specify a column to use utf8, then MySQL
will assume the worst case when it checks the limits. So storing
ascii in utf8 fields will limit you to 1/3 of the actual limits.


XL

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.