Published on Technical articles on: Windows servers, Apache Web Server, MySQL, PHP, IIS (http://www.sitebuddy.com)

MySQL character sets and collation explained

By chris
Created 4 Dec 2006 - 3:32am

Definitions:
character set: a set of symbols and their encodings.
collation: a set of rules for comparing the characters in a character set.

MySQL's "Western European" character sets cover most West European languages. Among them, latin1 is the default character set and latin1_swedish_ci is its default collation. This is the most common setting for MySQL users.


There are default settings for character sets and collations at four levels: server, database, table, and column. This may appear complex, but in practice multiple-level defaulting leads to natural and obvious results: each level inherits the setting of the level above it unless it specifies its own.
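The four-level defaulting can be pictured with a small Python sketch. This is only an illustration of the inheritance idea, not MySQL's implementation: the function names resolve_charset and resolve are hypothetical, the collation table is a tiny subset, and the sketch resolves settings at lookup time, whereas MySQL fixes the inherited default when the database, table, or column is created.

```python
# Default collation per character set (illustrative subset only).
DEFAULT_COLLATION = {
    "latin1": "latin1_swedish_ci",
    "utf8": "utf8_general_ci",
}


def resolve_charset(server, database=None, table=None, column=None):
    """Return the most specific character set that was set explicitly.

    Levels are checked from most specific (column) to least specific
    (database); if none is set, the server default applies.
    """
    for explicit in (column, table, database):
        if explicit is not None:
            return explicit
    return server


def resolve(server="latin1", **levels):
    """Return (character set, default collation) for the given levels."""
    charset = resolve_charset(server, **levels)
    return charset, DEFAULT_COLLATION[charset]


# Nothing set anywhere: the server default applies.
nothing_set = resolve()
# The database overrides the server default for everything inside it.
db_set = resolve(database="utf8")
# A column-level setting wins over the database-level one.
column_set = resolve(database="utf8", column="latin1")

print(nothing_set, db_set, column_set)
```

The point of the sketch is that a setting only needs to be given at the level where it differs; everything below that level follows it.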

The server-level default can be changed in your my.ini configuration file with the default-character-set directive (--default-character-set=character_set_name). If it is not specified, the default is latin1, and the default collation for latin1 is latin1_swedish_ci.


Index values for a column are stored in sorted order, according to the collating sequence of the character set in effect when the index was created. If the server's character set is changed after the tables have been created, index-based queries might not work correctly, because different character sets have different collating sequences. You will then need to rebuild the indexes, which usually means the following:

First dump the table using mysqldump, then drop the table, and finally reload it from the dump file. The indexes are rebuilt as the file is loaded.
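To see why an index built under one collating sequence can misbehave under another, here is a rough Python sketch. It assumes a toy "index" that is just a sorted list searched by binary search, and two stand-in collations (binary_key and case_insensitive_key are hypothetical helpers, not MySQL's real collation rules). The rebuild step plays the role of the dump and reload described above.

```python
from bisect import bisect_left


def binary_key(s):
    """Stand-in for a binary collation: compare by code point."""
    return s


def case_insensitive_key(s):
    """Stand-in for a case-insensitive (_ci) collation."""
    return s.lower()


def build_index(values, key):
    """Build a toy index: the values sorted by the collation's sort key."""
    return sorted(values, key=key)


def index_lookup(index, value, key):
    """Binary-search the index, assuming it is sorted according to `key`."""
    keys = [key(v) for v in index]
    pos = bisect_left(keys, key(value))
    return pos < len(keys) and keys[pos] == key(value)


names = ["Zebra", "apple", "Mango"]

# Index created while the binary collation is in effect.
index = build_index(names, binary_key)
found_before = index_lookup(index, "apple", binary_key)

# Collation changes, but the index is still in the old order:
# the search looks in the wrong place and misses an existing value.
found_after_change = index_lookup(index, "apple", case_insensitive_key)

# Rebuilding the index under the new collation fixes the lookup.
rebuilt = build_index(names, case_insensitive_key)
found_after_rebuild = index_lookup(rebuilt, "apple", case_insensitive_key)

print(index, found_before, found_after_change)
print(rebuilt, found_after_rebuild)
```

The value "apple" is present the whole time; it only becomes unreachable because the search assumes an ordering the stored index no longer matches.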

MySQL 4.1 is supposed to re-index your tables automatically if you change the character set (to be confirmed).

Ref:

http://dev.mysql.com/doc/refman/4.1/en/charset-syntax.html

 


Source URL:
http://www.sitebuddy.com/mssql_info/mysql_character_sets_and_collation_explained