Convert MySQL database from one encoding/collation into another
8th February 2008
Most frequent use: convert database from latin1_swedish to utf8_general_ci.
Original script found at: MySQL and UTF-8.
Update: the original script had an error, it would generate queries likes this one (note the bold part):
ALTER TABLE `links` CHANGE `link_rel` `link_rel` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT NULL;
This is clearly wrong syntax (and logic). I fixed this by making comparison to NULL strict (with three equal signs instead of two):
// Does the field default to null, a string, or nothing?
if ($row['Default'] === NULL)
Update 2: based on comment by banesto, I modified the script; now it does not require specifying the from_collation, it’s sufficient to specify to_collation (which will be used for all the fields and tables). The modified code is:
if ($row['Collation'] == ” || $row['Collation'] == $convert_to)
continue;
Update 3: the long-lasting, re-appearing NOT NULL DEFAULT NULL problem is finally fixed.
Update 4: incorporated Russ’s fix to skip numeric fields (in order to leave autoincrement values intact).
Here’s the script itself: (to copy-paste: first click the “Plain text” header)
- <?php
- // Original script (v1.0) by/from: http://www.phpwact.org/php/i18n/utf-8/mysql
- // Improved/modified (v1.05) by Bogdan http://bogdan.org.ua/
- // Last updated: 2010-06-28
- // this script will output all queries needed to change all fields/tables to a different collation
- // it is HIGHLY suggested you make a MySQL backup prior to running any of the generated queries
- // this code is provided AS IS and without any warranty
- // add text/plain header when used not as a CLI script; PHP CLI SAPI shouldn't output headers
- header("Content-Type: text/plain");
- // precaution
- die("Make a backup of your MySQL database, then remove this line from the code!");
- set_time_limit(0);
- // collation you want to change to:
- $convert_to = 'utf8_general_ci';
- // character set of new collation:
- $character_set= 'utf8';
- // DB login information - *modify before use*
- $username = 'user';
- $password = 'pass';
- $database = 'database_name';
- $host = 'localhost';
- //-- usually, there is nothing to modify below this line --//
- // show TABLE alteration queries?
- $show_alter_table = true;
- // show FIELD alteration queries?
- $show_alter_field = true;
- mysql_connect($host, $username, $password);
- mysql_select_db($database);
- $rs_tables = mysql_query(" SHOW TABLES ") or die(mysql_error());
- while ($row_tables = mysql_fetch_row($rs_tables)) {
- $table = mysql_real_escape_string($row_tables[0]);
- // Alter table collation
- // ALTER TABLE `account` DEFAULT CHARACTER SET utf8
- if ($show_alter_table)
- echo("ALTER TABLE `$table` DEFAULT CHARACTER SET $character_set;\n");
- $rs = mysql_query(" SHOW FULL FIELDS FROM `$table` ") or die(mysql_error());
- while ( $row = mysql_fetch_assoc($rs) ) {
- if ( $row['Collation'] == '' || $row['Collation'] == $convert_to )
- continue;
- // Is the field allowed to be null?
- if ( $row['Null'] == 'YES' )
- $nullable = ' NULL ';
- else
- $nullable = ' NOT NULL ';
- // Does the field default to null, a string, or nothing?
- if ( $row['Default'] === NULL && $row['Null'] == 'YES' )
- $default = " DEFAULT NULL ";
- elseif ( $row['Default'] != '' )
- $default = " DEFAULT '".mysql_real_escape_string($row['Default'])."'";
- else
- $default = '';
- // sanity check and fix:
- if ($nullable == ' NOT NULL ' && $default == ' DEFAULT NULL ') {
- $default = '';
- echo "/* Warning: wrong combination of 'default value' and 'NULL-flag' detected - and fixed! */\n";
- echo "/* Diagnostics: row[Null] = '$row[Null]', row[Default] = " . mysql_real_escape_string($row['Default']) . ", MySQL version: " . mysql_get_server_info() . " */\n";
- }
- // Don't alter INT columns: no collations, and altering them drops autoincrement values
- if (strpos($row['Type'], 'int') !== false) {
- $show_alter_field = False;
- }
- else {
- $show_alter_field = True;
- }
- // Alter field collation:
- // ALTER TABLE `tab` CHANGE `field` `field` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
- if ($show_alter_field) {
- $field = mysql_real_escape_string($row['Field']);
- echo "ALTER TABLE `$table` CHANGE `$field` `$field` $row[Type] CHARACTER SET $character_set COLLATE $convert_to $nullable $default;\n";
- }
- }
- }
- ?>
June 22nd, 2008 at 1:47
BIG thanks for this script. i updated it a bit and it was just priceless. saved several hours of my life
June 22nd, 2008 at 21:27
there are several things:
1) before setting new character set all fields have to be changed to binary equivalents
2) it’s not so good to depend on “convert_from”, because db can contain many collations and script will take effect only in one of them. i think it’s more convenient to check if ($row['Collation']!=”)
June 24th, 2008 at 9:26
Banesto,
1) why convert to binary?
2) do you think it’s a good optional solution to just provide the configuration for “target” character set, and let the script auto-detect the original set? that sounds good to me…
June 26th, 2008 at 21:14
1) because if i leave original type all data gets corrupted
2) because there’s usually need to get one collation for whole db and not to separate them. and it requires me to know what is the original collation in db is right now.
June 27th, 2008 at 10:35
Banesto,
I changed the script a little bit by removing the source_collation and adding the check of row[collation] as you suggested.
As for the conversion to binary types – I didn’t do that, in part because I’ve no idea on what are the best binary equivalents to all the column types in MySQL, and in part due to the lack of desire to dig into that
September 20th, 2008 at 20:18
So, I made a php file and put it in my root folder then loaded it in my browser. It output a page like:
“ALTER TABLE `access` DEFAULT CHARACTER SET utf8;
ALTER TABLE `access` CHANGE `mask` `mask` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ; ”
etc. It shows this for each table and field.
But when I actually look at my database nothing has been changed. It’s the same encoding/collation as it was to begin with.
What am I doing wrong?
September 20th, 2008 at 20:42
Tim,
the script itself doesn’t modify the database – it only produces all the necessary queries.
You have to (manually) copy-paste all the generated queries into the “SQL” text field of your favourite MySQL administration/management software, and “Run” (or how the execute command is named in your software). E.g. in PhpMyAdmin you’d go to the database structure page, then choose the SQL tab, paste all the queries, and run them.
September 20th, 2008 at 23:08
Thanks,
It seems to be working. I get errors like this:
“SQL query:
ALTER TABLE `comments` CHANGE `thread` `thread` varchar( 255 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT NULL ;
MySQL said: Documentation
#1067 – Invalid default value for ‘thread’ ”
then I change that one manually and rerun the remainder of the queries that start after the erroneous line and repeat.
September 22nd, 2008 at 21:21
hello
my problem is i have taken backup from mysql>data folder -complete folder of a database
and now copy it to different pc
means
copy from 1PC >Mysql>Data > and copy to 2ndPC >mysql>data
now when i am browsing it from phpmyadmin , its shwoing the database and table but when i am browsing table its saying table doesnt exist.
Collation type show in use ?
Do you have any idea , which can help me ?
Raj
September 22nd, 2008 at 22:17
Raj,
this isn’t MySQL support forum, and your issue doesn’t seem to be collation-related.
However, I would recommend that you use specific tools for MySQL DB export/import, and not just copy data folders.
If you do copy files – check that MySQL versions are the same, configuration is also the same, and try running the MySQL’s “repair” query on all the “copied” files. Might help, but still not a good way.
September 24th, 2008 at 23:10
This is probably a big ol’ newbie problem, but the script doesn’t work for me: I’m getting “access denied” errors.
Specifically, I cut and pasted the script into a file, backed up the database and removed that line, and then uploaded the script to the top level of my Drupal install and went to that URL in Firefox, where I got the following errors:
Warning: mysql_connect() [function.mysql-connect]: Access denied for user ‘user’@'localhost’ (using password: YES) in /home/manatee/public_html/acip/collation.php on line 21
Warning: mysql_select_db() [function.mysql-select-db]: Access denied for user ‘manatee’@'localhost’ (using password: NO) in /home/manatee/public_html/acip/collation.php on line 22
Warning: mysql_select_db() [function.mysql-select-db]: A link to the server could not be established in /home/manatee/public_html/acip/collation.php on line 22
Warning: mysql_query() [function.mysql-query]: Access denied for user ‘manatee’@'localhost’ (using password: NO) in /home/manatee/public_html/acip/collation.php on line 23
Warning: mysql_query() [function.mysql-query]: A link to the server could not be established in /home/manatee/public_html/acip/collation.php on line 23
Access denied for user ‘manatee’@'localhost’ (using password: NO)
Am I missing something obvious?
Thanks,
Andrea
September 25th, 2008 at 16:40
Andrea,
please triple-check these configuration values:
[PHP]$username = ‘user’; $password = ‘pass’; $database = ‘table’; $host = ‘localhost’;[/PHP]
October 7th, 2008 at 15:35
Hi Bogdan,
nice script! Looks like a life-saver.
If you should decide to dive into the whole data-conversion aspect of this, here is a little cheat sheet of how I have converted data before.
char, varchar and text -> blob
mediumtext -> mediumblob
longtext -> longblob
(I have not used tinyblob but it might be ok for char and varchar)
There are two ways that the existing data may need conversion. This is something that the user must know and decide. I am not aware of any way to detect it, except by the errors if you do it wrong
You would think that an existing latin1 database would have latin1 data in it. This is what scenario shows.
Common though is that you can have php set up to use utf8 but the database remains in latin1. Any existing data will then me utf8 squeezed into a latin1 table. This needs a slightly different conversion as in the second scenario.
# case where data was enterd correctly as latin1 and later needs conversion to utf8
SET NAMES latin1;
drop table people;
CREATE TABLE people (
id int(11) unsigned NOT NULL auto_increment,
name varchar(64) default NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO people VALUES
(1,’user_a’),
(2,’user_Ã¥’),
(3,’user_ä’);
select * from people;
SET NAMES utf8;
alter table people engine=MyISAM default character set utf8;
ALTER TABLE people CHANGE name name BLOB;
ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET utf8;
select * from people;
# case where utf8 data was enterd into a table with latin1 and later needs conversion to utf8
SET NAMES latin1;
drop table people;
CREATE TABLE people (
id int(11) unsigned NOT NULL auto_increment,
name varchar(64) default NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
SET NAMES utf8;
INSERT INTO people VALUES
(1,’user_a’),
(2,’user_Ã¥’),
(3,’user_ä’);
select * from people;
alter table people engine=MyISAM default character set utf8;
ALTER TABLE people CHANGE name name BLOB;
ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET latin1;
ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET utf8;
select * from people;
I’ll get back to you if I manage to implement something like this in your script.
October 7th, 2008 at 16:11
Hi again,
I got a bit carried away and did some tweaks to your script to support the two data conversions described above.
There may be cases where it does not work but it works when I have tested it on a bunch of conversions.
in the configuration at the top:
// conversion of data for either normal data or “messed up” data
// either leave empty or choose convert or rescue
$data_conversion = ”;
and then at the bottom in the field conversion part.
// Alter field collation:
// ALTER TABLE `tab` CHANGE `fiel` `fiel` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
if ($show_alter_field) {
$field = mysql_real_escape_string($row['Field']);
// charset conversion table for data conversions
$types = array(
‘default’=>’blob’,
‘mediumtext’ => ‘mediumblob’,
‘longtext’ => ‘longblob’
);
if ( !empty($data_conversion) && strpos($row['Collation'], $character_set) !== 0 ) {
if ( array_key_exists($row['Type'],$types) ) {
$binary = $types[$row['Type']];
} else {
$binary = $types['default'];
}
echo “ALTER TABLE `$table` CHANGE `$field` `$field` $binary $nullable $default; \r\n”;
if ( $data_conversion == ‘rescue’ ) {
$old_charset = substr($row['Collation'],0,strpos($row['Collation'],’_'));
echo “ALTER TABLE `$table` CHANGE `$field` `$field` $row[Type] CHARACTER SET $old_charset $nullable $default; \r\n”;
}
}
echo “ALTER TABLE `$table` CHANGE `$field` `$field` $row[Type] CHARACTER SET $character_set COLLATE $convert_to $nullable $default; \r\n”;
}
October 7th, 2008 at 16:24
I found another bug. Your script will actually remove the “auto_increment” attribute.
I didn’t know how to read this attribute from the current row to put it back, so I worked round it by modding the script to ignore numeric fields as these do not have a collation anyway.
October 7th, 2008 at 16:51
Dear Martin and Russ,
thank you for valuable feedback!
With your permission, I’ll modify the script to take into account your code and suggestions.
(I’m at work right now, so most probably will update the script code late tonight.)
Update: still not done, but now in the process of updating code.
October 10th, 2008 at 21:23
Martin,
why would you need these conversions?
char, varchar and text -> blob
mediumtext -> mediumblob
longtext -> longblob
blobs are for binary data, so why put the text into binary fields?
October 10th, 2008 at 23:58
When you change the encoding, any existing data will be missinterpreted whenever the charsets differ. Converting the field via a binary type will allow you to trick MySQL into reinterpreting the “meaning” of the data.
This will not affect A-Z but other (accented) characters.
October 11th, 2008 at 0:39
Sorry. My last post was a bit crippled from being written on my phone in the car.
There are probably many people that can explain the details of why you do it this way. But what I know is that if you can to “convert” existing data from one Character set to another in MySQL, you do it by making the field a binary (and thus tricking MySQL to reinterpret the raw data stored when you bring it back into a text field in another character set).
I have implemented two such conversions for two common situations. But you can play around with this quite a bit if you have some really messed up data.
December 6th, 2008 at 17:56
Here is my version of the script if it is of any use to anyone, I made a couple of mods to get around the NULL issues and the auto_increment issue I mentioned before.
<?php
// original script (v1.0) by/from: http://www.phpwact.org/php/i18n/utf-8/mysql
// improved/modified (v1.02) by Bogdan Tokovenko http://bogdan.org.ua/
// this script will output all queries needed to change all fields/tables to a different collation
// it is HIGHLY suggested you take a MySQL dump prior to running any of the generated queries
// this code is provided AS IS and without any warranty
set_time_limit(0);
// collation you want to change to:
$convert_to = ‘utf8_general_ci’;
// character set of new collation:
$character_set= ‘utf8′;
$show_alter_table = true;
$show_alter_field = true;
// DB login information – modify before use
$username = ‘username’;
$password = ‘password’;
$database = ‘database_name’;
$host = ‘localhost’;
//– nothing to modify below this line –//
mysql_connect($host, $username, $password);
mysql_select_db($database);
$rs_tables = mysql_query(” SHOW TABLES “) or die(mysql_error());
print ”;
while ($row_tables = mysql_fetch_row($rs_tables)) {
$table = mysql_real_escape_string($row_tables[0]);
// Alter table collation
// ALTER TABLE `account` DEFAULT CHARACTER SET utf8
if ($show_alter_table)
echo(“ALTER TABLE `$table` DEFAULT CHARACTER SET $character_set;\r\n”);
$rs = mysql_query(” SHOW FULL FIELDS FROM `$table` “) or die(mysql_error());
while ($row=mysql_fetch_assoc($rs)) {
if ($row['Collation'] == ” || $row['Collation'] == $convert_to)
continue;
// Is the field allowed to be null?
if ( $row['Null'] == ‘YES’ )
$nullable = ‘ NULL ‘;
else
$nullable = ‘ NOT NULL’;
// Does the field default to null, a string, or nothing?
if ( $row['Default'] === NULL )
$default = ” DEFAULT NULL”;
else if ($row['Default']!=”)
$default = ” DEFAULT ‘”.mysql_real_escape_string($row['Default']).”‘”;
else
$default = ”;
//don’t alter int columns as they do not have a collation anyway and it buggers up the extra parameters
if (strpos($row[Type], ‘int’) !== false)
$show_alter_field = false;
else
$show_alter_field = true;
// Alter field collation:
// ALTER TABLE `tab` CHANGE `fiel` `fiel` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
if ($show_alter_field) {
$field = mysql_real_escape_string($row['Field']);
echo “ALTER TABLE `$table` modify `$field` $row[Type] COLLATE $convert_to ; \r\n”;
}
}
}
?>
January 27th, 2009 at 22:40
Is anyone still following this string? Have searched high and low for a solution and this might be it. Would like to build a simple website with Joomla 1.5.9 adding a few components such as blogs, forums, profile. The problem I am encountering is what you have been discussing here. I’ve loaded everything tested, but my East Asian fonts come out as ?????. I’ve converted the database and table to utf8_general_ci, but can’t get the columns with a manual for each individual column using the ALTER command. That would take days and I think a problem with the binary…. If it makes a difference, MySQL = 4.1.22, Server Version 5.0.67
I’ve tried the scriptd by Bogdan and Russ as well as the link supplied by Bogdan at the top. Since I really know nothing about MySQL I do not understand how to fix the error messages I’m getting:
1. FIRST ERROR at top and down.)
ERROR: Unknown Punctuation String @ 1
STR: <?
SQL: <?php
// this script will output the queries need to change all fields/tables to a different collation
// it is HIGHLY suggested you take a MySQL dump prior to running any of the generated
// this code is provided as is and without any warranty
set_time_limit(0);<?php
2. SECOND ERROR at bottom)
SQL query:
<?php // this script will output the queries need to change all fields/tables to a different collation // it is HIGHLY suggested you take a MySQL dump prior to running any of the generated // this code is provided as is and without any warranty set_time_limit(0);
MySQL said: Documentation
#1064 – You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘<?php
// this script will output the queries need to change all fields/table’ at line 1
January 27th, 2009 at 23:56
Paulo,
1. did you really mean to say that your MySQL client is at version 4.1.22, and Server’s Version is 5.0.67? this is weird, and if it is really the case, then it could be causing you miscellaneous troubles.
2. instead of converting the database, try to set it to utf8_general_ci *before* Joomla installation; if Joomla allows that – ask it to install with utf8 support in DB from the very beginning. If Joomla doesn’t support that – I would suggest that you try Drupal 6.x, it might work perfectly out of the box for you.
3. the errors you list look like you are trying to feed the PHP script to MySQL…. are you accessing the script with your browser, or just feeding it verbatim to the mysql?
February 12th, 2009 at 2:46
You just saved me SOOO much time! Thank you!
April 16th, 2009 at 20:56
Hi Bogdan,
thanks for the script very useful…
however I found on some fields that are indexed I get an #1283 error, I’m not an sql guru so was wondering if you could add a check to see if the field is indexed and ommit it or make another list of such fields, so one can drop the index of those fields convert it then re-add the index. thanks again.
April 16th, 2009 at 21:14
Looking at the explanation of error #1283:
Error: 1283 SQLSTATE: HY000 (ER_BAD_FT_COLUMN)
Message: Column ‘%s’ cannot be part of FULLTEXT index
it looks like the problem might be with those specific columns, not with the change of collation.
May 13th, 2009 at 11:42
What if you have to alter the database?
ALTER DATABASE `dbname` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci
Should you first create a new DB and copy the tables to it in this new format?
May 14th, 2009 at 22:05
database alteration shouldn’t have any effect in newer versions of MySQL (I think since 5.x), because individual column collation settings override database-level collation specification.
I hope I understood your question correctly.
May 15th, 2009 at 3:45
Ohhh well thats excellent then. Thank you for your response =)
Wouldn’t any new tables being created would still get created with the old collation if the database wasn’t altered wouldnt it?
I also once read somewhere that you have to create a completely new database for it to work properly but perhaps that was only versions of MySQL below 5.x
May 15th, 2009 at 18:26
I guess any new tables created inherit the database’s collation settings, unless those new tables have their own specific collation settings specified. So if you are working on a database structure “on the fly/as you work”, you may want to specify the exact collation you want by default as database’s collation, so that any new tables you create have that collation and not the default latin1_swedish.
May 19th, 2009 at 14:47
I guess the source of these problem is that ALL columns in fulltext index should have same collation and error appears when we change collation of one of the columns contained in full text index.
The only solution that I know is to drop index , change collation and recreate index again. But it would be nice if somebody can write the script autоmating this …
May 27th, 2009 at 14:21
Great script. Thanks a lot.
// Does the field default to null, a string, or nothing?
if ( $row['Null'] == 'YES' && $row['Default'] === NULL )
$default = " DEFAULT NULL";
elseif ( $row['Default'] != '' )
$default = " DEFAULT '".mysql_real_escape_string($row['Default'])."'";
else
$default = '';
will fix null problem
July 10th, 2009 at 20:31
Awesome script, thanks. Especially the very smart “output only” aspect of the script so one can review the lines of SQL.
I have one tiny thing I changed which was a CLI switch. It does not print “< pre >” unless run over HTTP (Line 38):
if ($_SERVER['HTTP_HOST']){
print ‘< pre >‘;
}
An alternative to this is to just output at “Content-type: text/plain”. That will tell the browser to print newlines.
Seeya…
July 11th, 2009 at 0:50
JDS,
I’ve added text/plain header, that makes this script cleaner indeed.
August 31st, 2009 at 8:08
hi all,
I have a simple way:
First export DB to file .sql then find “latin1″ and replace all as “utf8″ with your text editor (i’m use text wrangler).
Second, run this script:
ALTER DATABASE `your DB name` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci
Import file .sql for the final step.
That ‘s all, very simple. By this way all old data will by change to UTF-8. Have a nice day.
September 8th, 2009 at 22:31
Hi,
There’s still an error with some fields that say: “NOT NULL DEFAULT NULL”
Here’s a fix:
// Does the field default to null, a string, or nothing?
if ( $row['Default'] === NULL )
$default = $row['Null'] == ‘YES’ ? ” DEFAULT NULL ” : “”;
elseif ( $row['Default'] != ” )
$default = ” DEFAULT ‘”.mysql_real_escape_string($row['Default']).”‘”;
else
$default = ”;
September 9th, 2009 at 10:22
Joaquin,
thank you for contributing, I’ve updated the script.
September 27th, 2009 at 15:03
hi, thx for the script, when i tried, everything work except this error in sql.
SQL query:
ALTER TABLE `node_revisions` CHANGE `body` `body` longtext CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ;
MySQL said:
#1283 – Column ‘body’ cannot be part of FULLTEXT index
February 16th, 2010 at 22:25
take a look at this it may be helpful , it works for me
change-collation-on-all-tables-and-columns-in-mysql
Thanks , good work
June 3rd, 2010 at 3:01
Hi, and THANK YOU SO MUCH!
I also got the problem “NOT NULL DEFAULT NULL” but did a simple find/replace to get the problem fixed!
After that everything worked like a charm
Thanks again!
June 14th, 2010 at 13:52
Thanks for the script. It helped us to quickly solve a problem we had with a database with multiple collations.
June 28th, 2010 at 13:13
Error
SQL query:
ALTER TABLE `agenda_categorie` CHANGE `titre` `titre` varchar( 225 ) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT NULL ;
MySQL said: Documentation
#1067 – Invalid default value for ‘titre’
help plz!!!
June 28th, 2010 at 14:03
I’ve just updated the script – try it and let me know if that works for you.
Hopefully, this is the last fix for this annoying problem.
July 23rd, 2010 at 6:15
Thanks so much!! This was just the script I needed to fix my Drupal website that was having this problem. Found you through the Drupal issue queues. Now my database is much happier.
December 13th, 2010 at 13:33
Dear Bogdan
Many thanks for your useful script, it has done a perfect job in converting my database from latin1_swedish_ci to utf8_general_ci.
I was wondering if the generated SQL could be optimized, by grouping the changes for multiple columns to convert in a given table, into a single ALTER TABLE statement.
The reason I’m asking you this, is that in the DB I just converted there is a large MyISAM table containing a BLOB (1.2 GB data file) as well as 6 varchar columns. Your script generated 7 distinct ALTER TABLE statements (one for the table, plus one for each column to convert), which ran in 8’20, 4’15, 3’52, 4’22, 6’33, 5’10 and 5’11 respectively.
Thanks again.
December 13th, 2010 at 13:58
Yes, it could be done. Looks like the only change needed is merging and comma-separating all the CHANGE statements (according to mysql refman).
I’ll consider implementing this performance improvement soon.
December 26th, 2010 at 13:07
Indeed, a great script, STILL, it didn’t work for me
I am using MODx CMS and needed such a script to alter my collation to utf8
The script ran pretty well and here is a short snippet I got on my screen:
ALTER TABLE `sitename_access_actiondom` DEFAULT CHARACTER SET utf8;
ALTER TABLE `sitename_access_actiondom` CHANGE `target` `target` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ;
ALTER TABLE `sitename_access_actiondom` CHANGE `principal_class` `principal_class` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'modPrincipal';
ALTER TABLE `sitename_access_actions` DEFAULT CHARACTER SET utf8;
ALTER TABLE `sitename_access_actions` CHANGE `target` `target` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ;
ALTER TABLE `sitename_access_actions` CHANGE `principal_class` `principal_class` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'modPrincipal';
.............
I checked my db in phpMyAdmin and still, all fields' collation have 'latin1_swedish_ci'
What am i missing here?
December 26th, 2010 at 15:33
Sherlock,
the script does not modify the database – this is a precaution.
You have to run these queries yourself.
This is to ensure that you take full responsibility for your actions, and have all the backups you need.
You may use phpmyadmin to run those queries.
February 16th, 2011 at 12:57
Wonderful, huge time saver – thank you!
David
March 19th, 2011 at 16:28
[...] I also found the following scripts that might be handy: http://www.phpwact.org/php/i18n/utf-8/mysql http://bogdan.org.ua/2008/02/08/convert-mysql-database-from-one-encodingcollation-into-another.html This entry was posted in Scripts and tagged charset, collation, convert, mysql, php, script. [...]
April 13th, 2011 at 17:46
Is it possible to convert with this script from utf8_general_ci to utf8_unicode_ci?
April 13th, 2011 at 23:41
Lenny23: yes. Just set $convert_to = ‘utf8_unicode_ci’.
May 5th, 2011 at 8:39
hi Bogdan
hi evrry one
please just tell me how to use the script !
copy the text to were ?
please tell me
realy i dont know how to use it
thank you
May 5th, 2011 at 23:38
@Sofian,
copy-paste the script into a PHP file on your web-hosting/server (e.g. “convert.php”).
Then access the script with your browser (http://your-domain-here/path-to/convert.php), follow the instruction for DB backup and code line removal.
You will get a set of SQL statements which you can then copy-paste into your DB admin tool (most commonly phpMyAdmin).
Alternatively, you can run this script from the command line (e.g. using php-cli) directly on the server.
May 6th, 2011 at 8:07
sorry
it not work for me .
i get download the file .
and wen paste it , the numbers paste too , i clear them .
i want to ask you please :
i have afield on my database that not have colaction , and it wan’t to change to utf8 (by phpmyadmin ) , is your script force it to change ?
thank you
May 6th, 2011 at 15:10
If there is no Collation for a field, then there is nothing to change – this script will do nothing as well.
May 7th, 2011 at 10:47
ok thank you verry much bogdan
July 6th, 2011 at 21:01
Please take a look at http://stackoverflow.com/questions/6600534/easiest-way-to-repair-charset-and-collation-on-a-mysql-database because this does include an answer that really works. You current script doesnt work in all cases and even when it does it sets the charset and collation for each column and this is not a good practice, instead it should only set it at database level and reset table and column to defaults (meaning database value). Also the collation you setup is not the recommended one.
July 27th, 2015 at 22:36
Your script is still helpful more than 7 years later!!! My shopping cart provider had a bug in their UTF 8 conversion script, which was part of the upgrade process to a newer version of their cart. Their script had the same NOT NULL DEFAULT NULL problem that gave you such a hard time. You’ve provided a marvelous service in making this script available and modifying it based on community feedback!
July 28th, 2015 at 0:00
I’m glad it is useful
By the way, technically the latest script version is _only_ about 5 years old – but yeah, that’s surprisingly long anyway in the modern age of permanent change
May 9th, 2019 at 20:40
Updated for mysqli:
query(" SHOW TABLES ") or die($mysqli->error);
while ($row_tables = $rs_tables->fetch_row()) {
$table = $mysqli->real_escape_string($row_tables[0]);
// Alter table collation
// ALTER TABLE `account` DEFAULT CHARACTER SET utf8
if ($show_alter_table)
echo("ALTER TABLE `$table` DEFAULT CHARACTER SET $character_set;\n");
$rs = $mysqli->query(" SHOW FULL FIELDS FROM `$table` ") or die($mysqli->error);
while ($row = $rs->fetch_assoc()) {
if ($row['Collation'] == '' || $row['Collation'] == $convert_to)
continue;
// Is the field allowed to be null?
if ($row['Null'] == 'YES')
$nullable = ' NULL ';
else
$nullable = ' NOT NULL ';
// Does the field default to null, a string, or nothing?
if ($row['Default'] === null && $row['Null'] == 'YES')
$default = " DEFAULT NULL ";
elseif ($row['Default'] != '')
$default = " DEFAULT '" . $mysqli->real_escape_string($row['Default']) . "'";
else
$default = '';
// sanity check and fix:
if ($nullable == ' NOT NULL ' && $default == ' DEFAULT NULL ') {
$default = '';
echo "/* Warning: wrong combination of 'default value' and 'NULL-flag' detected - and fixed! */\n";
echo "/* Diagnostics: row[Null] = '$row[Null]', row[Default] = " . $mysqli->real_escape_string($row['Default']) . ", MySQL version: " . $mysqli->get_server_info() . " */\n";
}
// Don't alter INT columns: no collations, and altering them drops autoincrement values
if (strpos($row['Type'], 'int') !== false) {
$show_alter_field = false;
} else {
$show_alter_field = true;
}
// Alter field collation:
// ALTER TABLE `tab` CHANGE `field` `field` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
if ($show_alter_field) {
$field = $mysqli->real_escape_string($row['Field']);
echo "ALTER TABLE `$table` CHANGE `$field` `$field` $row[Type] CHARACTER SET $character_set COLLATE $convert_to $nullable $default;\n";
}
}
}