Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    • Archives

    • Recent comments

    Convert MySQL database from one encoding/collation into another

    8th February 2008

    Most frequent use: convert database from latin1_swedish to utf8_general_ci.
    Original script found at: MySQL and UTF-8.

    Update: the original script had an error, it would generate queries likes this one (note the bold part):

    ALTER TABLE `links` CHANGE `link_rel` `link_rel` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT NULL;

    This is clearly wrong syntax (and logic). I fixed this by making comparison to NULL strict (with three equal signs instead of two):

    // Does the field default to null, a string, or nothing?
    if ($row['Default'] === NULL)

    Update 2: based on comment by banesto, I modified the script; now it does not require specifying the from_collation, it’s sufficient to specify to_collation (which will be used for all the fields and tables). The modified code is:

    if ($row['Collation'] == ” || $row['Collation'] == $convert_to)
    continue;

    Update 3: the long-lasting, re-appearing NOT NULL DEFAULT NULL problem is finally fixed.

    Update 4: incorporated Russ’s fix to skip numeric fields (in order to leave autoincrement values intact).

    Here’s the script itself: (to copy-paste: first click the “Plain text” header)

    1. <?php
    2. // Original script (v1.0) by/from: http://www.phpwact.org/php/i18n/utf-8/mysql
    3. // Improved/modified (v1.05) by Bogdan http://bogdan.org.ua/
    4. // Last updated: 2010-06-28
    5.  
    6. // this script will output all queries needed to change all fields/tables to a different collation
    7. // it is HIGHLY suggested you make a MySQL backup prior to running any of the generated queries
    8.  
    9. // this code is provided AS IS and without any warranty
    10.  
    11. // add text/plain header when used not as a CLI script; PHP CLI SAPI shouldn't output headers
    12. header("Content-Type: text/plain");
    13.  
    14. // precaution
    15. die("Make a backup of your MySQL database, then remove this line from the code!");
    16.  
    17. set_time_limit(0);
    18.  
    19. // collation you want to change to:
    20. $convert_to   = 'utf8_general_ci';
    21.  
    22. // character set of new collation:
    23. $character_set= 'utf8';
    24.  
    25. // DB login information - *modify before use*
    26. $username = 'user';
    27. $password = 'pass';
    28. $database = 'database_name';
    29. $host     = 'localhost';
    30.  
    31. //-- usually, there is nothing to modify below this line --//
    32.  
    33. // show TABLE alteration queries?
    34. $show_alter_table = true;
    35. // show FIELD alteration queries?
    36. $show_alter_field = true;
    37.  
    38. mysql_connect($host, $username, $password);
    39. mysql_select_db($database);
    40.  
    41. $rs_tables = mysql_query(" SHOW TABLES ") or die(mysql_error());
    42.  
    43. while ($row_tables = mysql_fetch_row($rs_tables)) {
    44.     $table = mysql_real_escape_string($row_tables&#91;0&#93;);
    45.    
    46.     // Alter table collation
    47.     // ALTER TABLE `account` DEFAULT CHARACTER SET utf8
    48.     if ($show_alter_table)
    49.         echo("ALTER TABLE `$table` DEFAULT CHARACTER SET $character_set;\n");
    50.  
    51.     $rs = mysql_query(" SHOW FULL FIELDS FROM `$table` ") or die(mysql_error());
    52.  
    53.     while ( $row = mysql_fetch_assoc($rs) ) {
    54.        
    55.         if ( $row&#91;'Collation'&#93; == '' || $row&#91;'Collation'&#93; == $convert_to )
    56.            continue;
    57.  
    58.         // Is the field allowed to be null?
    59.         if ( $row&#91;'Null'&#93; == 'YES' )
    60.            $nullable = ' NULL ';
    61.         else
    62.             $nullable = ' NOT NULL ';
    63.  
    64.         // Does the field default to null, a string, or nothing?
    65.         if ( $row&#91;'Default'&#93; === NULL && $row&#91;'Null'&#93; == 'YES' )
    66.            $default = " DEFAULT NULL ";
    67.         elseif ( $row&#91;'Default'&#93; != '' )
    68.            $default = " DEFAULT '".mysql_real_escape_string($row&#91;'Default'&#93;)."'";
    69.        else
    70.             $default = '';
    71.  
    72.         // sanity check and fix:
    73.         if ($nullable == ' NOT NULL ' && $default == ' DEFAULT NULL ') {
    74.             $default = '';
    75.             echo "/* Warning: wrong combination of 'default value' and 'NULL-flag' detected - and fixed! */\n";
    76.             echo "/* Diagnostics: row&#91;Null&#93; = '$row&#91;Null&#93;', row&#91;Default&#93; = " . mysql_real_escape_string($row&#91;'Default'&#93;) . ", MySQL version: " . mysql_get_server_info() . " */\n";
    77.        }
    78.  
    79.         // Don't alter INT columns: no collations, and altering them drops autoincrement values
    80.         if (strpos($row&#91;'Type'&#93;, 'int') !== false) {
    81.            $show_alter_field = False;
    82.         }
    83.         else {
    84.             $show_alter_field = True;
    85.         }
    86.  
    87.         // Alter field collation:
    88.         // ALTER TABLE `tab` CHANGE `field` `field` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
    89.         if ($show_alter_field) {
    90.             $field = mysql_real_escape_string($row&#91;'Field'&#93;);
    91.            echo "ALTER TABLE `$table` CHANGE `$field` `$field` $row&#91;Type&#93; CHARACTER SET $character_set COLLATE $convert_to $nullable $default;\n";
    92.         }
    93.     }
    94. }
    95. ?>
    Share

    60 Responses to “Convert MySQL database from one encoding/collation into another”

    1. banesto Says:

      BIG thanks for this script. i updated it a bit and it was just priceless. saved several hours of my life :)

    2. banesto Says:

      there are several things:
      1) before setting new character set all fields have to be changed to binary equivalents
      2) it’s not so good to depend on “convert_from”, because db can contain many collations and script will take effect only in one of them. i think it’s more convenient to check if ($row['Collation']!=”)

    3. Bogdan Says:

      Banesto,

      1) why convert to binary?
      2) do you think it’s a good optional solution to just provide the configuration for “target” character set, and let the script auto-detect the original set? that sounds good to me…

    4. banesto Says:

      1) because if i leave original type all data gets corrupted
      2) because there’s usually need to get one collation for whole db and not to separate them. and it requires me to know what is the original collation in db is right now.

    5. Bogdan Says:

      Banesto,

      I changed the script a little bit by removing the source_collation and adding the check of row[collation] as you suggested.

      As for the conversion to binary types – I didn’t do that, in part because I’ve no idea on what are the best binary equivalents to all the column types in MySQL, and in part due to the lack of desire to dig into that :)

    6. Tim Says:

      So, I made a php file and put it in my root folder then loaded it in my browser. It output a page like:
      “ALTER TABLE `access` DEFAULT CHARACTER SET utf8;
      ALTER TABLE `access` CHANGE `mask` `mask` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ; ”
      etc. It shows this for each table and field.

      But when I actually look at my database nothing has been changed. It’s the same encoding/collation as it was to begin with.

      What am I doing wrong?

    7. Bogdan Says:

      Tim,

      the script itself doesn’t modify the database – it only produces all the necessary queries.

      You have to (manually) copy-paste all the generated queries into the “SQL” text field of your favourite MySQL administration/management software, and “Run” (or how the execute command is named in your software). E.g. in PhpMyAdmin you’d go to the database structure page, then choose the SQL tab, paste all the queries, and run them.

    8. Tim Says:

      Thanks,
      It seems to be working. I get errors like this:
      “SQL query:

      ALTER TABLE `comments` CHANGE `thread` `thread` varchar( 255 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT NULL ;

      MySQL said: Documentation
      #1067 – Invalid default value for ‘thread’ ”

      then I change that one manually and rerun the remainder of the queries that start after the erroneous line and repeat.

    9. Raj Says:

      hello
      my problem is i have taken backup from mysql>data folder -complete folder of a database
      and now copy it to different pc
      means
      copy from 1PC >Mysql>Data > and copy to 2ndPC >mysql>data

      now when i am browsing it from phpmyadmin , its shwoing the database and table but when i am browsing table its saying table doesnt exist.

      Collation type show in use ?

      Do you have any idea , which can help me ?

      Raj

    10. Bogdan Says:

      Raj,

      this isn’t MySQL support forum, and your issue doesn’t seem to be collation-related.

      However, I would recommend that you use specific tools for MySQL DB export/import, and not just copy data folders.

      If you do copy files – check that MySQL versions are the same, configuration is also the same, and try running the MySQL’s “repair” query on all the “copied” files. Might help, but still not a good way.

    11. Andrea Says:

      This is probably a big ol’ newbie problem, but the script doesn’t work for me: I’m getting “access denied” errors.

      Specifically, I cut and pasted the script into a file, backed up the database and removed that line, and then uploaded the script to the top level of my Drupal install and went to that URL in Firefox, where I got the following errors:


      Warning: mysql_connect() [function.mysql-connect]: Access denied for user ‘user’@'localhost’ (using password: YES) in /home/manatee/public_html/acip/collation.php on line 21

      Warning: mysql_select_db() [function.mysql-select-db]: Access denied for user ‘manatee’@'localhost’ (using password: NO) in /home/manatee/public_html/acip/collation.php on line 22

      Warning: mysql_select_db() [function.mysql-select-db]: A link to the server could not be established in /home/manatee/public_html/acip/collation.php on line 22

      Warning: mysql_query() [function.mysql-query]: Access denied for user ‘manatee’@'localhost’ (using password: NO) in /home/manatee/public_html/acip/collation.php on line 23

      Warning: mysql_query() [function.mysql-query]: A link to the server could not be established in /home/manatee/public_html/acip/collation.php on line 23
      Access denied for user ‘manatee’@'localhost’ (using password: NO)

      Am I missing something obvious?

      Thanks,
      Andrea

    12. Bogdan Says:

      Andrea,

      please triple-check these configuration values:
      [PHP]$username = ‘user’; $password = ‘pass’; $database = ‘table’; $host = ‘localhost’;[/PHP]

    13. Martin Westin Says:

      Hi Bogdan,
      nice script! Looks like a life-saver.

      If you should decide to dive into the whole data-conversion aspect of this, here is a little cheat sheet of how I have converted data before.

      char, varchar and text -> blob
      mediumtext -> mediumblob
      longtext -> longblob
      (I have not used tinyblob but it might be ok for char and varchar)

      There are two ways that the existing data may need conversion. This is something that the user must know and decide. I am not aware of any way to detect it, except by the errors if you do it wrong :)

      You would think that an existing latin1 database would have latin1 data in it. This is what scenario shows.
      Common though is that you can have php set up to use utf8 but the database remains in latin1. Any existing data will then me utf8 squeezed into a latin1 table. This needs a slightly different conversion as in the second scenario.

      # case where data was enterd correctly as latin1 and later needs conversion to utf8
      SET NAMES latin1;
      drop table people;
      CREATE TABLE people (
      id int(11) unsigned NOT NULL auto_increment,
      name varchar(64) default NULL,
      PRIMARY KEY (`id`)
      ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
      INSERT INTO people VALUES
      (1,’user_a’),
      (2,’user_Ã¥’),
      (3,’user_ä’);
      select * from people;
      SET NAMES utf8;
      alter table people engine=MyISAM default character set utf8;
      ALTER TABLE people CHANGE name name BLOB;
      ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET utf8;
      select * from people;

      # case where utf8 data was enterd into a table with latin1 and later needs conversion to utf8
      SET NAMES latin1;
      drop table people;
      CREATE TABLE people (
      id int(11) unsigned NOT NULL auto_increment,
      name varchar(64) default NULL,
      PRIMARY KEY (`id`)
      ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
      SET NAMES utf8;
      INSERT INTO people VALUES
      (1,’user_a’),
      (2,’user_Ã¥’),
      (3,’user_ä’);
      select * from people;
      alter table people engine=MyISAM default character set utf8;
      ALTER TABLE people CHANGE name name BLOB;
      ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET latin1;
      ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET utf8;
      select * from people;

      I’ll get back to you if I manage to implement something like this in your script.

    14. Martin Westin Says:

      Hi again,
      I got a bit carried away and did some tweaks to your script to support the two data conversions described above.
      There may be cases where it does not work but it works when I have tested it on a bunch of conversions.

      in the configuration at the top:
      // conversion of data for either normal data or “messed up” data
      // either leave empty or choose convert or rescue
      $data_conversion = ”;

      and then at the bottom in the field conversion part.

      // Alter field collation:
      // ALTER TABLE `tab` CHANGE `fiel` `fiel` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
      if ($show_alter_field) {
      $field = mysql_real_escape_string($row['Field']);

      // charset conversion table for data conversions
      $types = array(
      ‘default’=>’blob’,
      ‘mediumtext’ => ‘mediumblob’,
      ‘longtext’ => ‘longblob’
      );
      if ( !empty($data_conversion) && strpos($row['Collation'], $character_set) !== 0 ) {
      if ( array_key_exists($row['Type'],$types) ) {
      $binary = $types[$row['Type']];
      } else {
      $binary = $types['default'];
      }
      echo “ALTER TABLE `$table` CHANGE `$field` `$field` $binary $nullable $default; \r\n”;
      if ( $data_conversion == ‘rescue’ ) {
      $old_charset = substr($row['Collation'],0,strpos($row['Collation'],’_'));
      echo “ALTER TABLE `$table` CHANGE `$field` `$field` $row[Type] CHARACTER SET $old_charset $nullable $default; \r\n”;
      }
      }

      echo “ALTER TABLE `$table` CHANGE `$field` `$field` $row[Type] CHARACTER SET $character_set COLLATE $convert_to $nullable $default; \r\n”;
      }

    15. Russ Michaels Says:

      I found another bug. Your script will actually remove the “auto_increment” attribute.
      I didn’t know how to read this attribute from the current row to put it back, so I worked round it by modding the script to ignore numeric fields as these do not have a collation anyway.

    16. Bogdan Says:

      Dear Martin and Russ,

      thank you for valuable feedback!

      With your permission, I’ll modify the script to take into account your code and suggestions.

      (I’m at work right now, so most probably will update the script code late tonight.)

      Update: still not done, but now in the process of updating code.

    17. Bogdan Says:

      Martin,

      why would you need these conversions?

      char, varchar and text -> blob
      mediumtext -> mediumblob
      longtext -> longblob

      blobs are for binary data, so why put the text into binary fields?

    18. Martin Westin Says:

      When you change the encoding, any existing data will be missinterpreted whenever the charsets differ. Converting the field via a binary type will allow you to trick MySQL into reinterpreting the “meaning” of the data.
      This will not affect A-Z but other (accented) characters.

    19. Martin Westin Says:

      Sorry. My last post was a bit crippled from being written on my phone in the car.

      There are probably many people that can explain the details of why you do it this way. But what I know is that if you can to “convert” existing data from one Character set to another in MySQL, you do it by making the field a binary (and thus tricking MySQL to reinterpret the raw data stored when you bring it back into a text field in another character set).

      I have implemented two such conversions for two common situations. But you can play around with this quite a bit if you have some really messed up data.

    20. Russ Michaels Says:

      Here is my version of the script if it is of any use to anyone, I made a couple of mods to get around the NULL issues and the auto_increment issue I mentioned before.

      <?php
      // original script (v1.0) by/from: http://www.phpwact.org/php/i18n/utf-8/mysql
      // improved/modified (v1.02) by Bogdan Tokovenko http://bogdan.org.ua/
      // this script will output all queries needed to change all fields/tables to a different collation
      // it is HIGHLY suggested you take a MySQL dump prior to running any of the generated queries
      // this code is provided AS IS and without any warranty
      set_time_limit(0);

      // collation you want to change to:
      $convert_to = ‘utf8_general_ci’;
      // character set of new collation:
      $character_set= ‘utf8′;
      $show_alter_table = true;
      $show_alter_field = true;
      // DB login information – modify before use
      $username = ‘username’;
      $password = ‘password’;
      $database = ‘database_name’;
      $host = ‘localhost’;
      //– nothing to modify below this line –//
      mysql_connect($host, $username, $password);
      mysql_select_db($database);
      $rs_tables = mysql_query(” SHOW TABLES “) or die(mysql_error());
      print ”;
      while ($row_tables = mysql_fetch_row($rs_tables)) {
      $table = mysql_real_escape_string($row_tables[0]);
      // Alter table collation
      // ALTER TABLE `account` DEFAULT CHARACTER SET utf8
      if ($show_alter_table)
      echo(“ALTER TABLE `$table` DEFAULT CHARACTER SET $character_set;\r\n”);
      $rs = mysql_query(” SHOW FULL FIELDS FROM `$table` “) or die(mysql_error());
      while ($row=mysql_fetch_assoc($rs)) {
      if ($row['Collation'] == ” || $row['Collation'] == $convert_to)
      continue;
      // Is the field allowed to be null?
      if ( $row['Null'] == ‘YES’ )
      $nullable = ‘ NULL ‘;
      else
      $nullable = ‘ NOT NULL’;
      // Does the field default to null, a string, or nothing?
      if ( $row['Default'] === NULL )
      $default = ” DEFAULT NULL”;
      else if ($row['Default']!=”)
      $default = ” DEFAULT ‘”.mysql_real_escape_string($row['Default']).”‘”;
      else
      $default = ”;
      //don’t alter int columns as they do not have a collation anyway and it buggers up the extra parameters
      if (strpos($row[Type], ‘int’) !== false)
      $show_alter_field = false;
      else
      $show_alter_field = true;
      // Alter field collation:
      // ALTER TABLE `tab` CHANGE `fiel` `fiel` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
      if ($show_alter_field) {
      $field = mysql_real_escape_string($row['Field']);
      echo “ALTER TABLE `$table` modify `$field` $row[Type] COLLATE $convert_to ; \r\n”;
      }
      }
      }
      ?>

    21. paulo Says:

      Is anyone still following this string? Have searched high and low for a solution and this might be it. Would like to build a simple website with Joomla 1.5.9 adding a few components such as blogs, forums, profile. The problem I am encountering is what you have been discussing here. I’ve loaded everything tested, but my East Asian fonts come out as ?????. I’ve converted the database and table to utf8_general_ci, but can’t get the columns with a manual for each individual column using the ALTER command. That would take days and I think a problem with the binary…. If it makes a difference, MySQL = 4.1.22, Server Version 5.0.67

      I’ve tried the scriptd by Bogdan and Russ as well as the link supplied by Bogdan at the top. Since I really know nothing about MySQL I do not understand how to fix the error messages I’m getting:

      1. FIRST ERROR at top and down.)

      ERROR: Unknown Punctuation String @ 1
      STR: <?
      SQL: <?php

      // this script will output the queries need to change all fields/tables to a different collation
      // it is HIGHLY suggested you take a MySQL dump prior to running any of the generated
      // this code is provided as is and without any warranty

      set_time_limit(0);<?php

      2. SECOND ERROR at bottom)

      SQL query:

      <?php // this script will output the queries need to change all fields/tables to a different collation // it is HIGHLY suggested you take a MySQL dump prior to running any of the generated // this code is provided as is and without any warranty set_time_limit(0);

      MySQL said: Documentation
      #1064 – You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘<?php

      // this script will output the queries need to change all fields/table’ at line 1

    22. Bogdan Says:

      Paulo,

      1. did you really mean to say that your MySQL client is at version 4.1.22, and Server’s Version is 5.0.67? this is weird, and if it is really the case, then it could be causing you miscellaneous troubles.

      2. instead of converting the database, try to set it to utf8_general_ci *before* Joomla installation; if Joomla allows that – ask it to install with utf8 support in DB from the very beginning. If Joomla doesn’t support that – I would suggest that you try Drupal 6.x, it might work perfectly out of the box for you.

      3. the errors you list look like you are trying to feed the PHP script to MySQL…. are you accessing the script with your browser, or just feeding it verbatim to the mysql?

    23. Mike Smullin Says:

      You just saved me SOOO much time! Thank you! :)

    24. jkal Says:

      Hi Bogdan,
      thanks for the script very useful…
      however I found on some fields that are indexed I get an #1283 error, I’m not an sql guru so was wondering if you could add a check to see if the field is indexed and ommit it or make another list of such fields, so one can drop the index of those fields convert it then re-add the index. thanks again.

    25. Bogdan Says:

      Looking at the explanation of error #1283:

      Error: 1283 SQLSTATE: HY000 (ER_BAD_FT_COLUMN)
      Message: Column ‘%s’ cannot be part of FULLTEXT index

      it looks like the problem might be with those specific columns, not with the change of collation.

    26. Ryan Says:

      What if you have to alter the database?

      ALTER DATABASE `dbname` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci

      Should you first create a new DB and copy the tables to it in this new format?

    27. Bogdan Says:

      database alteration shouldn’t have any effect in newer versions of MySQL (I think since 5.x), because individual column collation settings override database-level collation specification.

      I hope I understood your question correctly.

    28. Ryan Says:

      Ohhh well thats excellent then. Thank you for your response =)

      Wouldn’t any new tables being created would still get created with the old collation if the database wasn’t altered wouldnt it?

      I also once read somewhere that you have to create a completely new database for it to work properly but perhaps that was only versions of MySQL below 5.x

    29. Bogdan Says:

      I guess any new tables created inherit the database’s collation settings, unless those new tables have their own specific collation settings specified. So if you are working on a database structure “on the fly/as you work”, you may want to specify the exact collation you want by default as database’s collation, so that any new tables you create have that collation and not the default latin1_swedish.

    30. Max Says:

      Looking at the explanation of error #1283:

      Error: 1283 SQLSTATE: HY000 (ER_BAD_FT_COLUMN)
      Message: Column ‘%s’ cannot be part of FULLTEXT index

      it looks like the problem might be with those specific columns, not with the change of collation.

      I guess the source of these problem is that ALL columns in fulltext index should have same collation and error appears when we change collation of one of the columns contained in full text index.
      The only solution that I know is to drop index , change collation and recreate index again. But it would be nice if somebody can write the script autоmating this …

    31. Szymon Błąkała Says:

      Great script. Thanks a lot.


      // Does the field default to null, a string, or nothing?
      if ( $row['Null'] == 'YES' && $row['Default'] === NULL )
      $default = " DEFAULT NULL";
      elseif ( $row['Default'] != '' )
      $default = " DEFAULT '".mysql_real_escape_string($row['Default'])."'";
      else
      $default = '';

      will fix null problem

    32. JDS Says:

      Awesome script, thanks. Especially the very smart “output only” aspect of the script so one can review the lines of SQL.

      I have one tiny thing I changed which was a CLI switch. It does not print “< pre >” unless run over HTTP (Line 38):

      if ($_SERVER['HTTP_HOST']){
      print ‘< pre >‘;
      }

      An alternative to this is to just output at “Content-type: text/plain”. That will tell the browser to print newlines.

      Seeya…

    33. Bogdan Says:

      JDS,

      I’ve added text/plain header, that makes this script cleaner indeed.

    34. Dax Says:

      hi all,

      I have a simple way:

      First export DB to file .sql then find “latin1″ and replace all as “utf8″ with your text editor (i’m use text wrangler).
      Second, run this script:

      ALTER DATABASE `your DB name` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci

      Import file .sql for the final step.

      That ‘s all, very simple. By this way all old data will by change to UTF-8. Have a nice day.

    35. Joaquin Says:

      Hi,

      There’s still an error with some fields that say: “NOT NULL DEFAULT NULL”

      Here’s a fix:

      // Does the field default to null, a string, or nothing?
      if ( $row['Default'] === NULL )
      $default = $row['Null'] == ‘YES’ ? ” DEFAULT NULL ” : “”;
      elseif ( $row['Default'] != ” )
      $default = ” DEFAULT ‘”.mysql_real_escape_string($row['Default']).”‘”;
      else
      $default = ”;

    36. Bogdan Says:

      Joaquin,

      thank you for contributing, I’ve updated the script.

    37. banu Says:

      hi, thx for the script, when i tried, everything work except this error in sql.

      SQL query:

      ALTER TABLE `node_revisions` CHANGE `body` `body` longtext CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ;

      MySQL said:

      #1283 – Column ‘body’ cannot be part of FULLTEXT index

    38. viroshunt Says:

      take a look at this it may be helpful , it works for me

      change-collation-on-all-tables-and-columns-in-mysql

      Thanks , good work :)

    39. David Says:

      Hi, and THANK YOU SO MUCH!

      I also got the problem “NOT NULL DEFAULT NULL” but did a simple find/replace to get the problem fixed!

      After that everything worked like a charm :)

      Thanks again!

    40. Peter Murray Says:

      Thanks for the script. It helped us to quickly solve a problem we had with a database with multiple collations.

    41. antoh Says:

      Error

      SQL query:

      ALTER TABLE `agenda_categorie` CHANGE `titre` `titre` varchar( 225 ) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT NULL ;

      MySQL said: Documentation
      #1067 – Invalid default value for ‘titre’

      help plz!!!

    42. Bogdan Says:

      I’ve just updated the script – try it and let me know if that works for you.

      Hopefully, this is the last fix for this annoying problem.

    43. Kawika Says:

      Thanks so much!! This was just the script I needed to fix my Drupal website that was having this problem. Found you through the Drupal issue queues. Now my database is much happier.

    44. Damien Says:

      Dear Bogdan

      Many thanks for your useful script, it has done a perfect job in converting my database from latin1_swedish_ci to utf8_general_ci.

      I was wondering if the generated SQL could be optimized, by grouping the changes for multiple columns to convert in a given table, into a single ALTER TABLE statement.

      The reason I’m asking you this, is that in the DB I just converted there is a large MyISAM table containing a BLOB (1.2 GB data file) as well as 6 varchar columns. Your script generated 7 distinct ALTER TABLE statements (one for the table, plus one for each column to convert), which ran in 8’20, 4’15, 3’52, 4’22, 6’33, 5’10 and 5’11 respectively.

      Thanks again.

    45. Bogdan Says:

      I was wondering if the generated SQL could be optimized, by grouping the changes for multiple columns to convert in a given table, into a single ALTER TABLE statement.

      Yes, it could be done. Looks like the only change needed is merging and comma-separating all the CHANGE statements (according to mysql refman).

      I’ll consider implementing this performance improvement soon.

    46. sherlock Says:

      Indeed, a great script, STILL, it didn’t work for me :(

      I am using MODx CMS and needed such a script to alter my collation to utf8

      The script ran pretty well and here is a short snippet I got on my screen:

      ALTER TABLE `sitename_access_actiondom` DEFAULT CHARACTER SET utf8;
      ALTER TABLE `sitename_access_actiondom` CHANGE `target` `target` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ;
      ALTER TABLE `sitename_access_actiondom` CHANGE `principal_class` `principal_class` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'modPrincipal';
      ALTER TABLE `sitename_access_actions` DEFAULT CHARACTER SET utf8;
      ALTER TABLE `sitename_access_actions` CHANGE `target` `target` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ;
      ALTER TABLE `sitename_access_actions` CHANGE `principal_class` `principal_class` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT 'modPrincipal';
      .............

      I checked my db in phpMyAdmin and still, all fields' collation have 'latin1_swedish_ci'

      What am i missing here?

    47. Bogdan Says:

      Sherlock,

      the script does not modify the database – this is a precaution.
      You have to run these queries yourself.
      This is to ensure that you take full responsibility for your actions, and have all the backups you need.

      You may use phpmyadmin to run those queries.

    48. David Madl Says:

      Wonderful, huge time saver – thank you!

      David

    49. PHP script to convert mysql collation | PETROS KARIPIDIS Says:

      [...] I also found the following scripts that might be handy: http://www.phpwact.org/php/i18n/utf-8/mysql http://bogdan.org.ua/2008/02/08/convert-mysql-database-from-one-encodingcollation-into-another.html This entry was posted in Scripts and tagged charset, collation, convert, mysql, php, script. [...]

    50. Lenny23 Says:

      Is it possible to convert with this script from utf8_general_ci to utf8_unicode_ci?

    51. Bogdan Says:

      Lenny23: yes. Just set $convert_to = ‘utf8_unicode_ci’.

    52. sofian Says:

      hi Bogdan
      hi evrry one
      please just tell me how to use the script !
      copy the text to were ?
      please tell me
      realy i dont know how to use it

      thank you

    53. Bogdan Says:

      @Sofian,

      copy-paste the script into a PHP file on your web-hosting/server (e.g. “convert.php”).
      Then access the script with your browser (http://your-domain-here/path-to/convert.php), follow the instruction for DB backup and code line removal.
      You will get a set of SQL statements which you can then copy-paste into your DB admin tool (most commonly phpMyAdmin).

      Alternatively, you can run this script from the command line (e.g. using php-cli) directly on the server.

    54. sofian Says:

      sorry
      it not work for me .
      i get download the file .
      and wen paste it , the numbers paste too , i clear them .

      i want to ask you please :
      i have afield on my database that not have colaction , and it wan’t to change to utf8 (by phpmyadmin ) , is your script force it to change ?

      thank you

    55. Bogdan Says:

      to copy-paste: first click the “Plain text” header”

      If there is no Collation for a field, then there is nothing to change – this script will do nothing as well.

    56. sofian Says:

      ok thank you verry much bogdan

    57. sorin Says:

      Please take a look at http://stackoverflow.com/questions/6600534/easiest-way-to-repair-charset-and-collation-on-a-mysql-database because this does include an answer that really works. You current script doesnt work in all cases and even when it does it sets the charset and collation for each column and this is not a good practice, instead it should only set it at database level and reset table and column to defaults (meaning database value). Also the collation you setup is not the recommended one.

    58. Michael Says:

      Your script is still helpful more than 7 years later!!! My shopping cart provider had a bug in their UTF 8 conversion script, which was part of the upgrade process to a newer version of their cart. Their script had the same NOT NULL DEFAULT NULL problem that gave you such a hard time. You’ve provided a marvelous service in making this script available and modifying it based on community feedback!

    59. Bogdan Says:

      I’m glad it is useful :)

      By the way, technically the latest script version is _only_ :D about 5 years old – but yeah, that’s surprisingly long anyway in the modern age of permanent change ;)

    60. SY Says:

      Updated for mysqli:


      query(" SHOW TABLES ") or die($mysqli->error);

      while ($row_tables = $rs_tables->fetch_row()) {
      $table = $mysqli->real_escape_string($row_tables[0]);

      // Alter table collation
      // ALTER TABLE `account` DEFAULT CHARACTER SET utf8
      if ($show_alter_table)
      echo("ALTER TABLE `$table` DEFAULT CHARACTER SET $character_set;\n");

      $rs = $mysqli->query(" SHOW FULL FIELDS FROM `$table` ") or die($mysqli->error);

      while ($row = $rs->fetch_assoc()) {

      if ($row['Collation'] == '' || $row['Collation'] == $convert_to)
      continue;

      // Is the field allowed to be null?
      if ($row['Null'] == 'YES')
      $nullable = ' NULL ';
      else
      $nullable = ' NOT NULL ';

      // Does the field default to null, a string, or nothing?
      if ($row['Default'] === null && $row['Null'] == 'YES')
      $default = " DEFAULT NULL ";
      elseif ($row['Default'] != '')
      $default = " DEFAULT '" . $mysqli->real_escape_string($row['Default']) . "'";
      else
      $default = '';

      // sanity check and fix:
      if ($nullable == ' NOT NULL ' && $default == ' DEFAULT NULL ') {
      $default = '';
      echo "/* Warning: wrong combination of 'default value' and 'NULL-flag' detected - and fixed! */\n";
      echo "/* Diagnostics: row[Null] = '$row[Null]', row[Default] = " . $mysqli->real_escape_string($row['Default']) . ", MySQL version: " . $mysqli->get_server_info() . " */\n";
      }

      // Don't alter INT columns: no collations, and altering them drops autoincrement values
      if (strpos($row['Type'], 'int') !== false) {
      $show_alter_field = false;
      } else {
      $show_alter_field = true;
      }

      // Alter field collation:
      // ALTER TABLE `tab` CHANGE `field` `field` CHAR( 5 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL
      if ($show_alter_field) {
      $field = $mysqli->real_escape_string($row['Field']);
      echo "ALTER TABLE `$table` CHANGE `$field` `$field` $row[Type] CHARACTER SET $character_set COLLATE $convert_to $nullable $default;\n";
      }
      }
      }

    Leave a Reply

    XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>