Autarchy of the Private Cave

Science, Society, Programming and Hobbies

  • Weird orthology species names in Ensembl

    30th September 2008

    For the COTRASIF tool, I’ve been using the Ensembl Compara database (since release 47) to automatically import gene orthology mappings into COTRASIF.

    However, with the E!50 release, the Compara database was dropped.

    Looking for another way to get orthologs from Ensembl (using martservice, via biomart.org), I tried the standard query: selecting the “Homologs” group on the “Attributes” page for a single-species database, and then selecting the appropriate second species to get orthology mappings.

    Imagine my surprise when, not only in the interface but also in the generated XML file, I found attribute names like “cow_ensembl_gene” :-O

    I only need 11 species at the moment. Setting aside sufficiently unique name mappings like “zebrafish - Danio rerio”, there are a number of questionable ones: “yeast” for S. cerevisiae (could be S. pombe), “rat” for R. norvegicus (could be R. rattus), “anopheles” for A. gambiae (could be some other Anopheles). Other mappings may also be non-unique, especially for people working with different species of the same genus.
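One possible workaround on the client side is to keep an explicit table keyed by binomial name and derive the attribute name from it. The sketch below is only that: the `SPECIES` table and `orthologQueryXml()` are hypothetical names, only the “cow” prefix is confirmed by the attribute quoted above, and the dataset names and the `ensembl_gene_id` attribute are assumptions to check against the mart before use.

```php
<?php
// Hypothetical lookup table: binomial name => BioMart dataset and homolog prefix.
// Only the "cow" prefix is confirmed by the post; every other value here is an
// assumption and should be verified against the mart's attribute list.
const SPECIES = [
    'Bos taurus'        => ['dataset' => 'btaurus_gene_ensembl',     'prefix' => 'cow'],
    'Rattus norvegicus' => ['dataset' => 'rnorvegicus_gene_ensembl', 'prefix' => 'rat'],
    'Danio rerio'       => ['dataset' => 'drerio_gene_ensembl',      'prefix' => 'zebrafish'],
];

// Build a martservice query XML that asks the $from dataset for its gene IDs
// together with the orthologous gene IDs in $to.
function orthologQueryXml(string $from, string $to): string
{
    foreach ([$from, $to] as $name) {
        if (!array_key_exists($name, SPECIES)) {
            throw new InvalidArgumentException("No mapping for species: $name");
        }
    }
    $dataset = SPECIES[$from]['dataset'];
    $homolog = SPECIES[$to]['prefix'] . '_ensembl_gene';

    return '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
        . '<!DOCTYPE Query>' . "\n"
        . '<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1">' . "\n"
        . '  <Dataset name="' . htmlspecialchars($dataset) . '" interface="default">' . "\n"
        . '    <Attribute name="ensembl_gene_id" />' . "\n"
        . '    <Attribute name="' . htmlspecialchars($homolog) . '" />' . "\n"
        . '  </Dataset>' . "\n"
        . '</Query>';
}

// Rat genes with their cow orthologs, selected by scientific name.
echo orthologQueryXml('Rattus norvegicus', 'Bos taurus'), "\n";
```

This does not make the common names go away; it just confines them to one table, so the rest of the import code can refer to species unambiguously and an unknown species fails loudly instead of silently mapping to the wrong one.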

    Am I missing some system in this naming “convention”, or am I the only one who finds it strange?

    Is there a way not to use “common species names” when importing orthology data from Ensembl with the help of martservice?

    Posted in Bioinformatics, Science | No Comments »

    Convert MySQL database from one encoding/collation into another

    8th February 2008

    Most frequent use: convert database from latin1_swedish_ci to utf8_general_ci.
    Original script found at: MySQL and UTF-8.

    Update: the original script had an error: it would generate queries like this one (note the “NOT NULL DEFAULT NULL” part):

    ALTER TABLE `links` CHANGE `link_rel` `link_rel` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT NULL;

    This is clearly wrong syntax (and logic). I fixed this by making the comparison to NULL strict (three equals signs instead of two):

    // Does the field default to null, a string, or nothing?
    if ($row['Default'] === NULL)
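To see why the strict comparison matters, here is a minimal sketch, not the original script. `defaultClause()` is a hypothetical helper, and it assumes each `$row` carries `Default` and `Null` keys in the shape MySQL's `SHOW FULL COLUMNS` reports them. In PHP an empty-string default is loosely equal to NULL, so a `NOT NULL DEFAULT ''` column would be mistaken for one that defaults to NULL.

```php
<?php
// Hypothetical helper (not the original script): builds the DEFAULT part of a
// column definition from one row of column metadata.
function defaultClause(array $row): string
{
    if ($row['Default'] === null) {
        // A genuine NULL default is only valid when the column is nullable.
        return $row['Null'] === 'YES' ? ' DEFAULT NULL' : '';
    }
    // Any real string, including the empty string, is kept as a quoted default.
    return " DEFAULT '" . addslashes($row['Default']) . "'";
}

// A NOT NULL column whose default is the empty string.
$row = ['Default' => '', 'Null' => 'NO'];

var_dump($row['Default'] == null);   // bool(true): loose comparison misfires
var_dump($row['Default'] === null);  // bool(false): strict comparison is correct
echo defaultClause($row), "\n";      // prints: DEFAULT ''
```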

    Update 2: based on a comment by banesto, I modified the script; it no longer requires specifying from_collation, it’s sufficient to specify to_collation (which will be used for all the fields and tables). The modified code is:

    if ($row['Collation'] == '' || $row['Collation'] == $convert_to)
        continue;
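As a rough illustration of what that check does, the sketch below filters a list of columns down to the ones that still need converting. `columnsToConvert()` and the sample rows are hypothetical; the `Field` and `Collation` keys assume metadata shaped like the output of `SHOW FULL COLUMNS`.

```php
<?php
// Hypothetical helper: returns the names of the columns that still need converting.
function columnsToConvert(array $columns, string $convertTo): array
{
    $fields = [];
    foreach ($columns as $row) {
        // Non-text columns report no collation (NULL or ''); the loose == catches
        // both. Columns already in the target collation are skipped as well.
        if ($row['Collation'] == '' || $row['Collation'] == $convertTo) {
            continue;
        }
        $fields[] = $row['Field'];
    }
    return $fields;
}

// Sample metadata: an integer column, a latin1 column, an already-converted column.
$columns = [
    ['Field' => 'link_id',  'Collation' => null],
    ['Field' => 'link_rel', 'Collation' => 'latin1_swedish_ci'],
    ['Field' => 'link_url', 'Collation' => 'utf8_general_ci'],
];

print_r(columnsToConvert($columns, 'utf8_general_ci')); // only link_rel remains
```

Here the loose comparison is intentional: a missing collation should be skipped whether it arrives as NULL or as an empty string.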

    Here’s the script itself: (to copy-paste: first click the “Plain text” header)
    Read the rest of this entry »

    Posted in Links, Notepad, PHP, Programming, Web | 24 Comments »

     