Weird orthology species names in Ensembl
30th September 2008
For the COTRASIF tool, I’ve been using the Ensembl Compara database (since release 47) to automatically import gene orthology mappings into COTRASIF.
However, with the E!50 release, the Compara database was dropped.
Looking for another way to get orthologs from Ensembl (using martservice, via biomart.org), I tried the standard query: select the “Homologs” group on the “Attributes” page of a single-species database, then pick the appropriate second species to get the orthology mappings.
Imagine my surprise when I found attribute names like “cow_ensembl_gene”, not only in the interface but also in the generated XML file :-O
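For reference, here is a rough sketch of how such a query could be assembled in Python. Only “cow_ensembl_gene” comes from the XML I actually got; the dataset name “hsapiens_gene_ensembl”, the “ensembl_gene_id” attribute, the Query element settings and the helper name build_homolog_query are my assumptions for illustration, so check them against what your own mart generates.

```python
import xml.etree.ElementTree as ET


def build_homolog_query(dataset, attributes):
    """Build a BioMart-style query XML string (hypothetical helper)."""
    # Query-level settings are assumed defaults, not taken from the post.
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # One <Attribute> element per requested column.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Assumed dataset and source-gene attribute; "cow_ensembl_gene" is the
# common-name attribute that prompted this post.
query_xml = build_homolog_query(
    "hsapiens_gene_ensembl",
    ["ensembl_gene_id", "cow_ensembl_gene"],
)
print(query_xml)
```

The point is simply that the second species is identified by nothing more than the common-name prefix of the attribute.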
I only need 11 species at the moment, and even excluding the sufficiently unique name mappings like “zebrafish – Danio rerio”, there are a number of questionable ones: “yeast” for S. cerevisiae (could be S. pombe), “rat” for R. norvegicus (could be R. rattus), “anopheles” for A. gambiae (could be some other Anopheles). Other mappings might also be non-unique, especially for people working with different species of the same genus.
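The workaround I can think of is to keep an explicit lookup table on my side, so that the ambiguity is at least pinned down in one place. A minimal sketch, assuming the attribute pattern is always “<common name>_ensembl_gene” (I have only seen this for “cow”, so the pattern, the table name and the helper name are all hypothetical):

```python
# Common name used by the mart -> the species I take it to mean.
# Only the four mappings discussed above are listed.
COMMON_TO_BINOMIAL = {
    "zebrafish": "Danio rerio",
    "yeast": "Saccharomyces cerevisiae",
    "rat": "Rattus norvegicus",
    "anopheles": "Anopheles gambiae",
}


def homolog_attribute(binomial):
    """Return the assumed mart attribute name for a binomial species name."""
    for common, name in COMMON_TO_BINOMIAL.items():
        if name == binomial:
            # Assumed pattern, extrapolated from "cow_ensembl_gene".
            return f"{common}_ensembl_gene"
    raise KeyError(f"no common-name mapping recorded for {binomial!r}")


print(homolog_attribute("Rattus norvegicus"))
```

This does not remove the ambiguity in Ensembl itself; it only records which species I assume each common name refers to.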
Am I missing some system in this naming “convention”, or am I the only one who finds it strange?
Is there a way to avoid these “common species names” when importing orthology data from Ensembl through martservice?