Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    Terminologies for Gene and Protein Similarity

    2nd February 2007

    Note: this is an excerpt (very slightly edited) from the original article Terminologies for Gene & Protein Similarity by Julius H. Jackson.

    Definitions of heterologs, homologs, analogs, paralogs, orthologs and xenologs are provided below.

    Heterologs: genes that are “unique” in activity and sequence are said to be heterologous. Note that genes initially defined as heterologous by syntax (letter matching) may actually be homologous by activity. In short: Heterologs differ in both origin and activity.

    Homologs: genes whose similarity, measured by aligning matching bases, meets an arbitrarily chosen threshold. Homology is a qualitative term that describes a relationship between genes and is based upon the quantitative similarity. Similarity is a quantitative term that defines the degree of sequence match between two compared sequences. Homology implies that the compared sequences diverged in evolution from a common origin. In short: Homologs have common origins but may or may not have common activity.

    For example, two aligned genes or segments of sequence that are homologous may have varying degrees of similarity based upon identical base matches in the alignment. In the first sequence alignment in the following figure, the sequences are obviously identical and therefore exhibit 39 matches out of 39 positions aligned, or 100% similarity. In the second alignment the aligned sequences contain 28 matches out of 39 possible. The quantitative match or degree of similarity is then 28/39 or 72%. In both cases the sequences are homologous.

    A

    atgcctgaaggcctattgtttcccagtcgattggctgct…
    ||||||||||||||||||||||||||||||||||||||| 39 of 39 matches
    atgcctgaaggcctattgtttcccagtcgattggctgct…

    B

    atgcctgaaggcctattgtttcccagtcgattggctgct…
    ||||||       |||||| |||||||| |||||| ||  28 of 39 matches
    atgccteggcttatattgtatcccagtccattggcagcg…
    Fig. 1
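The match counts quoted for Fig. 1 can be checked with a short Python sketch. The function names are illustrative and are not from the original article. The sequences are copied from the figure as printed, including the non-nucleotide character "e" in the lower sequence of panel B, which here simply counts as a mismatch. Panel A is modelled as the upper sequence compared with itself, since the text describes the two sequences as identical.

```python
def count_matches(seq1: str, seq2: str) -> int:
    """Count identical positions in two pre-aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    return sum(a == b for a, b in zip(seq1, seq2))


def percent_similarity(seq1: str, seq2: str) -> float:
    """Degree of similarity as matches / aligned positions, in percent."""
    if not seq1:
        raise ValueError("sequences must not be empty")
    return 100.0 * count_matches(seq1, seq2) / len(seq1)


# Upper sequence shared by panels A and B of Fig. 1.
top = "atgcctgaaggcctattgtttcccagtcgattggctgct"
# Lower sequence of panel B, copied as printed.
bottom_b = "atgccteggcttatattgtatcccagtccattggcagcg"

# Panel A: identical sequences.
print(count_matches(top, top), round(percent_similarity(top, top)))
# prints: 39 100

# Panel B: 28 of 39 positions match.
print(count_matches(top, bottom_b), round(percent_similarity(top, bottom_b)))
# prints: 28 72
```

This only counts identical positions in an alignment that is already given; it does not perform the alignment itself, and it says nothing about where the homology threshold should be set.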

    Homologous sequences are termed homologs and this term may be applied to both genes and proteins. Homologs look similar to each other and appear to share common ancestry but they may or may not display the same activity.

    Analogs: genes or proteins that display the same activity but lack sufficient similarity to imply common origin. The implication is that analogous proteins followed evolutionary pathways from different origins to converge upon the same activity. Thus, analogous genes or proteins are considered a product of convergent evolution. Analogs have homologous activity but heterologous origins. In short: Analogs have common activity but not common origin.
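The three "In short" summaries given so far reduce to two yes/no questions: do the genes share an origin, and do they share an activity? A minimal sketch of that decision table follows; the function name `classify_pair` is hypothetical, and in practice common origin is inferred from similarity rather than observed directly.

```python
def classify_pair(common_origin: bool, common_activity: bool) -> str:
    """Map the origin/activity summary of two genes to a term from the text."""
    if common_origin:
        # Homologs share an origin; activity may or may not be shared.
        return "homologs"
    if common_activity:
        # Same activity reached from different origins (convergent evolution).
        return "analogs"
    # Different origin and different activity.
    return "heterologs"


print(classify_pair(common_origin=True, common_activity=False))   # homologs
print(classify_pair(common_origin=False, common_activity=True))   # analogs
print(classify_pair(common_origin=False, common_activity=False))  # heterologs
```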

    Paralogs: homologous genes produced by gene duplication, followed by divergent evolution from the common ancestral gene. Paralogy implies that gene duplication and divergence occurred within the same organism/species and that divergence of sequence led to divergence of activity. Paralogs have homologous origin but heterologous activities. In short: Paralogs are homologs produced by gene duplication.

    Orthologs: when speciation follows duplication and one copy of the duplicated gene sorts with each of the two resulting species, the subsequent divergence of each copy is tied to its own species. Such species-specific homologs are termed orthologous. Thus, orthologs are homologs from duplication that precedes speciation, followed by divergence of sequence but not of activity in the separate species. Orthologs have homologous origin (common ancestor gene) and homologous activity. In short: Orthologs are homologs produced by speciation.

    Xenologs: determining whether a gene of interest was recently transferred into its current host by horizontal gene transfer is frequently non-trivial. Occasionally the %G+C content differs so vastly from that of the average gene in the current host that a conclusion of external origin is nearly inescapable. Absent such a sore thumb, codon usage bias might provide a clue, but interpreting such data is challenging, especially when sorting out whether the differences are significant and whether they reflect the relative state of gene expression or a gene that truly came from another world. In short: Xenologs are homologs resulting from horizontal gene transfer.
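The %G+C comparison mentioned above can be sketched as follows. The function names, the toy sequence and the host average of 40% are assumptions made up for illustration; the article gives no numeric cutoff, so the sketch reports only the deviation and leaves its interpretation open.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (a, c, g, t) of a sequence."""
    bases = [b for b in seq.lower() if b in "acgt"]
    if not bases:
        raise ValueError("sequence contains no a/c/g/t bases")
    gc = sum(b in "gc" for b in bases)
    return 100.0 * gc / len(bases)


def gc_deviation(gene: str, host_gc_percent: float) -> float:
    """Difference between a gene's %G+C and the host's average %G+C."""
    return gc_percent(gene) - host_gc_percent


# Toy gene with 4 of 8 bases being G or C; the host average is hypothetical.
print(gc_percent("ggccaatt"))          # 50.0
print(gc_deviation("ggccaatt", 40.0))  # 10.0
```

A large deviation is only a hint of external origin. As the text notes, codon usage and expression level can also shift composition, so this number alone does not establish that a gene is a xenolog.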
