Batch-retrieve EntrezGene homologs using NCBI’s HomoloGene and R’s annotationTools
27th October 2010
- Install the annotationTools R package:
source("http://bioconductor.org/biocLite.R")
biocLite("annotationTools")
- Download full HomoloGene data file from ftp://ftp.ncbi.nlm.nih.gov/pub/HomoloGene/current
- library(annotationTools)
- homologene = read.delim("homologene.data", header=FALSE)
- mygenes = read.table("file with one entrez ID of the source organism per line.txt")
- getHOMOLOG(unlist(mygenes), taxonomy_ID_of_target_organism, homologene) [the result is a list with one element per input ID; wrap the call to getHOMOLOG in unlist() to get a vector instead]
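To make the lookup in the last step less of a black box, here is a minimal base-R sketch of the idea behind it: find the HomoloGene group(s) a gene belongs to, then return the genes of the target organism in the same group. It is not annotationTools' actual implementation. The function name lookup_homologs is hypothetical, and the table is a toy stand-in for homologene.data that keeps only its first three columns (group ID, taxonomy ID, Entrez Gene ID). The group and gene IDs are made up; only the taxonomy IDs (9606 for human, 10090 for mouse) are real.

```r
# Toy stand-in for homologene.data (first three columns only).
# V1 = HomoloGene group ID, V2 = taxonomy ID, V3 = Entrez Gene ID.
# Group and gene IDs below are invented for illustration.
homologene <- data.frame(
  V1 = c(1, 1, 2, 2, 3),
  V2 = c(9606, 10090, 9606, 10090, 9606),
  V3 = c(101, 201, 102, 202, 103)
)

# Hypothetical helper: for each source gene ID, return the gene IDs of the
# target organism that share a HomoloGene group with it, or NA if none do.
lookup_homologs <- function(gene_ids, target_taxon, homol) {
  lapply(gene_ids, function(id) {
    groups <- homol$V1[homol$V3 == id]
    hits <- homol$V3[homol$V1 %in% groups & homol$V2 == target_taxon]
    if (length(hits) == 0) NA else hits
  })
}

# 103 has no mouse entry in its group; 999 is not in the table at all.
mygenes <- c(101, 102, 103, 999)
result <- lookup_homologs(mygenes, 10090, homologene)

# Flatten the list into a vector, as in the unlist() tip above.
print(unlist(result))
```

Returning a list keeps one element per input gene even when a gene has several homologs in the target organism; flattening with unlist() is convenient, but the one-to-one alignment with the input is lost as soon as any gene has more than one hit.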
It might be easier to achieve the same results with a Perl script calling NCBI’s e-utils.
March 28th, 2011 at 16:48
Hello,
Any idea how to retrieve full HomoloGene groups (with multiple species) using several GenBank IDs as input?
Best regards,
Yvan