Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    Batch-retrieve EntrezGene homologs using NCBI’s HomoloGene and R’s annotationTools

    27th October 2010

    1. Install the annotationTools R package:
      source("http://bioconductor.org/biocLite.R")
      biocLite("annotationTools")
    2. Download full HomoloGene data file from ftp://ftp.ncbi.nlm.nih.gov/pub/HomoloGene/current
    3. library(annotationTools)
    4. homologene = read.delim("homologene.data", header=FALSE)
    5. mygenes = read.table("file with one entrez ID of the source organism per line.txt")
    6. getHOMOLOG(unlist(mygenes), taxonomy_ID_of_target_organism, homologene) [getHOMOLOG returns a list; wrap the call in unlist() to get a vector instead]
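
To make the lookup in step 6 easier to picture, here is a minimal base-R sketch of the same idea on a toy table. It does not call annotationTools and may not match getHOMOLOG's internals. It assumes the first three columns of homologene.data are the HomoloGene group ID, the taxonomy ID and the gene ID; the helper name find_homologs and all values in the table are made up for illustration.

```r
# Toy stand-in for homologene.data (values are invented for illustration).
# Assumed column order: V1 = HomoloGene group ID, V2 = taxonomy ID, V3 = gene ID.
homologene <- data.frame(
  V1 = c(1, 1, 2, 2, 3),
  V2 = c(9606, 10090, 9606, 10090, 9606),
  V3 = c(101, 201, 102, 202, 103)
)

# Hypothetical helper: for each source gene ID, return the gene IDs that share
# its HomoloGene group in the target species (NA when there are none).
find_homologs <- function(gene_ids, target_taxon, table) {
  lapply(gene_ids, function(id) {
    groups <- table$V1[table$V3 == id]
    hits <- table$V3[table$V1 %in% groups & table$V2 == target_taxon]
    if (length(hits) == 0) NA else hits
  })
}

# Gene 101 has a homolog in taxon 10090; gene 103 and the unknown ID 999 do not.
homologs <- find_homologs(c(101, 103, 999), 10090, homologene)

# Flatten the list to a plain vector, as suggested in step 6.
homolog_vector <- unlist(homologs)
print(homolog_vector)
```

Returning one list element per input ID keeps the result aligned with the input even when a gene has several homologs or none; unlist() loses that alignment as soon as any gene has more than one hit.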

    It might be easier to achieve the same results with a Perl script calling NCBI’s e-utils.

    Posted in Bioinformatics, how-to, Notepad | 1 Comment »

    R tutorial links

    29th March 2010

    Posted in Bioinformatics, Links, Science, Systems Biology | 1 Comment »

    R script to filter probesets with log-expression values below the lowest spike-in

    27th January 2010

    Sometimes you need to remove all probesets that have expression values below the minimal spike-in intensity on an Affymetrix microarray. The reasoning is simple: the lowest-intensity spike-ins mark the lower limit of microarray sensitivity, and anything below that limit cannot be reliably quantified. This also means that any fold-change or p-value computed for these probesets will be unreliable.

    Here’s a simple R script to do just that. It is abundantly commented, and also contains an optional (commented out) fragment which allows the removal of more low-variance, low-intensity probesets.
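
The author's script is not reproduced in this excerpt. As a rough illustration of the idea only, the following base-R sketch filters a toy log-expression matrix. The probeset names, the "spike_" naming pattern, and the rule of keeping a probeset that reaches the threshold in at least one sample are all assumptions made for this example, not details taken from the original script.

```r
# Toy log2 expression matrix: rows are probesets, columns are samples.
# All names and values are invented for illustration.
expr <- matrix(
  c(4.0, 4.2,   # spike_low
    9.0, 9.1,   # spike_high
    3.1, 3.5,   # gene_a: below the lowest spike-in in every sample
    3.9, 6.0,   # gene_b: above it in one sample
    8.0, 7.5),  # gene_c: above it in both samples
  ncol = 2, byrow = TRUE,
  dimnames = list(c("spike_low", "spike_high", "gene_a", "gene_b", "gene_c"),
                  c("sample1", "sample2"))
)

# Assumed convention: spike-in probesets share a recognisable name prefix.
is_spike <- grepl("^spike_", rownames(expr))

# Threshold: the smallest spike-in intensity, taken separately for each sample.
threshold <- apply(expr[is_spike, , drop = FALSE], 2, min)

# Keep a probeset if it reaches the threshold in at least one sample.
above <- sweep(expr, 2, threshold, ">=")
filtered <- expr[rowSums(above) > 0, , drop = FALSE]

print(rownames(filtered))
```

A stricter variant would require every sample to reach the threshold (rowSums(above) == ncol(expr)); which rule is appropriate depends on the experimental design.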

    Read the rest of this entry »

    Posted in Bioinformatics, Programming, Science | No Comments »

    R under Debian testing/i386: permanently set pdfviewer option

    21st October 2009

    If you get this message when opening vignettes:

    Error in openPDF(vif) :
    getOption('pdfviewer') is ''; please use 'options(pdfviewer=...)'

    and you are tired of running this command every time:

    > options(pdfviewer="okular")

    then you should check whether your system-wide Renviron file has a proper PDF viewer set:
    Read the rest of this entry »
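
The rest of the entry is not included in this excerpt. As a small, hedged sketch of how the current settings could be inspected from within R, the snippet below prints the location of the installation's Renviron file and the viewer values R currently sees. It assumes that R takes its default pdfviewer option from the R_PDFVIEWER environment variable, which is normally defined in that file; the variable names renviron_path, viewer_env and viewer_opt are only for this example.

```r
# Path of the system-wide Renviron file for this R installation.
renviron_path <- file.path(R.home("etc"), "Renviron")

# Value R picked up from the environment (an empty string means nothing was set).
viewer_env <- Sys.getenv("R_PDFVIEWER")

# Value currently used when opening vignettes.
viewer_opt <- getOption("pdfviewer")

cat("Renviron file:", renviron_path, "\n")
cat("R_PDFVIEWER:  ", viewer_env, "\n")
cat("pdfviewer:    ", viewer_opt, "\n")
```

If R_PDFVIEWER turns out to be empty, setting it in the Renviron file (or calling options(pdfviewer="okular") from a startup file) should make the choice persist across sessions.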

    Posted in *nix, how-to, Notepad, Software | No Comments »

    How to create a custom Affymetrix CDF file

    23rd March 2009

    First, learn about custom CDFs and why they are needed.

    The aroma.affymetrix R package google group has a how-to: create a CDF annotation file from scratch.

    Also useful: how to convert CDF into an R package, which has all CDF data available (as a PDF with more details).

    Posted in how-to, Links, Science | No Comments »