R » Autarchy of the Private Cave

GUIs for R

17th October 2013

I’ve briefly tried Cantor (which also supports Octave and KAlgebra as backends), RKWard, Deducer/JGR, R Commander, and RStudio.

My personal choice is RStudio: it is good-looking, intuitive, and easy to use, yet powerful.

The next step would be to use some R equivalent of IPython’s excellent Mathematica-like Notebook web interface…

Posted in *nix, Notepad, Programming, Science, Software | No Comments »

R functions for regression analysis cheat sheet

29th May 2012

Original PDF.
My local copy.

Posted in Bioinformatics, Links, Misc | No Comments »

Information criteria for choosing best predictive models

29th May 2012

I usually use 10-fold (non-stratified) cross-validation (CV) to measure the predictive power of my models: it gives consistent results and is easy to perform (at least on smaller datasets).
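For reference, here is a minimal base-R sketch of non-stratified 10-fold CV. The dataset (the built-in mtcars), the model formula, and the RMSE error measure are assumptions chosen purely for illustration; they are not the models discussed in this post.

```r
# Non-stratified 10-fold cross-validation of a linear model (base R only).
# Assumptions for illustration: built-in mtcars data, mpg ~ wt + hp, RMSE.
set.seed(1)                       # reproducible fold assignment
data(mtcars)
k <- 10
n <- nrow(mtcars)

# Shuffle fold labels 1..k over all rows (no stratification)
folds <- sample(rep(seq_len(k), length.out = n))

sq_err <- numeric(n)
for (i in seq_len(k)) {
  test_idx <- which(folds == i)
  # Fit on the other k-1 folds, predict the held-out fold
  fit  <- lm(mpg ~ wt + hp, data = mtcars[-test_idx, ])
  pred <- predict(fit, newdata = mtcars[test_idx, ])
  sq_err[test_idx] <- (mtcars$mpg[test_idx] - pred)^2
}

cv_rmse <- sqrt(mean(sq_err))
cat("10-fold CV RMSE:", cv_rmse, "\n")
```

Every observation is predicted exactly once by a model that never saw it, so the pooled error estimates out-of-sample performance.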

I just came across Akaike’s Information Criterion (AIC) and the Schwarz Bayesian Information Criterion (BIC). Citing robjhyndman:

Asymptotically, minimizing the AIC is equivalent to minimizing the CV value. This is true for any model (Stone 1977), not just linear models. It is this property that makes the AIC so useful in model selection when the purpose is prediction.
…
Because of the heavier penalty, the model chosen by BIC is either the same as that chosen by AIC, or one with fewer terms. Asymptotically, for linear models minimizing BIC is equivalent to leave-v-out cross-validation when v = n[1-1/(log(n)-1)] (Shao 1997).

I want to try AIC, and maybe BIC, on my models. Conveniently, R provides both as functions: AIC() and BIC().
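A small sketch of how AIC() and BIC() from the stats package can be used to compare candidate models. The dataset (mtcars) and the three candidate formulas are assumptions made up for this example.

```r
# Compare candidate linear models by AIC and BIC (lower is better).
# Assumptions for illustration: built-in mtcars data and these three formulas.
data(mtcars)

candidates <- list(
  small  = lm(mpg ~ wt, data = mtcars),
  medium = lm(mpg ~ wt + hp, data = mtcars),
  large  = lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
)

scores <- data.frame(
  model = names(candidates),
  AIC   = sapply(candidates, AIC),
  BIC   = sapply(candidates, BIC)
)
print(scores)

best_aic <- scores$model[which.min(scores$AIC)]
best_bic <- scores$model[which.min(scores$BIC)]
cat("Chosen by AIC:", best_aic, "\n")
cat("Chosen by BIC:", best_bic, "\n")

# BIC is AIC with a per-parameter penalty of log(n) instead of 2
n <- nobs(candidates$medium)
bic_via_aic <- AIC(candidates$medium, k = log(n))
```

The last two lines show where the "heavier penalty" in the quote comes from: as soon as log(n) exceeds 2 (n of 8 or more), each extra term costs more under BIC than under AIC.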

Posted in Bioinformatics, Machine learning | No Comments »

Batch-retrieve EntrezGene homologs using NCBI’s HomoloGene and R’s annotationTools

27th October 2010

Install the annotationTools R package:
source("http://bioconductor.org/biocLite.R")
biocLite("annotationTools")
Download the full HomoloGene data file (homologene.data) from ftp://ftp.ncbi.nlm.nih.gov/pub/HomoloGene/current
library(annotationTools)
homologene = read.delim("homologene.data", header=FALSE)
mygenes = read.table("file with one entrez ID of the source organism per line.txt")
getHOMOLOG(unlist(mygenes), taxonomy_ID_of_target_organism, homologene)
Alternatively, wrap the call to getHOMOLOG in unlist() to get a vector.
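To illustrate what this kind of lookup does conceptually, here is a base-R sketch on a toy table. It is not the annotationTools implementation. It assumes the first three columns of the HomoloGene file are the homology group ID, the taxonomy ID, and the Entrez Gene ID; the helper find_homologs() is hypothetical, and all group and gene IDs are made up (only the taxonomy IDs 9606 for human and 10090 for mouse are real).

```r
# Toy stand-in for homologene.data; group and gene IDs are invented.
homologene <- data.frame(
  V1 = c(1, 1, 2, 2, 3),                    # homology group ID (assumed column 1)
  V2 = c(9606, 10090, 9606, 10090, 9606),   # taxonomy ID (assumed column 2)
  V3 = c(101, 201, 102, 202, 103)           # Entrez Gene ID (assumed column 3)
)

# Hypothetical helper: for each source gene, find its homology group(s),
# then return the genes of the target organism in those groups.
find_homologs <- function(gene_ids, target_taxon, tbl) {
  lapply(gene_ids, function(g) {
    groups <- tbl$V1[tbl$V3 == g]
    hits   <- tbl$V3[tbl$V1 %in% groups & tbl$V2 == target_taxon]
    if (length(hits) == 0) NA else hits     # this sketch uses NA for "no homolog"
  })
}

my_genes <- c(101, 103)                     # toy human gene IDs
homologs <- find_homologs(my_genes, 10090, homologene)
names(homologs) <- my_genes
print(unlist(homologs))
```

The result is a list with one element per input gene, which is why flattening with unlist() is convenient when each gene has at most one homolog.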

It might be easier to achieve the same results with a Perl script calling NCBI’s e-utils.

Posted in Bioinformatics, how-to, Notepad | 2 Comments »

R tutorial links

29th March 2010

R time series tutorial (2010, a website of the “Time Series Analysis and Its Applications: With R Examples” book)
Statistics with R (2007)
R for programmers PDF (2008, 104 pages, linked to from here)
Brief R tutorial (2004)
Statistical computing with R: a tutorial (2004)
An introduction to R (from the official r-project website, should always be up to date)
R tutorial (date unknown, definitely newer than 2005)

Posted in Bioinformatics, Links, Science, Systems Biology | 1 Comment »

R script to filter probesets with log-expression values below the lowest spike-in

27th January 2010

Sometimes you need to remove all probesets whose expression values fall below the minimal spike-in intensity on an Affymetrix microarray. The reasoning behind this procedure is simple: the minimal-expression spike-ins mark the lower limit of microarray sensitivity, and anything below that limit cannot be reliably quantified, which also means that both the fold change and the p-value of expression variance will be unreliable for these probesets.

Here’s a simple R script to do just that. It is abundantly commented, and also contains an optional (commented out) fragment which allows the removal of more low-variance, low-intensity probesets.

Read the rest of this entry »
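The filtering idea itself can be sketched in a few lines. This is not the script from the full entry, only a minimal illustration under stated assumptions: the expression matrix is a toy one with invented values, spike-in probesets are recognised by an assumed name pattern (adapt it to your chip), and a probeset is compared to the threshold by its mean log-expression across arrays.

```r
# Toy log-expression matrix (rows: probesets, columns: arrays); values invented.
expr <- rbind(
  probe_a          = c(3.1, 2.9, 3.3),
  probe_b          = c(8.2, 8.0, 8.5),
  probe_c          = c(5.9, 6.4, 6.1),
  probe_d          = c(4.2, 4.0, 4.4),
  "AFFX-BioB-5_at" = c(6.0, 6.2, 5.8),
  "AFFX-BioC-5_at" = c(7.5, 7.4, 7.7),
  "AFFX-CreX-5_at" = c(10.1, 10.0, 10.3)
)
colnames(expr) <- c("array1", "array2", "array3")

# Assumed naming pattern for spike-in control probesets
is_spike <- grepl("^AFFX-(BioB|BioC|BioDn|CreX)", rownames(expr))

# Threshold: mean log-expression of the weakest spike-in
threshold <- min(rowMeans(expr[is_spike, , drop = FALSE]))

# Keep probesets whose mean log-expression reaches the threshold
keep     <- rowMeans(expr) >= threshold
filtered <- expr[keep, , drop = FALSE]

cat("Threshold:", threshold, "\n")
cat("Removed:", paste(rownames(expr)[!keep], collapse = ", "), "\n")
```

Using the per-probeset maximum instead of the mean would be a more permissive variant of the same idea.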

Posted in Bioinformatics, Programming, Science | No Comments »

R under Debian testing/i386: permanently set pdfviewer option

21st October 2009

If you get this message when opening vignettes:

Error in openPDF(vif) :
getOption('pdfviewer') is ''; please use 'options(pdfviewer=...)'

and you are tired of running this command every time:

> options(pdfviewer="okular")

then you should check whether your system-wide Renviron file has a proper PDF viewer set:
Read the rest of this entry »
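A quick way to see what R currently picks up, as a sketch rather than the exact fix from the full entry. It relies on the documented behaviour that the pdfviewer option defaults to the value of the R_PDFVIEWER environment variable; the viewer name "okular" is simply the one used above.

```r
# Value of the environment variable R reads at startup (e.g. from an Renviron file)
viewer_env <- Sys.getenv("R_PDFVIEWER")

# Value of the option that is reported as empty in the error message above
viewer_opt <- getOption("pdfviewer")

cat("R_PDFVIEWER:          '", viewer_env, "'\n", sep = "")
cat("getOption(pdfviewer): '", viewer_opt, "'\n", sep = "")

# Per-session workaround; putting this line into ~/.Rprofile would make it
# apply to every session of the current user
options(pdfviewer = "okular")
```

If R_PDFVIEWER prints as an empty string, the Renviron file is the likely place to set it.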

Posted in *nix, how-to, Notepad, Software | No Comments »

