Information criteria for choosing best predictive models
29th May 2012
Usually I use 10-fold (non-stratified) CV to measure the predictive power of my models: it gives consistent results and is easy to perform (at least on smaller datasets).
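A minimal sketch of how that looks with cv.glm() from the boot package; the mtcars data and the formula are just placeholders, not my actual models:

```r
library(boot)

# Fit a gaussian glm (equivalent to lm) so cv.glm can be used
fit <- glm(mpg ~ wt + hp, data = mtcars)

set.seed(1)                      # fold assignment is random
cv.err <- cv.glm(mtcars, fit, K = 10)
cv.err$delta                     # raw and adjusted 10-fold CV estimates of MSE
```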
Just came across Akaike’s Information Criterion (AIC) and the Schwarz Bayesian Information Criterion (BIC). Citing robjhyndman,
Asymptotically, minimizing the AIC is equivalent to minimizing the CV value. This is true for any model (Stone 1977), not just linear models. It is this property that makes the AIC so useful in model selection when the purpose is prediction.
…
Because of the heavier penalty, the model chosen by BIC is either the same as that chosen by AIC, or one with fewer terms. Asymptotically, for linear models minimizing BIC is equivalent to leave-v-out cross-validation when v = n[1-1/(log(n)-1)] (Shao 1997).
Want to try AIC and maybe BIC on my models. Conveniently, both functions exist in R.
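In base R these are AIC() and BIC() from the stats package. A quick sketch comparing two candidate models (again, mtcars and the formulas are stand-ins for my own data):

```r
# Two candidate linear models of increasing complexity
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Lower values indicate the preferred model;
# BIC penalizes extra terms more heavily than AIC
AIC(fit1, fit2)
BIC(fit1, fit2)
```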