A way of thinking

These are my personal thoughts; opinions may vary.

Comparing AIC and r2

I wrote this R code for my lab, so I am publishing it here for now.
(I am not confident in my English, but it should be understandable...)

####
## Illustrating how AIC and r2 work


##### Case 1
# Comparing three models with the SAME number of explanatory variables
# Pretend we want to compare three models, each with two explanatory variables


# Generate three explanatory variables
# Pretend we have 50 samples
n.data <- 50
set.seed(11) # Just to reproduce the same result every time
x1 <- rnorm(n.data, 0, 1)
x2 <- rpois(n.data, 10)
# x2 <- x1 + runif(n.data, 0.5, 1) # if you run this instead, you can see that AIC does not work well in Case 2
x3 <- runif(n.data, 0, 10)
# These are just examples; you can set them however you like


# Generate one response variable
y <- 2*x1 + 10*x3 + rnorm(n.data, 0, 2)
# Note that this assumes x2 does not affect y


# Pretend we want to compare modelA (x1 & x2), modelB (x1 & x3), and modelC (x2 & x3)
## Model A
modelA <- lm(y ~ x1 + x2)
summary(modelA)
# Extract r2 for model A
summary(modelA)$r.squared
# Extract AIC for model A
AIC(modelA)


## Model B
modelB <- lm(y ~ x1 + x3)
summary(modelB)
# Extract r2 for model B
summary(modelB)$r.squared
# Extract AIC for model B
AIC(modelB)


## Model C
modelC <- lm(y ~ x2 + x3)
summary(modelC)
# Extract r2 for model C
summary(modelC)$r.squared
# Extract AIC for model C
AIC(modelC)




## Compare r2 and AIC values of 3 models
# r2: higher is better
summary(modelA)$r.squared
summary(modelB)$r.squared
summary(modelC)$r.squared

# AIC: smaller is better
AIC(modelA, modelB, modelC)


## Summary (Case 1)
# Even though the difference in r2 between model B and model C was not large,
# r2 indicated that model B was better (a better fit).
# AIC gave similar results.



##### Case 2
# Let's compare model B and model D (x1, x2, and x3)
# The point here is that we want to compare models with different numbers of explanatory variables


## Model D
modelD <- lm(y ~ x1 + x2 + x3)
summary(modelD)
# Extract r2 for model D
summary(modelD)$r.squared
# Extract AIC for model D
AIC(modelD)


## Compare r2 and AIC values of the models
# r2: higher is better
summary(modelB)$r.squared
summary(modelD)$r.squared
# This indicates that model D had the higher r2 value


# AIC: smaller is better
AIC(modelB, modelD)
# This indicates that model B had the smaller AIC value;
# the data gave more support to model B

## Summary
# AIC suggests model B is better, although the r2 of model D is higher.
# This example demonstrates that r2 increases as the number of explanatory variables increases.
# If you rely only on r2 values (or r, or RSS),
# you will be tempted to put as many variables into the model as you can.
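As a side note, R also reports an adjusted r2 that penalizes extra variables, which partly counteracts this problem; a minimal self-contained sketch (re-creating the data and models B and D from above so it runs on its own):

```r
# Re-create the data and models from above so this snippet runs on its own
set.seed(11)
n.data <- 50
x1 <- rnorm(n.data, 0, 1)
x2 <- rpois(n.data, 10)
x3 <- runif(n.data, 0, 10)
y <- 2*x1 + 10*x3 + rnorm(n.data, 0, 2)
modelB <- lm(y ~ x1 + x3)
modelD <- lm(y ~ x1 + x2 + x3)

# Plain r2 can never decrease when a variable is added to a nested model
summary(modelB)$r.squared <= summary(modelD)$r.squared  # TRUE

# Adjusted r2 penalizes the extra variable, so it can decrease
summary(modelB)$adj.r.squared
summary(modelD)$adj.r.squared
```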


# One solution is to use AIC.
# Then, you may wonder whether there is a threshold AIC difference for saying that model B is better than model D.
# See Burnham, K.P., Anderson, D.R. & Huyvaert, K.P. (2011) AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behavioral Ecology and Sociobiology, 65, 23-35.
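As a rough sketch of the kind of comparison that reference discusses, a common practice is to look at AIC differences (delta = AIC - min AIC) and Akaike weights rather than a hard threshold; a self-contained example re-creating models B and D from above:

```r
# Re-create the data and models from above so this snippet runs on its own
set.seed(11)
n.data <- 50
x1 <- rnorm(n.data, 0, 1)
x2 <- rpois(n.data, 10)
x3 <- runif(n.data, 0, 10)
y <- 2*x1 + 10*x3 + rnorm(n.data, 0, 2)
modelB <- lm(y ~ x1 + x3)
modelD <- lm(y ~ x1 + x2 + x3)

aic <- c(B = AIC(modelB), D = AIC(modelD))
delta <- aic - min(aic)  # AIC differences: 0 for the best model
delta

# Akaike weights: relative support for each model, summing to 1
weights <- exp(-delta / 2) / sum(exp(-delta / 2))
weights
```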
# Note that there are many other information criteria besides AIC (e.g., BIC, DIC)
# and other ways to compare models (e.g., cross-validation)
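Of these, BIC is available directly in base R via `BIC()`, read the same way as AIC (smaller is better); a self-contained sketch re-creating models B and D from above:

```r
# Re-create the data and models from above so this snippet runs on its own
set.seed(11)
n.data <- 50
x1 <- rnorm(n.data, 0, 1)
x2 <- rpois(n.data, 10)
x3 <- runif(n.data, 0, 10)
y <- 2*x1 + 10*x3 + rnorm(n.data, 0, 2)
modelB <- lm(y ~ x1 + x3)
modelD <- lm(y ~ x1 + x2 + x3)

# BIC charges log(n) per parameter, a stronger penalty than AIC's 2 when n >= 8
BIC(modelB, modelD)
```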


# AIC does not always work well if your explanatory variables are correlated.
# In that case, you will need to think more carefully and perhaps use other regression methods
# such as principal component regression, PLS regression, or ridge regression (ask Sui-san)
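A quick way to spot this problem is to inspect the correlations among the explanatory variables before trusting the model comparison; a minimal sketch in base R, re-creating the variables from above (including the collinear x2 from the comment in Case 1):

```r
# Re-create the explanatory variables from above so this snippet runs on its own
set.seed(11)
n.data <- 50
x1 <- rnorm(n.data, 0, 1)
x2 <- rpois(n.data, 10)
x3 <- runif(n.data, 0, 10)

# Pairwise correlation matrix of the explanatory variables;
# values near 1 or -1 warn of collinearity problems
round(cor(cbind(x1, x2, x3)), 2)

# Compare with the collinear alternative from the comment in Case 1:
x2c <- x1 + runif(n.data, 0.5, 1)
cor(x1, x2c)  # typically very close to 1
```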


# Do not forget what you want to show by using statistical analysis.
# Statistical analysis is nothing more than a tool.
# For example, you may need to learn more about your field and its background,
# or to do your experiments again.


# Good luck