# Is it possible for $R^2$ of a regression on two variables to be higher than the sum of $R^2$ for two regressions on the individual variables?

In OLS, is it possible for the $R^2$ of a regression on two variables to be higher than the sum of the $R^2$ values for two regressions on the individual variables?

$R^2(Y \sim A + B) > R^2(Y \sim A) + R^2(Y \sim B)$

Edit: Ugh, this is trivial; that’s what I get for trying to solve problems that I thought of while at the gym. Sorry for wasting time again. The answer is clearly yes.

$Y \sim N(0,1)$

$A \sim N(0,1)$

$B = Y - A$

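$R^2(Y \sim A + B) = 1$, clearly, since $Y = A + B$ exactly. But with $A$ drawn independently of $Y$, $R^2(Y \sim A)$ should be 0 in the limit, and $R^2(Y \sim B)$ should be 0.5 in the limit: $\operatorname{Cov}(Y, B) = \operatorname{Var}(Y) = 1$ and $\operatorname{Var}(B) = 2$, so $\operatorname{Corr}(Y, B)^2 = 1/2$.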
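The limiting values for this construction can be checked numerically. The following is a minimal sketch, not part of the original post: the seed, the sample size `n`, and the helper `r2()` are all assumptions made for illustration. The helper computes $R^2$ from the residuals directly, because `summary()` may warn about an essentially perfect fit for the joint model.

```r
set.seed(1)   # arbitrary seed, chosen only for reproducibility
n <- 1e5      # large sample so the estimates sit close to their limits

y <- rnorm(n)   # Y ~ N(0, 1)
a <- rnorm(n)   # A ~ N(0, 1), drawn independently of Y
b <- y - a      # B = Y - A, so Y = A + B exactly

# Hypothetical helper: R^2 as 1 - RSS / TSS for a fitted model with intercept
r2 <- function(fit, response) {
  1 - sum(residuals(fit)^2) / sum((response - mean(response))^2)
}

r2.a  <- r2(lm(y ~ a), y)       # should be close to 0
r2.b  <- r2(lm(y ~ b), y)       # should be close to 0.5
r2.ab <- r2(lm(y ~ a + b), y)   # should be 1 up to floating-point error

print(c(a = r2.a, b = r2.b, both = r2.ab))
print(r2.ab > r2.a + r2.b)
```

With a sample this large, the joint $R^2$ should exceed the sum of the two individual values by roughly 0.5.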

Here’s a little bit of R that draws three independent normal variables, with a random seed chosen so that the resulting dataset shows it in action.

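    set.seed(103)

    # Three independent standard normal variables, 20 observations each
    d <- data.frame(y = rnorm(20, 0, 1),
                    a = rnorm(20, 0, 1),
                    b = rnorm(20, 0, 1))

    m1 <- lm(y ~ a, data = d)      # y on a alone
    m2 <- lm(y ~ b, data = d)      # y on b alone
    m3 <- lm(y ~ a + b, data = d)  # y on both predictors

    r2.a  <- summary(m1)[["r.squared"]]
    r2.b  <- summary(m2)[["r.squared"]]
    r2.ab <- summary(m3)[["r.squared"]]  # R^2 of the joint model

    # TRUE when the joint R^2 exceeds the sum of the individual R^2 values
    r2.ab > r2.a + r2.b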


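Not only is it possible (as you’ve already shown analytically), it’s not hard to do. Given three independent normally distributed variables, it happens more often than not: the joint $R^2$ exceeds the sum whenever the product of the three pairwise sample correlations is negative, which by symmetry occurs half the time, and it occasionally happens when that product is positive as well.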
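To get a feel for how often this happens, the experiment can be repeated over many simulated datasets. This is a rough sketch under stated assumptions: the seed and the number of replications `n.sims` are arbitrary choices, `n.obs` matches the 20 observations used above, and `r2()` is a hypothetical convenience wrapper. The estimated share will vary with the seed and the sample size.

```r
set.seed(1)      # arbitrary seed for reproducibility
n.sims <- 2000   # number of simulated datasets (an assumption, not from the post)
n.obs  <- 20     # observations per dataset, matching the example above

# Hypothetical helper: extract R^2 from a fitted lm object
r2 <- function(fit) summary(fit)[["r.squared"]]

# For each simulated dataset, record whether the joint R^2 beats the sum
exceeds <- replicate(n.sims, {
  d <- data.frame(y = rnorm(n.obs), a = rnorm(n.obs), b = rnorm(n.obs))
  r2(lm(y ~ a + b, data = d)) >
    r2(lm(y ~ a, data = d)) + r2(lm(y ~ b, data = d))
})

share <- mean(exceeds)   # fraction of datasets where the inequality holds
print(share)
```

Raising `n.sims` tightens the estimate, and changing `n.obs` shows how the frequency depends on sample size.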