# The linearity of variance

I think the following two formulas are true:

$$\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X),$$ where $a$ is a constant,

$$\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y),$$ if $X$ and $Y$ are independent.

However, I am not sure what is wrong with the following:

$$\mathrm{Var}(X+X) = \mathrm{Var}(X) + \mathrm{Var}(X) = 2\,\mathrm{Var}(X),$$

which does not equal $2^2\,\mathrm{Var}(X)$, i.e. $4\,\mathrm{Var}(X)$.

If $X$ is a sample taken from a population, I think we can always assume $X$ to be independent of the other $X$s.

So what is wrong with my reasoning?

$\DeclareMathOperator{\Cov}{Cov}$
$\DeclareMathOperator{\Corr}{Corr}$
$\DeclareMathOperator{\Var}{Var}$

The problem with your line of reasoning is

“I think we can always assume $X$ to be independent from the other $X$s.”

$X$ is not independent of $X$. The symbol $X$ refers to the same random variable throughout. Once you know the value of the first $X$ in your formula, the value of the second $X$ is fixed as well. If you want them to refer to distinct (and potentially independent) random variables, you need to denote them with different letters (e.g. $X$ and $Y$) or with subscripts (e.g. $X_1$ and $X_2$); the latter is often (but not always) used to denote variables drawn from the same distribution.

If two variables $X$ and $Y$ are independent then $\Pr(X=a|Y=b)$ is the same as $\Pr(X=a)$: knowing the value of $Y$ does not give us any additional information about the value of $X$. But $\Pr(X=a|X=b)$ is $1$ if $a=b$ and $0$ otherwise: knowing the value of $X$ gives you complete information about the value of $X$. [You can replace the probabilities in this paragraph by cumulative distribution functions, or where appropriate, probability density functions, to essentially the same effect.]
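As a quick numerical sketch of this (the fair-die setup and NumPy simulation are my own illustration, not part of the original post):

```python
import numpy as np

# Simulate two independent fair dice, X and Y
rng = np.random.default_rng(0)
n = 100_000
x = rng.integers(1, 7, size=n)
y = rng.integers(1, 7, size=n)

# Pr(X = 3) unconditionally
p_x3 = np.mean(x == 3)
# Pr(X = 3 | Y = 5): conditioning on the independent Y changes nothing
p_x3_given_y5 = np.mean(x[y == 5] == 3)
# Pr(X = 3 | X = 3): conditioning on X itself pins X down completely
p_x3_given_x3 = np.mean(x[x == 3] == 3)

print(p_x3, p_x3_given_y5, p_x3_given_x3)  # first two ≈ 1/6, last is exactly 1.0
```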

Another way of seeing things is that if two variables are independent then they have zero correlation (though zero correlation does not imply independence!) but $X$ is perfectly correlated with itself, $\Corr(X,X)=1$ so $X$ can’t be independent of itself. Note that since the covariance is given by $\Cov(X,Y)=\Corr(X,Y)\sqrt{\Var(X)\Var(Y)}$, then

$$\Cov(X,X) = \Corr(X,X)\sqrt{\Var(X)\Var(X)} = \Var(X).$$

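A quick NumPy check of this identity (a sketch of my own; `np.cov` and `np.corrcoef` compute the sample versions of these quantities):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)

# Corr(X, X) = 1, so Cov(X, X) = 1 * sqrt(Var(X) * Var(X)) = Var(X)
corr_xx = np.corrcoef(x, x)[0, 1]
cov_xx = np.cov(x, x)[0, 1]
var_x = np.var(x, ddof=1)  # ddof=1 matches np.cov's default normalisation

print(np.isclose(corr_xx, 1.0), np.isclose(cov_xx, var_x))  # True True
```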
The more general formula for the variance of a sum of two random variables is

$$\Var(X+Y) = \Var(X) + \Var(Y) + 2\Cov(X,Y).$$
In particular, $\Cov(X,X) = \Var(X)$, so

$$\Var(X+X) = \Var(X) + \Var(X) + 2\Cov(X,X) = 4\Var(X),$$
which is the same as you would have deduced from applying the rule

$$\Var(aX) = a^2\Var(X)$$

with $a = 2$.
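The $2\Var(X)$ versus $4\Var(X)$ distinction is easy to see in a simulation (a sketch of my own, using standard normal draws so that $\Var(X) = 1$):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
x = rng.normal(size=n)        # one random variable X, with Var(X) = 1
x_prime = rng.normal(size=n)  # an *independent* variable with the same distribution

# X + X is the same realisation added to itself, i.e. 2X: variance ≈ 4
var_same = np.var(x + x, ddof=1)
# X + X' sums two independent draws: variance ≈ 2
var_indep = np.var(x + x_prime, ddof=1)

print(var_same, var_indep)
```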
If you are interested in linearity, then you might be interested in the bilinearity of covariance. For random variables $W$, $X$, $Y$ and $Z$ (whether dependent or independent) and constants $a$, $b$, $c$ and $d$ we have

$$\Cov(aW + bX, Y) = a\Cov(W,Y) + b\Cov(X,Y),$$

$$\Cov(W, cY + dZ) = c\Cov(W,Y) + d\Cov(W,Z),$$

and overall,

$$\Cov(aW + bX, cY + dZ) = ac\Cov(W,Y) + ad\Cov(W,Z) + bc\Cov(X,Y) + bd\Cov(X,Z).$$
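Bilinearity also holds exactly for sample covariances, so it can be checked numerically even with deliberately dependent variables (a sketch of my own; the shared `base` term just makes some of the variables correlated):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
base = rng.normal(size=n)            # shared noise, so W and Y are dependent
w = base + rng.normal(size=n)
x = rng.normal(size=n)
y = 0.5 * base + rng.normal(size=n)
z = rng.normal(size=n)
a, b, c, d = 2.0, -1.0, 3.0, 0.5

def cov(u, v):
    # off-diagonal entry of the 2x2 sample covariance matrix
    return np.cov(u, v)[0, 1]

lhs = cov(a * w + b * x, c * y + d * z)
rhs = (a * c * cov(w, y) + a * d * cov(w, z)
       + b * c * cov(x, y) + b * d * cov(x, z))

print(np.isclose(lhs, rhs))  # True
```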
You can then use this to prove the (non-linear) results for variance that you wrote in your post:

$$\Var(aX) = \Cov(aX, aX) = a^2\Cov(X,X) = a^2\Var(X),$$

$$\Var(aX + bY) = \Cov(aX + bY, aX + bY) = a^2\Var(X) + b^2\Var(Y) + 2ab\Cov(X,Y).$$
The latter gives, as a special case when $a = b = 1$,

$$\Var(X + Y) = \Var(X) + \Var(Y) + 2\Cov(X,Y).$$
When $X$ and $Y$ are uncorrelated (which includes the case where they are independent), then this reduces to $\Var(X+Y) = \Var(X) + \Var(Y)$.
So if you want to manipulate variances in a “linear” way (which is often a nice way to work algebraically), then work with the covariances instead, and exploit their bilinearity.