I was fiddling with PCA and LDA methods and got stuck at one point. I have a feeling it is so simple that I can't see it.

The within-class ($S_W$) and between-class ($S_B$) scatter matrices are defined as:

$$S_W = \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu_i)(x_t^i - \mu_i)^T$$

$$S_B = \sum_{i=1}^{C} N (\mu_i - \mu)(\mu_i - \mu)^T$$

The total scatter matrix $S_T$ is given as:

$$S_T = \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu)(x_t^i - \mu)^T = S_W + S_B$$

where $C$ is the number of classes, $N$ is the number of samples in each class, $x_t^i$ is the $t$-th sample of class $i$, $\mu_i$ is the mean of the $i$-th class, and $\mu$ is the overall mean.

While trying to derive $S_T$, I reached a point where I had:

$$(x - \mu_i)(\mu_i - \mu)^T + (\mu_i - \mu)(x - \mu_i)^T$$

as a term. This needs to be zero, but why?

Indeed:

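$$
\begin{aligned}
S_T &= \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu)(x_t^i - \mu)^T \\
&= \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu_i + \mu_i - \mu)(x_t^i - \mu_i + \mu_i - \mu)^T \\
&= S_W + S_B + \sum_{i=1}^{C} \sum_{t=1}^{N} \left[ (x_t^i - \mu_i)(\mu_i - \mu)^T + (\mu_i - \mu)(x_t^i - \mu_i)^T \right]
\end{aligned}
$$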

**Answer**

If you assume

$$\frac{1}{N} \sum_{t=1}^{N} x_t^i = \mu_i$$

Then

$$\sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu_i)(\mu_i - \mu)^T = \sum_{i=1}^{C} \left( \sum_{t=1}^{N} (x_t^i - \mu_i) \right) (\mu_i - \mu)^T = 0$$

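because the inner sum over $t$ equals $N\mu_i - N\mu_i = 0$ for every class, so the formula holds. The second term is the transpose of the first, so it vanishes in the same way.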
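If you want to convince yourself numerically, here is a minimal NumPy sketch of the identity. It assumes the data is stored as an array of shape `(C, N, d)` with the same number of samples `N` in every class, as in the formulas above. The function name `scatter_matrices` and the random test data are made up for illustration and are not part of the original question.

```python
import numpy as np

def scatter_matrices(X):
    """Return S_W, S_B, S_T and the cross term for X of shape (C, N, d).

    C classes, N samples per class, d features (layout is an assumption).
    """
    C, N, d = X.shape
    class_means = X.mean(axis=1)                   # mu_i, shape (C, d)
    overall_mean = X.reshape(-1, d).mean(axis=0)   # mu, shape (d,)

    # Within-class scatter: deviations of samples from their own class mean.
    dev_w = (X - class_means[:, None, :]).reshape(-1, d)
    S_W = dev_w.T @ dev_w

    # Between-class scatter: each class-mean deviation is counted N times.
    dev_b = class_means - overall_mean
    S_B = N * (dev_b.T @ dev_b)

    # Total scatter: deviations of all samples from the overall mean.
    dev_t = X.reshape(-1, d) - overall_mean
    S_T = dev_t.T @ dev_t

    # Cross term: sum_i (sum_t (x_t^i - mu_i)) (mu_i - mu)^T.
    inner = (X - class_means[:, None, :]).sum(axis=1)  # numerically ~0 per class
    cross = inner.T @ dev_b
    return S_W, S_B, S_T, cross

# Hypothetical data: 3 classes, 50 samples each, 4 features, shifted class centres.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50, 4)) + 5 * rng.normal(size=(3, 1, 4))
S_W, S_B, S_T, cross = scatter_matrices(X)
print(np.allclose(S_T, S_W + S_B))  # identity S_T = S_W + S_B
print(np.allclose(cross, 0))        # cross term vanishes
```

Both checks should hold up to floating-point error. The cross term vanishes only because each `class_means[i]` is the actual sample mean of class `i`, which is exactly the assumption made in the answer.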

**Attribution**: *Source: Link, Question Author: nimcap, Answer Author: mpiktas*