I was fiddling with PCA and LDA methods and got stuck at one point; I have a feeling it is so simple that I can't see it.
The within-class ($S_W$) and between-class ($S_B$) scatter matrices are defined as:
$$S_W = \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu_i)(x_t^i - \mu_i)^T$$
$$S_B = \sum_{i=1}^{C} N (\mu_i - \mu)(\mu_i - \mu)^T$$
The total scatter matrix $S_T$ is given as:
$$S_T = \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu)(x_t^i - \mu)^T = S_W + S_B$$
where $C$ is the number of classes, $N$ is the number of samples per class, $x_t^i$ is the $t$-th sample of class $i$, $\mu_i$ is the mean of class $i$, and $\mu$ is the overall mean.
While trying to derive $S_T$, I reached a point where I had
$$(x - \mu_i)(\mu_i - \mu)^T + (\mu_i - \mu)(x - \mu_i)^T$$
as a term. This needs to be zero, but why?
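Indeed:
$$\begin{aligned}
S_T &= \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu)(x_t^i - \mu)^T \\
&= \sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu_i + \mu_i - \mu)(x_t^i - \mu_i + \mu_i - \mu)^T \\
&= S_W + S_B + \sum_{i=1}^{C} \sum_{t=1}^{N} \left[ (x_t^i - \mu_i)(\mu_i - \mu)^T + (\mu_i - \mu)(x_t^i - \mu_i)^T \right]
\end{aligned}$$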
Answer
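If you assume
$$\frac{1}{N} \sum_{t=1}^{N} x_t^i = \mu_i$$
then
$$\sum_{i=1}^{C} \sum_{t=1}^{N} (x_t^i - \mu_i)(\mu_i - \mu)^T = \sum_{i=1}^{C} \left( \sum_{t=1}^{N} (x_t^i - \mu_i) \right) (\mu_i - \mu)^T = 0$$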
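and the formula holds, because the inner sum is $\sum_{t=1}^{N} (x_t^i - \mu_i) = N\mu_i - N\mu_i = 0$ for every class $i$. The second term is the transpose of the first, so it vanishes in the same way.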
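For anyone who wants to check this numerically, here is a minimal NumPy sketch. It is not part of the original derivation: the sizes `C`, `N`, `d`, the random data, and all variable names are arbitrary assumptions chosen for illustration, and every class is assumed to contain the same number of samples $N$, as in the formulas above.

```python
import numpy as np

# Illustrative sizes only: C classes, N samples per class, d features.
C, N, d = 3, 50, 4
rng = np.random.default_rng(0)

# X[i, t] is sample t of class i; each class gets its own random offset
# so that the class means actually differ.
X = rng.normal(size=(C, N, d)) + rng.normal(size=(C, 1, d))

mu_i = X.mean(axis=1)                # class means, shape (C, d)
mu = X.reshape(-1, d).mean(axis=0)   # overall mean, shape (d,)

dev_within = X - mu_i[:, None, :]    # x_t^i - mu_i
dev_between = mu_i - mu              # mu_i - mu
dev_total = X - mu                   # x_t^i - mu

# Sums of outer products over classes (i) and samples (t).
S_W = np.einsum("itj,itk->jk", dev_within, dev_within)
S_B = N * np.einsum("ij,ik->jk", dev_between, dev_between)
S_T = np.einsum("itj,itk->jk", dev_total, dev_total)

# First cross term: sum_i sum_t (x_t^i - mu_i)(mu_i - mu)^T.
# The second cross term is its transpose.
cross = np.einsum("itj,ik->jk", dev_within, dev_between)

print("cross term is zero:", np.allclose(cross, 0.0, atol=1e-9))
print("S_T == S_W + S_B:  ", np.allclose(S_T, S_W + S_B))
```

Both checks should hold up to floating-point rounding, since the within-class deviations of each class sum to zero by the definition of $\mu_i$.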
Attribution
Source: Link, Question Author: nimcap, Answer Author: mpiktas