I am trying to understand whether the discrete Fourier transform gives the same representation of a curve as a regression on a Fourier basis. For example,

```r
library(fda)
Y = daily$tempav[,1]   ## my data
length(Y)              ## 365
## create Fourier basis and estimate the coefficients
mybasis  = create.fourier.basis(c(0, 365), 365)
basisMat = eval.basis(1:365, mybasis)
regcoef  = coef(lm(Y ~ basisMat - 1))
## using the Fourier transform
fftcoef = fft(Y)
## compare
head(fftcoef)
head(regcoef)
```

The FFT gives complex numbers, whereas the regression gives real numbers.

Do they convey the same information? Is there a one-to-one map between the two sets of numbers?

(I would appreciate it if the answer were written from the statistician's perspective rather than the engineer's. Much of the online material I can find is full of engineering jargon, which makes it less palatable to me.)

**Answer**

They’re the same. Here’s how…

# Doing a Regression

Say you fit the model

$$y_t = \sum_{j=1}^{n} A_j \cos\!\left(2\pi t [j/N] + \phi_j\right),$$

where $t = 1, \dots, N$ and $n = \lfloor N/2 \rfloor$. This isn't suitable for linear regression, though, so instead you use some trigonometry ($\cos(a+b) = \cos(a)\cos(b) - \sin(a)\sin(b)$) and fit the equivalent model:

$$y_t = \sum_{j=1}^{n} \beta_{1,j}\cos(2\pi t [j/N]) + \beta_{2,j}\sin(2\pi t [j/N]).$$

Running linear regression on all of the Fourier frequencies $\{j/N : j = 1, \dots, n\}$ gives you a bunch ($2n$) of betas: $\{\hat\beta_{i,j}\}$, $i = 1, 2$. For any $j$, if you wanted to calculate the pair by hand, you could use:

$$\hat\beta_{1,j} = \frac{\sum_{t=1}^{N} y_t \cos(2\pi t [j/N])}{\sum_{t=1}^{N} \cos^2(2\pi t [j/N])}$$

and

$$\hat\beta_{2,j} = \frac{\sum_{t=1}^{N} y_t \sin(2\pi t [j/N])}{\sum_{t=1}^{N} \sin^2(2\pi t [j/N])}.$$

These are standard regression formulas.
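As a quick sanity check, here is a small sketch (on a simulated series, not the `daily` temperature data from the question) verifying that these closed-form sums reproduce what `lm` gives at a single Fourier frequency. It works because the cosine and sine columns are orthogonal, so the multiple-regression coefficients split into these one-regressor-at-a-time formulas:

```r
## Sketch: closed-form regression coefficients vs. lm() at one Fourier
## frequency j/N. Simulated data, not the daily temperature series.
set.seed(1)
N <- 100
t <- 1:N
y <- 2 * cos(2 * pi * t * 5 / N) + rnorm(N)

j  <- 5                          ## one Fourier frequency, j/N
cj <- cos(2 * pi * t * j / N)
sj <- sin(2 * pi * t * j / N)

## by-hand estimates from the formulas above
b1 <- sum(y * cj) / sum(cj^2)
b2 <- sum(y * sj) / sum(sj^2)

## the same pair from least squares
fit <- lm(y ~ cj + sj - 1)
round(c(b1, b2) - coef(fit), 10) ## both differences are 0
```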

# Doing a Discrete Fourier Transform

When you run a discrete Fourier transform, you calculate, for $j = 1, \dots, n$:

$$d(j/N) = N^{-1/2}\sum_{t=1}^{N} y_t \exp\!\left(-2\pi i t [j/N]\right) = N^{-1/2}\left(\sum_{t=1}^{N} y_t \cos(2\pi t [j/N]) - i \sum_{t=1}^{N} y_t \sin(2\pi t [j/N])\right).$$

This is a complex number (notice the $i$). To see why that equality holds, keep in mind that $e^{ix} = \cos(x) + i\sin(x)$, $\cos(-x) = \cos(x)$, and $\sin(-x) = -\sin(x)$.

For each $j$, multiplying by the complex conjugate (i.e., taking the squared modulus) gives you the "**periodogram**":

$$|d(j/N)|^2 = N^{-1}\left(\sum_{t=1}^{N} y_t \cos(2\pi t [j/N])\right)^2 + N^{-1}\left(\sum_{t=1}^{N} y_t \sin(2\pi t [j/N])\right)^2.$$

In R, calculating this vector would be `I <- abs(fft(Y))^2/length(Y)`, which is sort of weird, because `fft` omits the $N^{-1/2}$ factor and you have to supply the scaling yourself.
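To see that this scaling really does match the definition, you can compare one entry of the periodogram with the cosine and sine sums computed directly (simulated data again; note that R's `fft` stores frequency $j/N$ at index $j + 1$, since index 1 holds $j = 0$):

```r
## Check the periodogram against its definition at one frequency.
## Simulated data; any numeric series works.
set.seed(1)
N <- 100
t <- 1:N
y <- rnorm(N)

I <- abs(fft(y))^2 / N           ## periodogram, as in the text

j <- 7                           ## frequency j/N lives at index j + 1
I.byhand <- (sum(y * cos(2 * pi * t * j / N))^2 +
             sum(y * sin(2 * pi * t * j / N))^2) / N
c(I[j + 1], I.byhand)            ## the same value twice
```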

You can also define the "**scaled periodogram**":

$$P(j/N) = \left(\frac{2}{N}\sum_{t=1}^{N} y_t \cos(2\pi t [j/N])\right)^2 + \left(\frac{2}{N}\sum_{t=1}^{N} y_t \sin(2\pi t [j/N])\right)^2.$$

Clearly $P(j/N) = \frac{4}{N}\,|d(j/N)|^2$. In R this would be `P <- (4/length(Y)) * I[1:floor(length(Y)/2)]`.

# The Connection Between the Two

It turns out the connection between the regression and the two periodograms is:

$$P(j/N) = \hat\beta_{1,j}^2 + \hat\beta_{2,j}^2.$$

Why? Because the basis you chose is orthogonal (and, after scaling, orthonormal). You can show for each $j$ that $\sum_{t=1}^{N}\cos^2(2\pi t [j/N]) = \sum_{t=1}^{N}\sin^2(2\pi t [j/N]) = N/2$ (for $0 < j < N/2$; when $N$ is even, the endpoint $j = N/2$ behaves differently). Plug that into the denominators of your formulas for the regression coefficients and voilà.
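The whole chain can be checked numerically in a few lines (simulated data once more; $j$ is kept strictly between $0$ and $N/2$ so the $N/2$ orthogonality result applies):

```r
## End-to-end check: scaled periodogram = sum of squared regression
## coefficients at a Fourier frequency. Simulated data.
set.seed(1)
N <- 100
t <- 1:N
y <- rnorm(N)

j  <- 13                          ## 0 < j < N/2
cj <- cos(2 * pi * t * j / N)
sj <- sin(2 * pi * t * j / N)
b  <- coef(lm(y ~ cj + sj - 1))   ## (beta1.hat, beta2.hat)

I <- abs(fft(y))^2 / N            ## periodogram
P <- (4 / N) * I[j + 1]           ## scaled periodogram at j/N

c(P, sum(b^2))                    ## equal up to floating point
```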

Source: Shumway and Stoffer, *Time Series Analysis and Its Applications*:

https://www.amazon.com/Time-Analysis-Its-Applications-Statistics/dp/144197864X

**Attribution**: *Source: Link, Question Author: qoheleth, Answer Author: Taylor*