# Probability distribution for a noisy sine wave

I’m looking to analytically calculate the probability distribution of points sampled from an oscillating function when there is some measurement error. I have already calculated the probability distribution for the “without noise” case (I will put this at the end), but I can’t figure out how to include “noise”.

## Numerical estimate

To be clearer, imagine there is some function $y(x) = \sin(x)$ which you randomly pick points from during a single cycle; if you bin the points in a histogram you will get something related to the distribution.
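A minimal sketch of this numerical experiment, assuming NumPy and uniform sampling of $x$ over one cycle $[0, 2\pi)$. The function names, sample size, bin count and the noise level `sigma=0.2` are illustrative choices, not values from the original plots. Setting `sigma > 0` adds independent Gaussian measurement error, which is one possible noise model for the noisy case discussed further down.

```python
import numpy as np

def sample_sine(n, sigma=0.0, seed=0):
    """Draw n values of sin(x), x uniform on one cycle, plus optional Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0 * np.pi, size=n)
    y = np.sin(x)
    if sigma > 0:
        # Assumed noise model: additive, i.i.d., zero-mean Gaussian.
        y = y + rng.normal(0.0, sigma, size=n)
    return y

def histogram_density(y, bins=50):
    """Normalised histogram: returns bin centres, bin width and estimated density."""
    density, edges = np.histogram(y, bins=bins, density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, edges[1] - edges[0], density

clean = sample_sine(100_000)             # "without noise"
noisy = sample_sine(100_000, sigma=0.2)  # "with noise" (illustrative sigma)

centres_clean, width_clean, density_clean = histogram_density(clean)
centres_noisy, width_noisy, density_noisy = histogram_density(noisy)
```

Plotting `density_clean` against `centres_clean` (and likewise for the noisy arrays) gives the two histograms: the clean one piles up near $\pm 1$, while the noisy one spreads beyond $[-1, 1]$.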

### Without noise

For example, here is $\sin(x)$ and the corresponding histogram:

### With noise

Now if there is some measurement error, it will change the shape of the histogram (and hence, I think, the underlying distribution). For example:

## Analytic Calculation

So hopefully I’ve convinced you there is some difference between the two. Now I will write out how I calculated the “without noise” case:

### Without noise

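If the times at which we sample are uniformly distributed over a cycle, then the probability distribution for $y$ must satisfy:

$$P(y)\,|dy| = P(x)\,|dx| \propto |dx|,$$

then since

$$\frac{dy}{dx} = \cos(x) = \pm\sqrt{1 - y^2},$$

and so

$$P(y) \propto \frac{1}{\sqrt{1 - y^2}}, \qquad -1 < y < 1,$$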

which with appropriate normalisation fits the histogram generated in the “no noise” case.
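One way to check the noiseless result numerically is sketched below. It assumes the normalised density is the arcsine form $\frac{1}{\pi\sqrt{1-y^2}}$ on $(-1, 1)$, with CDF $\frac{1}{2} + \frac{\arcsin(y)}{\pi}$. That normalisation constant is an assumption here, since the constant is not stated explicitly above, and the function names are illustrative. Bin probabilities are compared through the CDF rather than the density, because the density diverges at $y = \pm 1$.

```python
import numpy as np

def arcsine_density(y):
    """Assumed noiseless density 1 / (pi * sqrt(1 - y^2)) on (-1, 1), zero elsewhere."""
    y = np.asarray(y, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        dens = 1.0 / (np.pi * np.sqrt(1.0 - y ** 2))
    return np.where(np.abs(y) < 1.0, dens, 0.0)

def arcsine_cdf(y):
    """CDF matching arcsine_density: 1/2 + arcsin(y) / pi, clipped to [-1, 1]."""
    y = np.clip(np.asarray(y, dtype=float), -1.0, 1.0)
    return 0.5 + np.arcsin(y) / np.pi

rng = np.random.default_rng(1)
samples = np.sin(rng.uniform(0.0, 2.0 * np.pi, size=200_000))

# Compare observed bin frequencies with the analytic bin probabilities.
edges = np.linspace(-1.0, 1.0, 41)
counts, _ = np.histogram(samples, bins=edges)
observed = counts / samples.size
expected = np.diff(arcsine_cdf(edges))
max_abs_error = np.max(np.abs(observed - expected))
```

If the assumed form is right, `max_abs_error` should be on the order of the sampling error for this sample size, and `arcsine_density` can be overlaid on the normalised histogram.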

### With noise

So my question is: how can I analytically include noise in the distribution? I think it is something like combining the distributions in a clever way, or including noise in the definition of $y(x)$, but I’m out of ideas and ways to move forwards so any hints/tips or even recommended reading will be much appreciated.

It depends on how the noise process is structured.

Assuming that I’ve understood your situation correctly, if the noise is additive, independent and identically distributed, you would just take the convolution of the noise density with the density of $Y$.

If $X_i$ is random uniform over a cycle, your noiseless process conditional on $x$ is $Y_i|X_i=x_i$, which is degenerate, with mean $\sin(x_i)$ and variance 0. The marginal distribution of $Y$ is a uniform mixture of those degenerate distributions; it looks like you have worked that distribution out correctly; let’s call that density $g$.

If, for example, your noise is $\epsilon_i\sim N(0,\sigma^2)$, which is to say $f(\epsilon)=\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\epsilon^2}{2\sigma^2}\right)$, then $f*g$ is the density of the sum of the noise with that uniform mixture of noiseless variables.

$$f_{Y+\epsilon}(z) = (f*g)(z) = \int_{-\infty}^{\infty} f_{Y}(y)\,f_{\epsilon}(z-y)\,dy=\int_{-\infty}^{\infty} f_{Y}(z-w)\,f_{\epsilon}(w)\,dw$$

(This convolution was done numerically; I don’t know how tractable that integral is in this example, because I didn’t attempt it.)
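A sketch of one way to evaluate that convolution numerically, assuming Gaussian noise and NumPy; this is not necessarily how the convolution mentioned above was computed. Rather than integrating against the noiseless density directly (it is singular at $\pm 1$), it uses the equivalent uniform-mixture form $\frac{1}{2\pi}\int_0^{2\pi} f_\epsilon(z - \sin x)\,dx$, which follows from substituting $y = \sin x$. The function names, `sigma = 0.2` and the grid sizes are illustrative.

```python
import numpy as np

def normal_pdf(e, sigma):
    """Density of N(0, sigma^2)."""
    return np.exp(-0.5 * (e / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def noisy_sine_density(z, sigma, n_grid=2000):
    """Density of sin(X) + eps, with X ~ U(0, 2*pi) and eps ~ N(0, sigma^2).

    Averages the noise density over a midpoint grid in x, i.e. approximates
    (1 / (2*pi)) * integral over one cycle of normal_pdf(z - sin(x)) dx.
    """
    z = np.atleast_1d(np.asarray(z, dtype=float))
    x = (np.arange(n_grid) + 0.5) * (2.0 * np.pi / n_grid)
    return normal_pdf(z[:, None] - np.sin(x)[None, :], sigma).mean(axis=1)

sigma = 0.2  # illustrative noise level

# The density should integrate to (approximately) one.
z = np.linspace(-2.0, 2.0, 801)
dens = noisy_sine_density(z, sigma)
area = np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(z))

# Compare against a simulated noisy sample.
rng = np.random.default_rng(2)
sim = np.sin(rng.uniform(0.0, 2.0 * np.pi, 200_000)) + rng.normal(0.0, sigma, 200_000)
hist, edges = np.histogram(sim, bins=40, range=(-2.0, 2.0), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
max_diff = np.max(np.abs(hist - noisy_sine_density(centres, sigma)))
```

Averaging over $x$ keeps the integrand smooth and periodic, so a simple midpoint rule is enough; a different noise density could be swapped in for `normal_pdf` as long as the noise stays additive and independent of $x$.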