I am quite new to statistics, so please forgive me if I use the wrong vocabulary.

I have some data that looks (to me) like a gaussian when plotted.

The data is extracted from a JPEG image: it is a single vertical line of pixels, and only the red channel (of RGB) is used.

Here is the full data (27 data points):

```
> r
 [1] 0.003921569 0.031372549 0.023529412 0.015686275 0.003921569 0.027450980
 [7] 0.003921569 0.015686275 0.031372549 0.105882353 0.305882353 0.490196078
[13] 0.560784314 0.615686275 0.592156863 0.505882353 0.364705882 0.227450980
[19] 0.050980392 0.031372549 0.019607843 0.054901961 0.031372549 0.015686275
[25] 0.027450980 0.003921569 0.011764706
> dput(r)
c(0.00392156862745098, 0.0313725490196078, 0.0235294117647059,
0.0156862745098039, 0.00392156862745098, 0.0274509803921569,
0.00392156862745098, 0.0156862745098039, 0.0313725490196078,
0.105882352941176, 0.305882352941176, 0.490196078431373,
0.56078431372549, 0.615686274509804, 0.592156862745098,
0.505882352941176, 0.364705882352941, 0.227450980392157,
0.0509803921568627, 0.0313725490196078, 0.0196078431372549,
0.0549019607843137, 0.0313725490196078, 0.0156862745098039,
0.0274509803921569, 0.00392156862745098, 0.0117647058823529)
plot(r)
```

I would like to find a gaussian that is as close as possible to the plot/data.

I tried with normalmixEM from the R package mixtools.

`> fit = normalmixEM(r)`

but by default this seems to fit a mixture of two gaussians.

I tried to specify that there is only one gaussian using the parameter k:

```
> fit = normalmixEM(r, k = 1)
Error in normalmix.init(x = x, lambda = lambda, mu = mu, s = sigma, k = k, :
  arbmean and arbvar cannot both be FALSE
```

How can I fit the data?

**Answer**

There’s a difference between fitting a gaussian *distribution* and fitting a gaussian *density curve*. What `normalmixEM` is doing is the former. What you want is (I guess) the latter.

Fitting a distribution is, roughly speaking, what you’d do if you made a histogram of your data, and tried to see what sort of shape it had. What you’re doing, instead, is simply plotting a curve. That curve happens to have a hump in the middle, like what you get by plotting a gaussian density function.
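To make the distinction concrete, here is a small sketch using the data from the question. Treating `r` as a sample, which is what a distribution fit does, summarises how the intensity values themselves are spread and ignores the order of the points, so it cannot say where along the line the hump sits. The curve view keeps the position index. The names `as_sample`, `shuffled` and `peak_pos` are introduced here purely for illustration.

```r
r <- c(0.00392156862745098, 0.0313725490196078, 0.0235294117647059,
       0.0156862745098039, 0.00392156862745098, 0.0274509803921569,
       0.00392156862745098, 0.0156862745098039, 0.0313725490196078,
       0.105882352941176, 0.305882352941176, 0.490196078431373,
       0.56078431372549, 0.615686274509804, 0.592156862745098,
       0.505882352941176, 0.364705882352941, 0.227450980392157,
       0.0509803921568627, 0.0313725490196078, 0.0196078431372549,
       0.0549019607843137, 0.0313725490196078, 0.0156862745098039,
       0.0274509803921569, 0.00392156862745098, 0.0117647058823529)

# Distribution view: the 27 intensities are an unordered sample.
# The mean and sd describe the spread of pixel values, not the hump's position.
as_sample <- c(mean = mean(r), sd = sd(r))
print(as_sample)

# Shuffling the points leaves the sample mean unchanged (up to rounding),
# even though a plot of the shuffled values would no longer show a hump.
set.seed(1)
shuffled <- sample(r)
print(isTRUE(all.equal(mean(shuffled), mean(r))))

# Curve view: intensity as a function of position along the line.
x <- seq_along(r)
peak_pos <- x[which.max(r)]
print(peak_pos)
```

The second view is the one that matters here, since the question is about the shape of `r` plotted against its index.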

To get what you want, you can use something like `optim` to fit the curve to your data. The following code will use nonlinear least-squares to find the three parameters giving the best-fitting gaussian curve: `m` is the gaussian mean, `sd` is the standard deviation, and `k` is an arbitrary scaling parameter (since the gaussian density is constrained to integrate to 1, whereas your data isn’t).
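Written out, with $\sigma$ standing for the standard deviation parameter and $x_i = i$ for the position of the $i$-th point, the curve and the quantity being minimised are:

$$
\hat r_i = k \exp\!\left(-\tfrac{1}{2}\left(\frac{x_i - m}{\sigma}\right)^2\right),
\qquad
S(m, \sigma, k) = \sum_{i=1}^{27} \left(r_i - \hat r_i\right)^2 .
$$

The usual normalising constant $1/(\sigma\sqrt{2\pi})$ of the gaussian density does not appear separately because it is absorbed into $k$.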

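```
# r is the vector of red-channel values from the question
x <- seq_along(r)

# Sum of squared residuals between the data and a scaled gaussian curve
f <- function(par)
{
    m <- par[1]    # centre of the hump
    sd <- par[2]   # width (standard deviation)
    k <- par[3]    # height scaling
    rhat <- k * exp(-0.5 * ((x - m)/sd)^2)
    sum((r - rhat)^2)
}

# Start from m = 15, sd = 2, k = 1; the estimates are in the $par element of the result
optim(c(15, 2, 1), f, method="BFGS", control=list(reltol=1e-9))
```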
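As a possible cross-check, base R's `nls` function (in the stats package) fits the same kind of least-squares model directly from a formula. The sketch below assumes the same starting values as the `optim` call and uses `s` rather than `sd` for the width so as not to shadow the `sd()` function; `fit_nls`, `est` and `xs` are names introduced here for illustration. If both optimisers converge, they should land on essentially the same parameters.

```r
r <- c(0.00392156862745098, 0.0313725490196078, 0.0235294117647059,
       0.0156862745098039, 0.00392156862745098, 0.0274509803921569,
       0.00392156862745098, 0.0156862745098039, 0.0313725490196078,
       0.105882352941176, 0.305882352941176, 0.490196078431373,
       0.56078431372549, 0.615686274509804, 0.592156862745098,
       0.505882352941176, 0.364705882352941, 0.227450980392157,
       0.0509803921568627, 0.0313725490196078, 0.0196078431372549,
       0.0549019607843137, 0.0313725490196078, 0.0156862745098039,
       0.0274509803921569, 0.00392156862745098, 0.0117647058823529)
x <- seq_along(r)

# Same model as the optim() objective: a scaled gaussian curve.
# m = centre, s = width, k = height scaling.
fit_nls <- nls(r ~ k * exp(-0.5 * ((x - m) / s)^2),
               start = list(m = 15, s = 2, k = 1))
est <- coef(fit_nls)
print(est)

# Overlay the fitted curve on the data, evaluated on a finer grid.
plot(x, r)
xs <- seq(min(x), max(x), length.out = 200)
lines(xs, est[["k"]] * exp(-0.5 * ((xs - est[["m"]]) / est[["s"]])^2))
```

Plotting the fitted curve over the points is a quick way to judge whether a single gaussian hump is an adequate description of the data.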

**Attribution** *Source: Link, Question Author: Timothée HENRY, Answer Author: Hong Ooi*