# Regression when each point has its own uncertainty in both $x$ and $y$

I made $n$ measurements of two variables $x$ and $y$. They both have known uncertainties $\sigma_x$ and $\sigma_y$ associated with them. I want to find the relation between $x$ and $y$. How can I do it?

EDIT: each $x_i$ has a different $\sigma_{x,i}$ associated with it, and the same with the $y_i$.

Reproducible R example:

## pick some real x and y values
true_x <- 1:100
true_y <- 2*true_x+1

## pick the uncertainty on them
sigma_x <- runif(length(true_x), 1, 10) # 10
sigma_y <- runif(length(true_y), 1, 15) # 15

## perturb both x and y with noise
noisy_x <- rnorm(length(true_x), true_x, sigma_x)
noisy_y <- rnorm(length(true_y), true_y, sigma_y)

## make a plot
plot(NA, xlab="x", ylab="y",
     xlim=range(noisy_x-sigma_x, noisy_x+sigma_x),
     ylim=range(noisy_y-sigma_y, noisy_y+sigma_y))
arrows(noisy_x, noisy_y-sigma_y,   # vertical error bars
       noisy_x, noisy_y+sigma_y,
       length=0, angle=90, code=3, col="darkgray")
arrows(noisy_x-sigma_x, noisy_y,   # horizontal error bars
       noisy_x+sigma_x, noisy_y,
       length=0, angle=90, code=3, col="darkgray")
points(noisy_y ~ noisy_x)

## fit a line
mdl <- lm(noisy_y ~ noisy_x)
abline(mdl)

## show confidence interval around line
newXs <- seq(-100, 200, 1)
prd <- predict(mdl, newdata=data.frame(noisy_x=newXs),
               interval='confidence', level=0.99, type='response')
lines(newXs, prd[,2], col='black', lty=3)
lines(newXs, prd[,3], col='black', lty=3)


The problem with this example, I think, is that the `lm` fit assumes there are no uncertainties in $x$. How can I fix this?

Let the true line $L$, given by an angle $\theta$ and a value $\gamma$, be the set

$$L = \{(x, y) : \cos(\theta)\, x + \sin(\theta)\, y = \gamma\}.$$

The signed distance between any point $(x, y)$ and this line is

$$d(x, y; L) = \cos(\theta)\, x + \sin(\theta)\, y - \gamma.$$

Letting the variance of $x_i$ be $\sigma_i^2$ and that of $y_i$ be $\tau_i^2$, independence of $x_i$ and $y_i$ implies the variance of this distance is

$$\operatorname{Var}\left(d(x_i, y_i; L)\right) = \cos(\theta)^2 \sigma_i^2 + \sin(\theta)^2 \tau_i^2.$$

Let us therefore find $\theta$ and $\gamma$ for which the inverse-variance weighted sum of squared distances is as small as possible: this will be the maximum likelihood solution if we assume the errors have bivariate normal distributions. It requires a numerical solution, but it is straightforward to find with a few Newton-Raphson steps beginning at a value suggested by an ordinary least-squares fit.
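Written out with the distance and its variance from above, the estimate minimizes the objective computed (up to a factor of $1/2$) by the function `f` in the code below:

$$(\hat{\theta}, \hat{\gamma}) = \underset{\theta,\, \gamma}{\operatorname{argmin}} \sum_{i=1}^n \frac{\left(\cos(\theta)\, x_i + \sin(\theta)\, y_i - \gamma\right)^2}{\cos(\theta)^2 \sigma_i^2 + \sin(\theta)^2 \tau_i^2}.$$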

Simulations suggest this solution is good even with small amounts of data and relatively large values of $\sigma_i$ and $\tau_i$. You can, of course, obtain standard errors for the parameters in the usual ways (a sketch of one way appears after the code below). If you are interested in the standard error of the position of the line, as well as the slope, then you might wish first to center both variables at $0$: that should eliminate almost all the correlation between the estimates of the two parameters.

The method works so well with the example from the question that the fitted line would be almost indistinguishable from the true line in the plot: they are within one unit or so of each other everywhere. Instead, this example uses more challenging data: the $\tau_i$ are drawn iid from an exponential distribution and the $\sigma_i$ are drawn iid from an exponential distribution with twice the scale (so that most of the error tends to occur in the $x$ coordinate). There are only $n = 8$ points. The true points are equally spaced along the line with unit spacing. This is a fairly severe test, because the potential errors are noticeable compared to the range of the points.

The true line is shown in dotted blue. Along it the original points are plotted as hollow circles. Gray arrows connect them to the observed points, plotted as solid black disks. The solution is drawn as a solid red line. Despite the presence of large deviations between observed and actual values, the solution is remarkably close to the correct line within this region.

#
# Generate data.
#
theta <- c(1, -2, 3) # The line is theta %*% c(x,y,-1) == 0
theta[-3] <- theta[-3]/sqrt(crossprod(theta[-3]))
n <- 8
set.seed(17)
sigma <- rexp(n, 1/2) # SDs of the x errors (twice the scale of the y errors)
tau <- rexp(n, 1)     # SDs of the y errors
u <- 1:n
xy.0 <- t(outer(c(-theta[2], theta[1]), 0:(n-1)) + c(theta[3]/theta[1], 0))
xy <- xy.0 + cbind(rnorm(n, sd=sigma), rnorm(n, sd=tau))
#
# Fit a line.
#
x <- xy[, 1]
y <- xy[, 2]
f <- function(phi) { # Negative log likelihood, up to an additive constant
  a <- phi[1]      # angle theta
  gamma <- phi[2]  # signed offset gamma
  sum((x*cos(a) + y*sin(a) - gamma)^2 / ((sigma*cos(a))^2 + (tau*sin(a))^2))/2
}
fit <- lm(y ~ x) # Yields starting estimates
slope <- coef(fit)[2]
theta.0 <- atan2(1, -slope)                 # angle of the OLS line
gamma.0 <- coef(fit)[1] / sqrt(1 + slope^2) # its signed distance from the origin
sol <- nlm(f, c(theta.0, gamma.0))
#
# Plot the data and the fit.
#
theta.hat <- sol$estimate[1] %% (2*pi)
gamma.hat <- sol$estimate[2]
plot(rbind(xy.0, xy), type="n", xlab="x", ylab="y")
invisible(sapply(1:n, function(i)
  arrows(xy.0[i,1], xy.0[i,2], xy[i,1], xy[i,2],
         length=0.15, angle=20, col="Gray")))
points(xy.0)
points(xy, pch=16)
abline(c(theta[3] / theta[2], -theta[1]/theta[2]), col="Blue", lwd=2, lty=3) # true line
abline(c(gamma.hat / sin(theta.hat), -1/tan(theta.hat)), col="Red", lwd=2)   # fitted line
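
A minimal sketch of one way to get the standard errors mentioned above, reusing `f`, `theta.0`, and `gamma.0` from the code; it treats `f` as an exact negative log likelihood and relies on the usual asymptotic normal approximation:

```r
# Re-run the optimization, asking nlm for the Hessian of f at the optimum.
sol <- nlm(f, c(theta.0, gamma.0), hessian=TRUE)
# Approximate standard errors of (theta, gamma) from the inverse observed information.
se <- sqrt(diag(solve(sol$hessian)))
print(se)
```

If you center `x` and `y` at their means before fitting, the off-diagonal entry of `solve(sol$hessian)` should be close to zero, reflecting the near-elimination of the correlation between the two estimates noted earlier.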