# Negative values in predictions for an always-positive response variable in linear regression

I’m trying to predict a response variable in linear regression that should always be positive (cost per click). It’s a monetary amount: in AdWords, you pay Google for clicks on your ads, and a negative number would mean that Google pays you when people click 😛

The predictors are all continuous. The R-squared and RMSE are decent compared to other models, even out-of-sample:

    RMSE       Rsquared
    1.4141477  0.8207303


I cannot rescale the predictions, because it’s money, so even a small rescaling factor could change costs significantly.

As far as I understand, for the regression model there’s nothing special about zero and negative numbers, so it finds the best regression hyperplane no matter whether the output is partly negative.
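To illustrate the point, here is a minimal sketch on synthetic data. The data-generating process (an exponential trend with lognormal noise) and all variable names are assumptions made for illustration, not the actual cost-per-click data. It shows that an OLS fit to a strictly positive response can still produce negative fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 10.0, size=n)

# Synthetic, strictly positive response (assumed for illustration only).
y = np.exp(0.3 * x) * rng.lognormal(mean=0.0, sigma=0.1, size=n)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

pred = X @ beta_ols
print("smallest observed y: ", y.min())
print("smallest fitted value:", pred.min())
```

Every observed value here is positive, yet the fitted line dips below zero for small `x`, because least squares only minimizes squared error and knows nothing about the sign of the output.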

This is only a first attempt, using all the variables I have, so there’s room for refinement.

Is there any way to tell the model that the output cannot be negative?

I assume that you are using the OLS estimator on this linear regression model. You can instead use the inequality-constrained least-squares estimator, which is the solution to a minimization problem under inequality constraints. Using standard matrix notation (vectors are column vectors), the minimization problem is stated as

$$\min_{\beta}\; (\mathbf y-\mathbf X\beta)'(\mathbf y-\mathbf X\beta) \quad \text{s.t.} \quad \mathbf Z\beta \ge \mathbf 0$$

where $\mathbf y$ is $n \times 1$, $\mathbf X$ is $n\times k$, $\beta$ is $k\times 1$ and $\mathbf Z$ is the $m \times k$ matrix containing the out-of-sample regressor series of length $m$ that are used for prediction. We have $m$ linear inequality constraints (and the objective function is convex, so the first order conditions are sufficient for a minimum).

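The Lagrangean of this problem is

$$L = (\mathbf y-\mathbf X\beta)'(\mathbf y-\mathbf X\beta) - \lambda'\mathbf Z\beta$$

where $\lambda$ is an $m \times 1$ column vector of non-negative Karush-Kuhn-Tucker multipliers. The first order conditions are (you may want to review rules for matrix and vector differentiation)

$$\frac{\partial L}{\partial \beta} = \mathbf 0 \Rightarrow -2\mathbf X'\mathbf y + 2\mathbf X'\mathbf X\beta - \mathbf Z'\lambda = \mathbf 0$$

$$\Rightarrow \hat \beta_R = \left(\mathbf X'\mathbf X\right)^{-1}\mathbf X'\mathbf y + \frac 12 \left(\mathbf X'\mathbf X\right)^{-1}\mathbf Z'\lambda = \hat \beta_{OLS} + \left(\mathbf X'\mathbf X\right)^{-1}\mathbf Z'\xi$$

where $\hat \beta_R$ denotes the constrained estimator, $\xi = \frac 12 \lambda$, for convenience, and $\hat \beta_{OLS}$ is the estimator we would obtain from ordinary least squares estimation.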

The method is fully elaborated in Liew (1976).
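As a rough numerical sketch of the constrained problem above (not Liew's algorithm), one can hand the same objective and constraints $\mathbf Z\beta \ge \mathbf 0$ to a general-purpose solver. The example below uses SciPy's SLSQP method as a stand-in; the helper name `icls`, the synthetic data, and the grid of out-of-sample points are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def icls(X, y, Z):
    """Minimize (y - Xb)'(y - Xb) subject to Zb >= 0; returns (b_ols, b_constrained)."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

    def sse(b):
        r = y - X @ b
        return r @ r

    def sse_grad(b):
        return -2.0 * X.T @ (y - X @ b)

    # SLSQP treats an "ineq" constraint as fun(b) >= 0, which matches Zb >= 0.
    constraint = {"type": "ineq", "fun": lambda b: Z @ b, "jac": lambda b: Z}
    result = minimize(sse, beta_ols, jac=sse_grad,
                      constraints=[constraint], method="SLSQP")
    return beta_ols, result.x


# Synthetic, strictly positive response (illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = np.exp(0.3 * x) * rng.lognormal(0.0, 0.1, size=200)
X = np.column_stack([np.ones_like(x), x])

# Hypothetical out-of-sample points at which predictions must be non-negative.
x_new = np.linspace(0.0, 10.0, 21)
Z = np.column_stack([np.ones_like(x_new), x_new])

beta_ols, beta_icls = icls(X, y, Z)
print("min OLS prediction: ", (Z @ beta_ols).min())
print("min ICLS prediction:", (Z @ beta_icls).min())
```

Note that the constraint only guarantees non-negative predictions at the rows of `Z` supplied at fit time, so the prediction points must be known in advance, and the constrained fit can never have a smaller in-sample sum of squares than OLS.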