Does anybody know the meaning of average partial effects? What exactly are they, and how can I calculate them? Here is a reference that might help.

**Answer**

I don’t think there is a consensus on terminology here, but the following is what I think most people have in mind when someone says “average partial effect” or “average marginal effect”.

Suppose, for concreteness, that we are analyzing a population of people. Consider the linear model

Y=βX+U,

where (Y,X) are observed scalar random variables, and U is an unobserved scalar random variable. Suppose that β is an unknown constant. Suppose this is a structural model, meaning that it has a causal interpretation. So, if we could pick a person out of the population and increase their value of X by 1 unit, then their value of Y would increase by β. Then β is called the *marginal* or *causal* effect of X on Y.

Now, assuming that β is a constant means that no matter which person we pick out of the population, a one unit increase in X has the same effect on Y: it increases Y by β. This is clearly restrictive. We can relax this constant effect assumption by supposing that β is itself a random variable, so that each person has a different value of β. Consequently, there is an entire distribution of marginal effects, the distribution of β. The mean of this distribution, E(β), is called the *average marginal effect* (AME), or average partial effect. If we were to increase everyone’s value of X by one unit, then the average change in Y is given by the AME.
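
As a rough illustration of the last sentence, here is a minimal simulation sketch. The distributions of β, X and U, and the function name `simulate_random_coefficients`, are assumptions made up for this example. It increases everyone’s X by one unit, holds each person’s β and U fixed, and compares the average change in Y with the mean of β.

```python
import random
import statistics


def simulate_random_coefficients(n=100_000, seed=0):
    """Simulate Y = beta*X + U with a person-specific beta (hypothetical setup)."""
    rng = random.Random(seed)
    # Assumed distributions, chosen only for illustration.
    betas = [rng.gauss(2.0, 1.0) for _ in range(n)]  # each person's own effect
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    us = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Increase everyone's X by one unit, holding beta and U fixed.
    changes = [(b * (x + 1.0) + u) - (b * x + u) for b, x, u in zip(betas, xs, us)]

    return statistics.fmean(changes), statistics.fmean(betas)


avg_change, mean_beta = simulate_random_coefficients()
print(avg_change, mean_beta)  # the two numbers agree up to floating-point error
```

In real data β is not observed for each person, so this only shows what the AME measures, not how one would estimate it.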

Alternatively, consider the nonlinear model

Y=m(X,U),

where again (Y,X) are scalar observables and U is a scalar unobservable, and m is some unknown function (assume it is differentiable for simplicity). Here the causal/marginal effect of X on Y is ∂m(x,u)/∂x. This value may depend on the value of U. Thus, even if we look at people who all have the same observed value of X, a small increase in X will not necessarily increase Y by the same amount, because each person may have a different value of U. Thus there is a distribution of marginal effects, just as in the linear model above. And, again, we can look at the mean of this distribution:

$$E_{U \mid X}\left[\frac{\partial m(x,U)}{\partial x} \,\middle|\, X = x\right].$$

This mean is called the average marginal effect, given X=x. If we assume U is independent of X, as is sometimes done, then the AME at X=x is simply

$$E_{U}\left[\frac{\partial m(x,U)}{\partial x}\right].$$
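
To see what this expectation looks like numerically, here is a minimal Monte Carlo sketch under the independence assumption. The structural function m(x,u) = exp(x+u), the normal distribution for U, and the helper name `average_marginal_effect` are all hypothetical choices for illustration. The derivative is approximated by a central finite difference with U held fixed, then averaged over the draws of U.

```python
import math
import random


def m(x, u):
    """Hypothetical structural function, chosen only for illustration."""
    return math.exp(x + u)


def average_marginal_effect(m, x, u_draws, h=1e-4):
    """Average the derivative of m in x over draws of U, holding each U fixed."""
    effects = [(m(x + h, u) - m(x - h, u)) / (2.0 * h) for u in u_draws]
    return sum(effects) / len(effects)


rng = random.Random(0)
# Assumption: U ~ N(0, 0.5^2), drawn independently of X.
u_draws = [rng.gauss(0.0, 0.5) for _ in range(200_000)]

ame_at_1 = average_marginal_effect(m, 1.0, u_draws)
exact = math.exp(1.0 + 0.5 * 0.5**2)  # closed form for this particular m and U
print(ame_at_1, exact)
```

For this particular m and U the expectation has a closed form, which is why the simulated value can be checked against `exact`. With an unknown m, as in the text, neither quantity is directly available.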

In general, an average marginal effect is just a derivative (or sometimes a finite difference) of a structural function (such as m(x,u) or βx+u) with respect to an observed variable X, averaged over an unobserved variable U, perhaps within a particular subgroup of people with X=x. The precise form of this effect depends on the specific model under consideration.

Note that these objects might also be called average treatment effects, especially when considering a finite difference: for example, the difference of the structural function at X=1 (‘treated’) and at X=0 (‘untreated’), averaged over the unobservables.
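
A minimal sketch of the finite-difference case, assuming a binary X and a made-up structural function m(x,u) = u + x(1 + 0.5u) with U standard normal and independent of X (both are assumptions for illustration only):

```python
import random


def m(x, u):
    """Hypothetical structural function with a person-specific treatment effect."""
    return u + x * (1.0 + 0.5 * u)


rng = random.Random(0)
# Assumption: U ~ N(0, 1), independent of treatment status X.
u_draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Each person's effect is the finite difference m(1, u) - m(0, u), with u held fixed.
individual_effects = [m(1, u) - m(0, u) for u in u_draws]
ate = sum(individual_effects) / len(individual_effects)
print(ate)  # close to 1 + 0.5 * E(U) = 1 under these assumptions
```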

Finally, to be clear, note that when I refer to ‘distributions’ above, I mean distributions *over the population of people*. Each person in the population has a value of U, of X, and of Y. Hence there is a distribution of these values if I look over all people in the population. The thought experiment here is the following. Take all people with X=x. Now take one of these people, increase their X value by a small amount while keeping their U value the same, and write down the change in their Y value per unit change in X. We do this for each person with X=x, and then average the values. This is what it means to average over U∣X=x.
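
This thought experiment can be sketched as a simulation. Everything specific here is an assumption for illustration: X takes the values 0, 1, 2, U depends on X, and m(x,u) = x·u, so that the marginal effect for each person equals their U. Because U is not independent of X in this setup, the average within each group X=x differs across groups.

```python
import random
from collections import defaultdict


def m(x, u):
    """Hypothetical structural function; its derivative in x equals u."""
    return x * u


rng = random.Random(0)
h = 1e-6  # the "small amount" by which X is increased
effects_by_x = defaultdict(list)

for _ in range(150_000):
    x = rng.choice([0, 1, 2])
    u = 0.5 * x + rng.gauss(0.0, 1.0)  # assumption: U depends on X
    # Increase this person's X by h, keep their U fixed,
    # and record the change in Y per unit change in X.
    effects_by_x[x].append((m(x + h, u) - m(x, u)) / h)

# Average within each group X = x, i.e. average over U given X = x.
ame_by_x = {x: sum(v) / len(v) for x, v in sorted(effects_by_x.items())}
print(ame_by_x)  # values near 0.0, 0.5 and 1.0 for X = 0, 1, 2 in this setup
```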

**Attribution**

*Source: Link, Question Author: MarkDollar, Answer Author: kjetil b halvorsen*