I was hoping to understand what the smooth $L_1$ loss does, but I have not been able to find a good explanation online. I know that $L_1$ loss computes the absolute error, but what is the use of smooth $L_1$? Any answers would be helpful.

**Answer**

Smooth L1-loss can be interpreted as a combination of L1-loss and L2-loss. It behaves like L1-loss when the absolute value of the argument is large, and like L2-loss when the absolute value of the argument is close to zero. The equation is:

$L_{1;\text{smooth}}(x) = \begin{cases}
|x| & \text{if } |x| > \alpha; \\
\frac{1}{|\alpha|}x^2 & \text{if } |x| \leq \alpha
\end{cases}$

Here $\alpha$ is a hyper-parameter and is usually set to 1. The factor $\frac{1}{\alpha}$ multiplies the $x^2$ term so that the two pieces agree at $|x| = \alpha$, which makes the loss continuous.
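As a rough illustration, the piecewise formula above can be written directly in plain Python. This is only a sketch of the equation as stated in this answer, not any particular library's implementation; the function name `smooth_l1` and the default `alpha=1.0` are assumptions made for the example.

```python
def smooth_l1(x: float, alpha: float = 1.0) -> float:
    """Piecewise loss from the formula above (illustrative helper, not a library API).

    Returns |x| when |x| > alpha, and x^2 / alpha otherwise.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(x) > alpha:
        return abs(x)          # L1-like branch for large errors
    return x * x / alpha       # L2-like branch for small errors


if __name__ == "__main__":
    for x in (-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0):
        print(f"x = {x:5.1f}  loss = {smooth_l1(x):.4f}")
```

At $|x| = \alpha$ both branches give the value $\alpha$, which is the continuity property mentioned above.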

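Smooth L1-loss combines the advantages of L1-loss (a gradient of constant magnitude for large values of $|x|$, so large errors do not produce huge updates) and L2-loss (a gradient that shrinks as $x$ approaches zero, so there are fewer oscillations during updates when $x$ is small).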
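The gradient behaviour can be seen in a toy experiment. The sketch below runs plain gradient descent on a single scalar $x$ with a fixed step size, once with the L1 gradient and once with the gradient of the piecewise formula from this answer. The helper names, the learning rate `0.3`, and the starting point `0.2` are all arbitrary choices for illustration.

```python
def grad_l1(x: float) -> float:
    # Derivative of |x|; 0 is used at x == 0 as a subgradient choice.
    return (x > 0) - (x < 0)


def grad_smooth_l1(x: float, alpha: float = 1.0) -> float:
    # Derivative of the piecewise formula: sign(x) outside, 2x/alpha inside.
    if abs(x) > alpha:
        return (x > 0) - (x < 0)
    return 2.0 * x / alpha


def descend(grad, x0: float, lr: float = 0.3, steps: int = 6) -> list:
    # Fixed-step gradient descent on a scalar, returning the whole path.
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - lr * grad(xs[-1]))
    return xs


l1_path = descend(grad_l1, 0.2)
smooth_path = descend(grad_smooth_l1, 0.2)

if __name__ == "__main__":
    print("L1 path:       ", [round(v, 4) for v in l1_path])
    print("smooth L1 path:", [round(v, 4) for v in smooth_path])
```

With these settings the L1 iterate keeps jumping between roughly 0.2 and -0.1, because its gradient has magnitude 1 no matter how close $x$ is to zero, while the smooth L1 iterate shrinks towards zero at every step. For a large error such as $x = 100$, both gradients have magnitude 1, whereas a pure L2 gradient would grow with $x$.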

Huber loss is another form of smooth L1-loss, and it achieves the same effect. As given on Wikipedia, Huber loss is

$
L_\delta (a) = \begin{cases}
\frac{1}{2}{a^2} & \text{for } |a| \le \delta, \\
\delta (|a| - \frac{1}{2}\delta), & \text{otherwise.}
\end{cases}
$
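For comparison, here is a minimal sketch of the Huber formula in plain Python. The function name `huber` and the default `delta=1.0` are assumptions for the example, and this is not meant to mirror any specific library's signature.

```python
def huber(a: float, delta: float = 1.0) -> float:
    """Huber loss as written above (illustrative helper, not a library API)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    if abs(a) <= delta:
        return 0.5 * a * a                    # quadratic near zero
    return delta * (abs(a) - 0.5 * delta)     # linear for large |a|


if __name__ == "__main__":
    for a in (-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0):
        print(f"a = {a:5.1f}  loss = {huber(a):.4f}")
```

At $|a| = \delta$ both branches give $\frac{1}{2}\delta^2$, and both have slope $\delta$, so the Huber form joins its two pieces with matching value and matching gradient. The $\alpha$-form earlier in this answer matches only in value at $|x| = \alpha$ (its quadratic branch has slope 2 there, against 1 for the linear branch), so the two formulas have the same overall shape but are not numerically identical.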

**Attribution** *Source: Link, Question Author: Ryan, Answer Author: Gautam Sreekumar*