I am going to use KL divergence in my Python code, and I found this tutorial.

According to that tutorial, implementing KL divergence is quite simple:

`kl = (model * np.log(model/actual)).sum()`

As I understand it, the values of the probability distributions `model` and `actual` should each be <= 1. My question is: what is the maximum bound / maximum possible value of `kl`? I need to know the maximum possible value of the KL distance so I can use it as an upper bound in my code.
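For reference, here is a minimal, self-contained sketch of the snippet above, assuming `model` and `actual` are NumPy arrays holding normalized discrete distributions with strictly positive entries (the example values are purely illustrative):

```python
import numpy as np

# Hypothetical discrete distributions; both must sum to 1 and be strictly positive
model = np.array([0.1, 0.4, 0.5])
actual = np.array([0.8, 0.15, 0.05])

# As written in the tutorial, this computes KL(model || actual)
kl = (model * np.log(model / actual)).sum()
print(kl)  # non-negative, but with no universal upper bound
```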

**Answer**

The KL divergence has no finite upper bound: it can be infinite even when the two distributions have the same support, for instance when one distribution has a much fatter tail than the other. Take

$$\operatorname{KL}(P\,\|\,Q)=\int p(x)\log\left(\frac{p(x)}{q(x)}\right)\,\mathrm{d}x$$

when

$$p(x)=\overbrace{\frac{1}{\pi}\,\frac{1}{1+x^2}}^{\text{Cauchy density}}\qquad q(x)=\overbrace{\frac{1}{\sqrt{2\pi}}\exp\{-x^2/2\}}^{\text{Normal density}}$$

then

$$\operatorname{KL}(P\,\|\,Q)=\int \frac{1}{\pi}\,\frac{1}{1+x^2}\,\log p(x)\,\mathrm{d}x+\int \frac{1}{\pi}\,\frac{1}{1+x^2}\,\left[\log(2\pi)/2+x^2/2\right]\,\mathrm{d}x$$

and

$$\int \frac{1}{\pi}\,\frac{1}{1+x^2}\,\frac{x^2}{2}\,\mathrm{d}x=+\infty$$
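To see this blow-up numerically, here is a small sketch assuming SciPy is available (the truncation points `T` are chosen purely for illustration): truncating the divergent integral at ±T and letting T grow shows KL(Cauchy || Normal) increasing without bound.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import cauchy, norm

def kl_integrand(x):
    # p(x) * log(p(x)/q(x)) with p = standard Cauchy, q = standard Normal,
    # computed via log-densities to avoid underflow in the Normal tail
    logp = cauchy.logpdf(x)
    logq = norm.logpdf(x)
    return np.exp(logp) * (logp - logq)

# Truncating the integral at +/- T: the value keeps growing roughly like T/pi,
# reflecting that KL(Cauchy || Normal) = +infinity
for T in (10, 100, 1000):
    value, _ = quad(kl_integrand, -T, T)
    print(T, value)
```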

There exist other distances that remain bounded, such as

- the L¹ distance, equivalent to the total variation distance,
- the Wasserstein distances
- the Hellinger distance
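For contrast, here is a small sketch of two of the bounded distances mentioned above, assuming two discrete distributions stored as NumPy arrays: the total variation distance is at most 1, and the Hellinger distance (with the common 1/√2 normalization) is also at most 1, whatever the inputs.

```python
import numpy as np

def total_variation(p, q):
    # TV(P, Q) = 0.5 * sum |p_i - q_i|, always in [0, 1]
    return 0.5 * np.abs(p - q).sum()

def hellinger(p, q):
    # H(P, Q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2, always in [0, 1]
    return np.sqrt(((np.sqrt(p) - np.sqrt(q)) ** 2).sum()) / np.sqrt(2)

# Illustrative distributions, including a point where q puts zero mass
# while p does not, so KL(P||Q) would be infinite
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])
print(total_variation(p, q))  # 0.5
print(hellinger(p, q))        # ~0.707, still below the bound of 1
```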

**Attribution**
*Source: Link, Question Author: user46543, Answer Author: Xi'an*