Role of Dirac function in particle filters

Particle approximations to probability densities are often introduced as a weighted sum of Dirac functions

$$p(x) \approx \sum_{i=1}^N \omega^i \delta(x-x^i)$$

with the weights

$$\omega^i \propto \frac{p(x^i)}{q(x^i)}$$

normalized such that they sum to unity, where $q(\cdot)$ is the importance density. I understand that the Dirac function becomes infinitely large at a point $p$, that is $\delta(p) = \infty$, and that it is zero elsewhere, that is $\delta(x) = 0 ~\forall x \neq p$. Also, I understand that the Dirac function integrated over the mass point takes the value of unity.

My questions are:

  1. What is the relationship between the support of the particle approximation and the Dirac function?
  2. Why is a summation sign used when evaluating $\delta$ can only ever yield a value of 0 or infinity? Shouldn’t this be an integral instead?
  3. How can the notion of the support of a function be extended to a set of points (e.g., $x_t^{(i)}$), which isn’t itself a function?
  4. How can a representation of a probability density function arise from a weighted sum of $\delta(\cdot)$s that themselves take only values of either zero or infinity?

Thank you for any clarifications you may be able to provide.

Answer

@user20160 has already given a nice answer to your questions (1)-(3), but the last one does not seem to be fully answered yet.

  4. How can a representation of a probability density function arise from a weighted sum of $\delta(\cdot)$s that themselves take only values of either zero or infinity?

Let me start by quoting Wikipedia, as it provides a pretty clear description in this case:

The Dirac delta can be loosely thought of as a function on the real
line which is zero everywhere except at the origin, where it is
infinite,

$$\delta(x) = \begin{cases} +\infty, & x = 0 \\ 0, & x \ne 0
\end{cases}$$

and which is also constrained to satisfy the identity

$$\int_{-\infty}^\infty \delta(x) \, dx = 1$$

This is merely a heuristic characterization. The Dirac delta is not a
function in the traditional sense as no function defined on the real
numbers has these properties. The Dirac delta function can be
rigorously defined either as a distribution or as a measure.

Further on, Wikipedia provides a more formal definition and lots of worked examples, so I’d recommend you go through the whole article. Let me quote one example from it:

In probability theory and statistics, the Dirac delta function is
often used to represent a discrete distribution, or a partially
discrete, partially continuous distribution, using a probability
density function (which is normally used to represent fully continuous
distributions). For example, the probability density function $f(x)$
of a discrete distribution consisting of points $x = \{x_1, \dots,
x_n\}$, with corresponding probabilities $p_1, \dots, p_n$, can be
written as

$$ f(x) = \sum_{i=1}^n p_i \delta(x-x_i) $$

What this equation says is that we take a weighted sum of $n$ distributions $\delta_{x_i} = \delta(x-x_i)$, written in density notation, each of which has all of its mass at the single point $x_i$. If you try to picture a $\delta_{x_i}$ distribution in terms of its cumulative distribution function, it has to be

$$
F_{x_i}(x) =
\begin{cases}
0 & \text{if } x < x_i \\
1 & \text{if } x \ge x_i
\end{cases}
$$

So we can rewrite the previous density as a cumulative distribution function

$$ F(x) = \sum_{i=1}^n p_i F_{x_i}(x) = \sum_{i=1}^n p_i \mathbf{1}_{x \ge x_i} $$

where $\mathbf{1}_{x \ge x_i}$ is an indicator function that equals 1 when $x \ge x_i$ and 0 otherwise. Notice that this is basically a categorical distribution in disguise. Moreover, you can define the Dirac delta through its action on an arbitrary function $f$

$$ \int_{-\infty}^\infty f(x) \delta(x-x_i) dx = f(x_i) $$

so it “works” as a continuous version of the indicator function.
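This property is also what turns the integral into a sum in the particle approximation from the question. As a short worked step, assuming $f$ is an arbitrary well-behaved test function and the weights $\omega^i$ are already normalized, we can substitute the weighted sum of deltas into an expectation:

$$
\mathbb{E}_p[f(x)] = \int f(x)\, p(x)\, dx
\approx \int f(x) \sum_{i=1}^N \omega^i \delta(x-x^i)\, dx
= \sum_{i=1}^N \omega^i \int f(x)\, \delta(x-x^i)\, dx
= \sum_{i=1}^N \omega^i f(x^i)
$$

So the delta is never “evaluated” to zero or infinity. It only ever appears under an integral, where it picks out $f(x^i)$, and the summation sign simply runs over the particles.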

The take-away message is that the Dirac delta is not a standard function. It’s also not equal to infinity at zero: if it were, it would be useless, because infinity is not a number and we couldn’t perform any arithmetic operations on it. You can think of the Dirac delta simply as a continuous counterpart of an indicator function pointing at some $x_i$, one that integrates to unity. No black magic involved, it is just a way to hack the calculus to deal with discrete values.
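If it helps to see this numerically, here is a minimal Python sketch of the weighted-particle idea from the question. The specific choices are my own assumptions for illustration only: a standard normal target $p$, a wider normal $N(0, 2^2)$ as the importance density $q$, and the hypothetical helper names `particle_approximation`, `weighted_expectation` and `weighted_cdf`. Note that the code never evaluates a delta: expectations and the CDF are plain weighted sums over the particles.

```python
import numpy as np


def normal_pdf(x, mean, std):
    """Density of a normal distribution, written out to avoid extra dependencies."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def particle_approximation(n_particles, rng):
    """Draw particles from q and attach normalized importance weights p/q.

    Assumption for illustration: target p = N(0, 1), importance density q = N(0, 2^2).
    """
    particles = rng.normal(loc=0.0, scale=2.0, size=n_particles)
    unnormalized = normal_pdf(particles, 0.0, 1.0) / normal_pdf(particles, 0.0, 2.0)
    weights = unnormalized / unnormalized.sum()  # weights now sum to one
    return particles, weights


def weighted_expectation(f, particles, weights):
    """Integral of f against the sum of weighted deltas: sum_i w_i * f(x_i)."""
    return np.sum(weights * f(particles))


def weighted_cdf(x, particles, weights):
    """CDF of the particle approximation: sum_i w_i * 1[x_i <= x]."""
    return np.sum(weights * (particles <= x))


rng = np.random.default_rng(0)
particles, weights = particle_approximation(50_000, rng)

second_moment = weighted_expectation(lambda x: x ** 2, particles, weights)
cdf_at_zero = weighted_cdf(0.0, particles, weights)

print("sum of weights:", weights.sum())
print("estimate of E[x^2] under p:", second_moment)
print("estimate of P(x <= 0) under p:", cdf_at_zero)
```

With these assumed densities, the two estimates should land close to the true values for a standard normal (a second moment of 1 and a CDF of 0.5 at zero), even though every particle was drawn from $q$ rather than from $p$.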

Attribution
Source: Link, Question Author: Constantin, Answer Author: Tim
