In *Elements of Statistical Learning*, a problem is introduced to highlight issues with $k$-NN in high-dimensional spaces. There are $N$ data points uniformly distributed in a $p$-dimensional unit ball. The median distance from the origin to the closest data point is given by the expression:

$$d(p,N) = \left(1-\left(\frac{1}{2}\right)^\frac{1}{N}\right)^\frac{1}{p}$$

When $N=1$, the formula reduces to $d(p,1)=\left(\frac{1}{2}\right)^{1/p}$, and I can see how the closest point approaches the boundary as $p \rightarrow \infty$, thus making the intuition behind $k$-NN break down in high dimensions. But I can't grasp why the formula depends on $N$. Could someone please clarify?
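To make the dependence on $N$ concrete, here is a minimal Python sketch of the formula (the function name `d` is my own); for $p=10$ and $N=500$ it gives roughly $0.52$, the value the book quotes:

```python
def d(p, N):
    """Median distance from the origin to the nearest of N points
    drawn uniformly in a p-dimensional unit ball."""
    return (1 - 0.5 ** (1 / N)) ** (1 / p)

# For fixed p, more points means a nearer closest point:
print(d(10, 1), d(10, 100), d(10, 10000))   # decreasing in N

# For fixed N, the closest point moves towards the boundary as p grows:
print(d(2, 500), d(10, 500), d(100, 500))   # increasing in p; d(10, 500) is about 0.52
```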

Also, the book addresses this issue further by stating: "…prediction is much more difficult near the edges of the training sample. One must extrapolate from neighboring sample points rather than interpolate between them." This seems like a profound statement, but I can't grasp what it means. Could anyone reword it?
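One way to make the "edges" remark concrete (my own illustration, not from the book): in high dimensions almost all of a uniform sample sits in a thin shell near the boundary, because the fraction of the ball's volume within $\varepsilon$ of the surface is $1-(1-\varepsilon)^p$. A typical query point is therefore near the boundary itself, with its neighbors lying to one side of it (radially inward), so the fit must extrapolate rather than interpolate:

```python
def shell_fraction(p, eps=0.05):
    """Fraction of a p-dimensional unit ball's volume lying
    within eps of the boundary: 1 - (1 - eps)**p."""
    return 1 - (1 - eps) ** p

# The shell fraction approaches 1 as p grows:
print(shell_fraction(2), shell_fraction(10), shell_fraction(100))
```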

**Answer**

The volume of a $p$-dimensional ball of radius $r$ is proportional to $r^p$.

So for $0 \le k \le 1$, the proportion of the volume lying more than a distance $kr$ from the origin is $\frac{r^p-(kr)^p}{r^p}=1-k^p$; equivalently, this is the probability that a single uniformly chosen point is more than $kr$ from the origin.

The probability that all $N$ randomly chosen points are more than a distance $kr$ from the origin is $\left(1-k^p\right)^N$. To get the median distance to the nearest random point, set this probability equal to $\frac12$. So $$\left(1-k^p\right)^N=\tfrac12 $$ $$\implies k=\left(1-\tfrac1{2^{1/N}}\right)^{1/p}.$$

Intuitively this makes some sort of sense: the more random points there are, the closer you expect the nearest one to the origin to be, so you should expect $k$ to be a decreasing function of $N$. Here $2^{1/N}$ is a decreasing function of $N$, so $\tfrac1{2^{1/N}}$ is an increasing function of $N$, and thus $1-\tfrac1{2^{1/N}}$ is a decreasing function of $N$ as is its $p$th root.
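The derivation can be sanity-checked by simulation (a sketch with parameter choices of my own): since only the distance to the origin matters, the radius of a uniform point in the unit $p$-ball has CDF $P(R \le s) = s^p$, so radii can be drawn directly as $U^{1/p}$ with $U$ uniform on $[0,1]$:

```python
import numpy as np

def formula(p, N):
    # Median nearest-point distance from the answer above.
    return (1 - 0.5 ** (1 / N)) ** (1 / p)

def simulated_median(p, N, trials=4000, seed=0):
    """Median, over many trials, of the distance from the origin to
    the nearest of N uniform points in the unit p-ball. Directions are
    irrelevant to this distance, so radii are sampled as U**(1/p)."""
    rng = np.random.default_rng(seed)
    radii = rng.uniform(size=(trials, N)) ** (1.0 / p)
    return float(np.median(radii.min(axis=1)))

print(simulated_median(10, 500), formula(10, 500))  # should agree closely
print(simulated_median(3, 50), formula(3, 50))
```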

*Attribution: Source: Link, Question Author: user64773, Answer Author: Henry*