I am reading the Wikipedia article on statistical models here, and I am somewhat perplexed as to the meaning of “non-parametric statistical models”, specifically:

A statistical model is

nonparametricif the parameter set $\Theta$

is infinite dimensional. A statistical model issemiparametricif

it has both finite-dimensional and infinite-dimensional parameters.

Formally, if $d$ is the dimension of $\Theta$ and $n$ is the number of

samples, both semiparametric and nonparametric models have $d

\rightarrow \infty$ as $n \rightarrow \infty$. If $d/n \rightarrow 0$

as $n \rightarrow \infty$, then the model is semiparametric;

otherwise, the model is nonparametric.I get that if the

dimension, (I take that to literally mean, the number of parameters) of a model is finite, then this is a parametric model.What does not make sense to me, is how we can have a statistical model that has an

infinitenumber of parameters, such that we get to call it “non-parametric”. Furthermore, even if that was the case, why the “non-“, if in fact there are an infinite number of dimensions? Lastly, since I am coming at this from a machine-learning background, is there any difference between this “non-parametric statistical model” and say, “non-parametric machine learning models”? Finally, what might some concrete examples be of such “non-parametric infinite dimensional models” be?

**Answer**

As Johnnyboycurtis has answerd, non-parametric methods are those if it makes no assumption on the population distribution or sample size to generate a model.

A k-NN model is an example of a non-parametric model as it does not consider any assumptions to develop a model. A Naive Bayes or K-means is an example of parametric as it assumes a distribution for creating a model.

For instance, K-means assumes the following to develop a model

All clusters are spherical (i.i.d. Gaussian).

All axes have the same distribution and thus variance.

All clusters are evenly sized.

As for k-NN, it uses the complete training set for prediction. It calculates the nearest neighbors from the test point for prediction. It assumes no distribution for creating a model.

For more info:

**Attribution***Source : Link , Question Author : Creatron , Answer Author : Community*