# What are real life examples of “non-parametric statistical models”?

I am reading the Wikipedia article on statistical models here, and I am somewhat perplexed as to the meaning of “non-parametric statistical models”, specifically:

> A statistical model is nonparametric if the parameter set $\Theta$
> is infinite-dimensional. A statistical model is semiparametric if
> it has both finite-dimensional and infinite-dimensional parameters.
> Formally, if $d$ is the dimension of $\Theta$ and $n$ is the number of
> samples, both semiparametric and nonparametric models have $d \rightarrow \infty$ as $n \rightarrow \infty$. If $d/n \rightarrow 0$
> as $n \rightarrow \infty$, then the model is semiparametric;
> otherwise, the model is nonparametric.

I get that if the dimension (I take that to mean, literally, the number of parameters) of a model is finite, then it is a parametric model.

What does not make sense to me is how we can have a statistical model with an infinite number of parameters and still call it “non-parametric”. Moreover, if there really are infinitely many dimensions, why the “non-”? Also, since I am coming at this from a machine-learning background, is there any difference between such a “non-parametric statistical model” and, say, a “non-parametric machine learning model”? Finally, what might some concrete examples of such “non-parametric, infinite-dimensional models” be?

As Johnnyboycurtis has answered, non-parametric methods are those that make no assumptions about the population distribution or the sample size when generating a model.

A k-NN model is an example of a non-parametric model, as it makes no distributional assumptions in building a model. Naive Bayes and K-means are examples of parametric models, as each assumes a distribution when creating its model.

For instance, K-means assumes the following in order to develop a model:

- All clusters are spherical (i.i.d. Gaussian).
- All axes have the same distribution and thus the same variance.
- All clusters are evenly sized.
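To see the “parametric” side concretely, here is a minimal Lloyd's-algorithm sketch in plain NumPy (the function name `kmeans` and the optional `init` argument are my own, not from any library): however many points you feed it, the fitted model is nothing more than the $k$ centroid vectors — a fixed, finite parameter set.

```python
import numpy as np

def kmeans(X, k, n_iter=20, init=None, seed=0):
    """Lloyd's algorithm. The learned "model" is just the k centroids:
    its parameter count does not grow with the number of samples."""
    rng = np.random.default_rng(seed)
    if init is not None:
        centroids = np.asarray(init, dtype=float)
    else:
        centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # assign each point to its nearest centroid (squared Euclidean distance)
        labels = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1).argmin(1)
        # move each centroid to the mean of the points assigned to it
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels
```

Note that the spherical-cluster assumption is baked into the code: the assignment step uses plain Euclidean distance, which treats every direction and every cluster identically.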

k-NN, by contrast, uses the complete training set for prediction: it finds the nearest neighbors of the test point and predicts from them, assuming no distribution at all.
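A sketch of what “uses the complete training set” means in code (the helper name `knn_predict` is illustrative, not a library function): the “model” is just the stored data, so its effective size grows with $n$ — which is exactly the $d \rightarrow \infty$ behaviour in the quoted definition.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # The "model" is the training data itself; nothing is estimated up front.
    dists = np.linalg.norm(X_train - x, axis=1)  # distance to every stored point
    nearest = np.argsort(dists)[:k]              # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]  # majority vote
```

For example, with training points at 0, 0.1, 5, and 5.1 labelled 0, 0, 1, 1, a query near 0 is predicted as class 0 and a query near 5 as class 1 — purely from stored neighbors, with no fitted parameters.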