# Why does a function being smoother make it more likely?

I am currently studying the textbook *Gaussian Processes for Machine Learning* by Carl Edward Rasmussen and Christopher K. I. Williams. Chapter 1, Introduction, says the following:

Given this training data we wish to make predictions for new inputs $$\mathbf{\mathrm{x}_*}$$ that we have not seen in the training set. Thus it is clear that the problem at hand is inductive; we need to move from the finite training data $$\mathcal{D}$$ to a function $$f$$ that makes predictions for all possible input values. To do this we must make assumptions about the characteristics of the underlying function, as otherwise any function which is consistent with the training data would be equally valid. A wide variety of methods have been proposed to deal with the supervised learning problem; here we describe two common approaches. The first is to restrict the class of functions that we consider, for example by only considering linear functions of the input. The second approach is (speaking rather loosely) to give a prior probability to every possible function, where higher probabilities are given to functions that we consider to be more likely, for example because they are smoother than other functions.

My question is about the last sentence of this passage, which says that higher prior probabilities are given to functions that we consider to be more likely, "for example because they are smoother than other functions."

Why does a function being smoother make it more likely?
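
To make the quoted statement concrete, here is a minimal sketch of how I currently read it. Everything in it is my own assumption rather than something from the passage: a zero-mean Gaussian process prior with a squared-exponential kernel, a unit length scale, a small jitter term added for numerical stability, and two test functions I picked myself. The helper names `se_kernel` and `gp_log_prior` are hypothetical.

```python
import numpy as np


def se_kernel(x, length_scale=1.0, jitter=1e-6):
    """Squared-exponential covariance matrix on 1-D inputs x (assumed prior)."""
    d = x[:, None] - x[None, :]
    K = np.exp(-0.5 * (d / length_scale) ** 2)
    # The jitter keeps the matrix numerically positive definite.
    return K + jitter * np.eye(len(x))


def gp_log_prior(f, K):
    """Log density of function values f under a zero-mean Gaussian N(0, K)."""
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, f)          # alpha @ alpha equals f^T K^{-1} f
    log_det_half = np.log(np.diag(L)).sum()  # 0.5 * log det(K)
    return -0.5 * alpha @ alpha - log_det_half - 0.5 * len(f) * np.log(2 * np.pi)


x = np.linspace(0.0, 5.0, 20)
K = se_kernel(x)

smooth = np.sin(x)        # varies slowly relative to the length scale
wiggly = np.sin(8.0 * x)  # oscillates rapidly relative to the length scale

log_p_smooth = gp_log_prior(smooth, K)
log_p_wiggly = gp_log_prior(wiggly, K)

print("log prior density, smooth:", log_p_smooth)
print("log prior density, wiggly:", log_p_wiggly)
```

Under this particular prior the slowly varying function receives a higher log density than the rapidly oscillating one. As far as I can tell, though, that is a consequence of the kernel and length scale I chose, not a property of functions in general, which is exactly what my question is about.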