# Why are neural networks smooth functions?

I am reading Chapter 11 of The Elements of Statistical Learning and came across this sentence:

> "Unlike methods like CART and MARS, neural networks are smooth functions of real-valued parameters."

What is meant by ‘smooth functions’ here? I have come across things such as smoothing splines, but am unsure what a ‘smooth function’ means more generally.

Following on from the above, what makes neural networks specifically smooth functions?

A smooth function has continuous derivatives, up to some specified order. At the very least, this implies that the function is continuously differentiable (i.e. the first derivative exists everywhere and is continuous). More specifically, a function is $C^k$ smooth if the 1st through $k$th order derivatives exist everywhere and are continuous.
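
To make the distinction concrete, here is a minimal sketch (using JAX, which is my choice here and not something the book uses) contrasting a smooth activation, $\tanh$, with a non-smooth one, ReLU, near $x = 0$: the derivative of $\tanh$ approaches the same value from both sides, while the derivative of ReLU jumps, so ReLU is continuous but not $C^1$.

```python
import jax
import jax.numpy as jnp

def tanh(x):
    return jnp.tanh(x)

def relu(x):
    return jnp.maximum(0.0, x)

d_tanh = jax.grad(tanh)   # derivative of tanh
d_relu = jax.grad(relu)   # derivative of ReLU (undefined at exactly 0)

# Evaluate the derivatives just left and right of 0.
for x in (-1e-3, 1e-3):
    print(f"x = {x:+.3f}:  tanh' = {float(d_tanh(x)):.4f},  relu' = {float(d_relu(x)):.4f}")
# tanh' is ~1.0 on both sides (continuous), relu' jumps from 0.0 to 1.0 (not C^1).
```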
Neural nets can be written as compositions of elementary functions (typically affine transformations and nonlinear activation functions, but there are other possibilities). For example, in feedforward networks, each layer implements a function whose output is passed as input to the next layer. Since a composition of smooth functions is itself smooth, the network inherits the smoothness of its building blocks. Historically, neural nets have tended to be smooth because the elementary functions used to construct them were themselves smooth. In particular, nonlinear activation functions were typically chosen to be smooth, sigmoidal functions like $\tanh$ or the logistic sigmoid.
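
As a small illustration of "smooth functions of real-valued parameters", here is a sketch of a two-layer feedforward net built from affine maps and $\tanh$, again in JAX (my choice, not the book's; the layer sizes and values are made up). Because every building block is smooth, the output is differentiable in the parameters everywhere, which is what makes gradient-based training well behaved.

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    """Two-layer feedforward net: affine -> tanh -> affine."""
    W1, b1, W2, b2 = params
    h = jnp.tanh(W1 @ x + b1)    # smooth activation applied to a smooth (affine) map
    return jnp.sum(W2 @ h + b2)  # scalar output so we can take a gradient

# Hypothetical toy sizes and inputs, purely for illustration.
key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
params = (jax.random.normal(k1, (3, 2)), jnp.zeros(3),
          jax.random.normal(k2, (1, 3)), jnp.zeros(1))
x = jnp.array([0.5, -1.0])

# The gradient with respect to the parameters exists and varies continuously,
# because the network is a composition of smooth functions.
grads = jax.grad(mlp)(params, x)
print([g.shape for g in grads])
```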