# Why are derived features used in neural networks?

For example, suppose one wants to predict house prices and has two input features: the length and width of the house. Sometimes, one also includes ‘derived’ polynomial input features, such as the area, which is length * width.
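To make the setup concrete, here is a minimal sketch (with made-up measurements) of appending the derived area column to a raw feature matrix of lengths and widths:

```python
import numpy as np

# Hypothetical raw features: each row is one house, columns are length and width.
X = np.array([[10.0, 8.0],
              [12.0, 7.5],
              [ 9.0, 9.0]])

# Derived feature: area = length * width, appended as a third column.
area = (X[:, 0] * X[:, 1]).reshape(-1, 1)
X_derived = np.hstack([X, area])  # columns: length, width, area
```

The network then sees three inputs per house instead of two, with the product already computed for it.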

1) What is the point of including derived features? Shouldn’t a neural network learn the connection between length, width, and price during training? Why isn’t the third feature, area, redundant?

In addition, I sometimes see people run genetic selection algorithms on the input features to reduce their number.

2) What is the point of reducing the input features if they all contain useful information? Shouldn’t the neural network assign appropriate weights to each input feature according to its importance? What is the point of running genetic selection algorithms?

1): Including derived features is a way to inject expert knowledge into the training process, and thereby to accelerate it. For example, I work with physicists a lot in my research. When I’m building an optimization model, they’ll give me 3 or 4 parameters, but they usually also know certain functional forms that are supposed to appear in the equation. For example, I might get variables $n$ and $l$, but the expert knows that $n*l$ is important. By including it as a feature, I save the model the extra effort of discovering that $n*l$ is important. Granted, domain experts are sometimes wrong, but in my experience, they usually know what they’re talking about.
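In practice, such product (interaction) features don’t have to be written by hand; a common tool is scikit-learn’s `PolynomialFeatures`. The variables $n$ and $l$ and their sample values below are placeholders, not data from any real experiment:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical measurements of the two expert-supplied variables n and l.
X = np.array([[2.0, 3.0],
              [4.0, 0.5]])

# interaction_only=True adds only products of distinct inputs (here n*l);
# include_bias=False drops the constant column.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_aug = poly.fit_transform(X)  # columns: n, l, n*l
```

The augmented matrix hands the network the product directly, so it can weight $n*l$ immediately instead of first having to approximate the multiplication with its hidden layers.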