In most of the neural network examples I’ve seen so far, the network is used for classification and the nodes are transformed with a sigmoid function. However, I would like to use a neural network to output a continuous real value (realistically, the output would usually be in the range of -5 to +5).
My questions are:
1. Should I still scale the input features using feature scaling? What range?
2. What transformation function should I use in place of the sigmoid?
I’m looking to initially implement it in PyBrain, which describes these layer types.
So I’m thinking that I should start with 3 layers (an input, a hidden, and an output layer) that are all linear layers? Is that a reasonable way to do it? Or, alternatively, could I “stretch” the sigmoid function over the range -5 to 5?
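One thing worth noting about the all-linear idea: stacking linear layers with no nonlinearity in between collapses into a single linear map, so such a network can only represent linear functions of its inputs. A minimal NumPy sketch (the weight names here are mine, not PyBrain’s):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # input -> hidden weights
W2 = rng.standard_normal((3, 1))   # hidden -> output weights
x = rng.standard_normal(4)         # an arbitrary input vector

two_layer = (x @ W1) @ W2          # linear layer followed by linear layer
one_layer = x @ (W1 @ W2)          # a single equivalent linear layer

print(np.allclose(two_layer, one_layer))  # True
```

So if the function you want to learn is nonlinear, you still need a nonlinear hidden layer; only the output layer should be linear (or a rescaled sigmoid/tanh).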
1. Should I still scale the input features using feature scaling? What range?
Scaling does not make anything worse. Read this answer from Sarle’s neural network FAQ: Subject: Should I normalize/standardize/rescale the data?
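For a concrete example, here is a minimal sketch of one common choice from that FAQ, z-score standardization: subtract the training-set mean and divide by the training-set standard deviation, per feature, and reuse those same statistics for any later data.

```python
import numpy as np

# Toy training inputs: two features on very different scales.
X_train = np.array([[1.0, 200.0],
                    [2.0, 300.0],
                    [3.0, 400.0]])

# Compute statistics on the training set only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# Standardize: each feature now has mean ~0 and std ~1.
X_scaled = (X_train - mean) / std

print(X_scaled.mean(axis=0))  # ~[0, 0]
print(X_scaled.std(axis=0))   # ~[1, 1]
```

The exact range matters less than putting all features on a comparable scale, so no single input dominates the weighted sums early in training.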
2. What transformation function should I use in place of the sigmoid?
You could use the logistic sigmoid or tanh as the activation function; it doesn’t matter, and you don’t have to change the learning algorithm. You just have to scale the targets in your training set down to the range of the output layer’s activation function ([0,1] for the logistic sigmoid, [−1,1] for tanh), and once the network is trained, scale its outputs back up to [−5,5]. You really don’t have to change anything else.
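The target-scaling trick above can be sketched in a few lines, assuming tanh output units (range [−1,1]) and targets in [−5,5]:

```python
import numpy as np

# Original training targets in [-5, 5].
y = np.array([-5.0, -2.5, 0.0, 4.0, 5.0])

# Before training: map targets into the tanh output range [-1, 1].
y_scaled = y / 5.0

# After training: map the network's outputs back to [-5, 5].
# (Here we apply it to y_scaled itself, so the round trip is exact.)
y_back = y_scaled * 5.0

print(np.allclose(y_back, y))  # True
```

For the logistic sigmoid you would instead use `y_scaled = (y + 5) / 10` and invert with `y_back = 10 * out - 5`, mapping [−5,5] onto [0,1] and back.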