I’m using the nnet package in R to attempt to build an ANN to predict real estate prices for condos (personal project). I am new to this and don’t have a math background, so please bear with me.
I have input variables that are both binary and continuous. For example, some binary variables that were originally yes/no were converted to 1/0 for the neural net. Other variables, such as Sqft, are continuous.
I have normalized all values to a 0–1 scale. Maybe Bathrooms shouldn’t be normalized, since its range is only 0–4?
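For reference, the 0–1 normalization I’m applying is just min–max scaling, roughly like this (the data frame and values below are made up for illustration):

```r
# Min-max scaling: map each column onto [0, 1].
range01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Toy data standing in for my real condo data.
condos <- data.frame(Sqft      = c(450, 800, 1200, 2000),
                     Bathrooms = c(1, 1, 2, 4))

condos$Sqft.scaled      <- range01(condos$Sqft)
condos$Bathrooms.scaled <- range01(condos$Bathrooms)
```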
Do these mixed inputs present a problem for the ANN? I’ve gotten okay results, but upon closer examination the weights the ANN has chosen for certain variables don’t seem to make sense. My code is below, any suggestions?
ANN <- nnet(Price ~ Sqft + Bedrooms + Bathrooms + Parking2 + Elevator +
              Central.AC + Terrace + Washer.Dryer + Doorman +
              Exercise.Room + New.York.View,
            data = data[1:700, ], size = 3, maxit = 5000,
            linout = TRUE, decay = 0.0001)
Based on the comments below regarding breaking out the binary inputs into separate fields for each value class, my code now looks like:
ANN <- nnet(Price ~ Sqft + Studio + X1BR + X2BR + X3BR + X4BR +
              X1Bath + X2Bath + X3Bath + X4Bath +
              Parking.Yes + Parking.No + Elevator.Yes + Elevator.No +
              Central.AC.Yes + Central.AC.No + Terrace.Yes + Terrace.No +
              Washer.Dryer.Yes + Washer.Dryer.No + Doorman.Yes + Doorman.No +
              Exercise.Room.Yes + Exercise.Room.No +
              New.York.View.Yes + New.York.View.No +
              Health.Club.Yes + Health.Club.No,
            data[1:700, ], size = 12, maxit = 50000, decay = 0.0001)
The above code uses 12 hidden nodes, but I’ve tried a range of 3 to 25, and all give worse results than the parameters in the original code posted above. I’ve also tried it with linout = TRUE and linout = FALSE.
My guess is that I need to feed the data to nnet in a different way because it’s not interpreting the binary input properly. Either that, or I need to give it different parameters.
One way to handle this situation is to rescale the inputs so that their variances are on roughly the same scale. This advice is generally given for regression modeling, but it really applies to all modeling situations that involve variables measured on different scales. This is because the variance of a binary variable is often quite different from the variance of a continuous variable. Gelman and Hill (2006) recommend rescaling continuous inputs by two standard deviations to obtain parity with (un-scaled) binary inputs. This recommendation is also reflected in a paper and blog post.
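As a concrete sketch of the Gelman and Hill recommendation: center each continuous input and divide by two standard deviations, while leaving the 0/1 binary inputs alone (the variable below is toy data, not from the question):

```r
# Rescale a continuous input by two standard deviations (Gelman & Hill).
# The result has mean 0 and standard deviation 0.5, putting its spread
# on roughly the same scale as a 0/1 indicator with p near 0.5.
rescale2sd <- function(x) (x - mean(x)) / (2 * sd(x))

sqft <- c(450, 800, 1200, 2000)   # illustrative continuous input
sqft.rescaled <- rescale2sd(sqft)
```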
A more specific recommendation for neural networks is to use “effect coding” for binary inputs (that is, -1 and 1) instead of “dummy coding” (0 and 1), and to take the additional step of centering continuous variables. These recommendations come from an extensive FAQ by Warren Sarle, in particular the sections “Why not code binary inputs as 0 and 1?” and “Should I standardize the input variables?” The gist, though, is the same:
The contribution of an input will depend heavily on its variability relative to other inputs.
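In code, Sarle’s two suggestions amount to a pair of one-line transformations (variable names here are illustrative, not from the question’s data):

```r
# Effect coding: map a 0/1 dummy onto -1/1.
elevator <- c(0, 1, 1, 0)             # dummy-coded binary input
elevator.effect <- 2 * elevator - 1   # now -1 / 1

# Centering: subtract the mean from a continuous input.
sqft <- c(450, 800, 1200, 2000)
sqft.centered <- sqft - mean(sqft)    # mean-zero continuous input
```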
As for unordered categorical variables: you must break them out into binary indicators, since the category codes are not meaningful to the network otherwise.
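One way to do that expansion in base R is model.matrix(), which turns a factor into one indicator column per level (the factor below is a made-up example, not a variable from the question):

```r
# Expand an unordered categorical into binary indicator columns.
# The "- 1" drops the intercept so every level gets its own column.
heating <- factor(c("gas", "oil", "electric", "gas"))
indicators <- model.matrix(~ heating - 1)
# One column per level; each row has exactly one 1.
```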