I’m a beginner trying to put together my first project. I have a song classification project in mind, but since I would be labeling manually, I could only reasonably assemble about 1,000 songs, or roughly 60 hours of music.
I would be classifying into several classes, so one class might have as few as 50–100 songs in the training set, which seems like too few! Is there a general rule of thumb for how much data is needed to give a neural network a shot at working?
Edit: I was thinking of using a vanilla LSTM. The input features have dimension 39, the output has dimension 6, and my first attempt at the hidden-layer dimension would be 100.
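For concreteness, here is a minimal sketch of such a model in PyTorch, using the dimensions from the edit (39-dim input features, hidden size 100, 6 classes). The class name, sequence length, and batch size are illustrative choices, not anything specified in the question:

```python
import torch
import torch.nn as nn

class SongClassifier(nn.Module):
    """Vanilla LSTM classifier: 39-dim frame features -> 6 classes."""
    def __init__(self, input_dim=39, hidden_dim=100, num_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        # x: (batch, time, 39); use the last hidden state to summarize the song
        _, (h_n, _) = self.lstm(x)      # h_n: (1, batch, 100)
        return self.fc(h_n[-1])         # logits: (batch, 6)

model = SongClassifier()
logits = model(torch.randn(4, 250, 39))  # 4 clips, 250 frames each
print(logits.shape)                      # torch.Size([4, 6])
```

With so few examples per class, a small hidden size and strong regularization (dropout, early stopping) would be the usual first line of defense against overfitting.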
It really depends on your dataset and network architecture. One rule of thumb I have read (2) is a few thousand samples per class before the neural network starts to perform very well.
In practice, people try it and see. It’s not rare to find studies reporting decent results with a training set smaller than 1,000 samples.
A good way to roughly assess whether more training samples would help is to plot the performance of the neural network against the size of the training set; see, e.g., the learning curves in (1).
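A quick sketch of that idea with scikit-learn's `learning_curve`, using synthetic stand-in data and a logistic regression in place of the LSTM purely to illustrate the method (the data, model, and sizes here are all placeholder assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Toy stand-in data: 1,000 "songs" with 39 features and 6 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 39))
y = rng.integers(0, 6, size=1000)

# Train on growing subsets and record cross-validated performance.
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

print(sizes)                     # training-set sizes actually used
print(val_scores.mean(axis=1))   # mean validation accuracy per size
```

If the validation curve is still climbing at the largest training size, collecting more labeled songs would likely pay off; if it has flattened, the bottleneck is probably elsewhere (features, architecture, label noise).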
- (1) Dernoncourt, Franck, Ji Young Lee, Ozlem Uzuner, and Peter Szolovits. “De-identification of Patient Notes with Recurrent Neural Networks.” arXiv preprint arXiv:1606.03475 (2016).
- (2) Cireşan, Dan C., Ueli Meier, and Jürgen Schmidhuber. “Transfer learning for Latin and Chinese characters with deep neural networks.” In The 2012 International Joint Conference on Neural Networks (IJCNN), pp. 1–6. IEEE, 2012. https://scholar.google.com/scholar?cluster=7452424507909578812&hl=en&as_sdt=0,22 ; http://people.idsia.ch/~ciresan/data/ijcnn2012_v9.pdf:
For classification tasks with a few thousand samples per class, the benefit of (unsupervised or supervised) pretraining is not easy to demonstrate.