I was reading the FaceNet paper and in the 3rd paragraph of the introduction it says:
Previous face recognition approaches based on deep networks
use a classification layer trained over a set of
known face identities and then take an intermediate bottleneck
layer as a representation used to generalize recognition
beyond the set of identities used in training.
What do they mean by an "intermediate bottleneck layer"?
A bottleneck layer is a layer that contains fewer nodes than the layers before it. It can be used to obtain a representation of the input with reduced dimensionality. An example of this is the use of autoencoders with bottleneck layers for nonlinear dimensionality reduction.
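As a rough sketch of the idea (the layer sizes and random weights here are arbitrary, purely for illustration; a real autoencoder would learn its weights by minimizing reconstruction error), a bottleneck autoencoder maps a high-dimensional input down to a small code and back:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy autoencoder: 64-dim input -> 8-dim bottleneck -> 64-dim reconstruction.
# Weights are random stand-ins; in practice they are learned.
W_enc = rng.normal(size=(64, 8))
W_dec = rng.normal(size=(8, 64))

def encode(x):
    # The bottleneck activations form an 8-dim representation of a 64-dim input.
    return np.tanh(x @ W_enc)

def decode(z):
    return z @ W_dec

x = rng.normal(size=(5, 64))   # batch of 5 inputs
z = encode(x)
x_hat = decode(z)
print(z.shape)      # (5, 8)  -- reduced-dimensionality representation
print(x_hat.shape)  # (5, 64)
```

The 8-dimensional vector z is the "representation with reduced dimensionality" that the bottleneck provides.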
My understanding of the quote is that previous approaches use a deep network to classify faces. They then take the first several layers of this network, from the input up to some intermediate layer (say, the k-th layer, containing n_k nodes). This subnetwork implements a mapping from the input space to an n_k-dimensional vector space. The k-th layer is a bottleneck layer, so the vector of activations of nodes in the k-th layer gives a lower-dimensional representation of the input.

The original network can't be used to classify new identities, on which it wasn't trained. But the k-th layer may provide a good representation of faces in general. So, to learn new identities, new classifier layers can be stacked on top of the k-th layer and trained. Or, the new training data can be fed through the subnetwork to obtain representations from the k-th layer, and these representations can be fed to some other classifier.
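The second option above (feed new data through the frozen subnetwork, then classify the resulting representations) can be sketched as follows. Everything here is hypothetical: the "subnetwork" is a fixed random two-layer mapping standing in for the first k layers of a pretrained network, the identity names are made up, and the downstream classifier is a simple nearest-centroid rule rather than anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the first k layers of a pretrained face network: a fixed
# mapping from a 64-dim "image" to an 8-dim bottleneck representation.
# (Random weights here; in the real setting they come from training a
# classifier over a set of known identities.)
W1 = rng.normal(size=(64, 32))
Wk = rng.normal(size=(32, 8))

def embed(x):
    # Activations of the k-th (bottleneck) layer for a batch of inputs.
    return np.tanh(np.tanh(x @ W1) @ Wk)

# "New" identities the original classifier never saw: enroll each one by
# averaging the bottleneck representations of a few example images.
templates = {name: rng.normal(size=(3, 64)) for name in ["alice", "bob"]}
centroids = {name: embed(x).mean(axis=0) for name, x in templates.items()}

def identify(x):
    # Classify a single input by nearest centroid in representation space.
    z = embed(x[None, :])[0]
    return min(centroids, key=lambda n: np.linalg.norm(z - centroids[n]))

# Query: a slightly perturbed copy of one of alice's enrollment images.
query = templates["alice"][0] + 0.01 * rng.normal(size=64)
print(identify(query))  # prints the nearest-centroid match for the query
```

The point is only the shape of the workflow: the classification layer trained on the original identities is discarded, and the bottleneck representations are reused for identities outside the training set.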