Reading here, I found:

“If the discriminator wins by too big a margin, then the generator can’t learn as the discriminator error is too small.”

This is something I have read elsewhere as well, but I don’t really get it. If the discriminator has a low loss, then when I give it a fake sample (with the “fake” label) it outputs a low score with high confidence (assuming its output is the “probability of real”), so I can see why the gradient of the error would be small.

When I train the generator, I pass the same fake image, but with the “real” label. In this case, I would expect the gradient of the error to be large, since we are basically telling the discriminator that it is making a mistake (and a big one, if the discriminator loss was low). This gradient is the one that flows back to the generator for training.

**Answer**

You might find the answer in the paper “Towards Principled Methods for Training Generative Adversarial Networks” (https://arxiv.org/pdf/1701.04862.pdf). It has a section explaining why the generator’s gradient vanishes as the discriminator gets stronger.
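
As a rough numerical illustration (my own sketch, not code from the paper), the snippet below compares two common generator losses on a single fake sample, assuming the discriminator outputs `sigmoid(a)` for some logit `a`. The function names are hypothetical. It prints the derivative of each loss with respect to the logit as the discriminator becomes more confident that the sample is fake (the logit becomes more negative).

```python
import math

def sigmoid(a):
    """Discriminator output: assumed to be the probability that the input is real."""
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(a):
    """d/da of log(1 - sigmoid(a)), the original minimax generator loss."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da of -log(sigmoid(a)), i.e. cross-entropy on a fake sample with the 'real' label."""
    return sigmoid(a) - 1.0

# More negative logit = discriminator more confident the sample is fake.
for a in (0.0, -2.0, -5.0, -10.0):
    print(f"logit={a:6.1f}  D={sigmoid(a):.5f}  "
          f"saturating={grad_saturating(a):+.5f}  "
          f"non_saturating={grad_non_saturating(a):+.5f}")
```

With the minimax loss, the derivative at the logit shrinks toward zero as the discriminator grows confident. With the “real”-label loss described in the question, it stays close to -1 instead. Note that this only covers the derivative at the discriminator's output. The gradient that reaches the generator is also multiplied by the derivative of the logit with respect to the generated image, and the behaviour of that full gradient for a near-optimal discriminator is what the linked paper analyses.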

**Attribution**

*Source: Link, Question Author: rand, Answer Author: kangzheng*