Apparently, a learning algorithm must make a trade-off between bias and variance when producing a hypothesis. Bias is the systematic deviation of the hypothesis from the data, i.e. the error that remains no matter which training set is used. Variance is the error caused by the hypothesis fluctuating when it is learned from different training sets.
Why must there be a trade-off here?
It comes from decomposing the expected error into two terms (plus irreducible noise) that act as “two opposing forces”. To reduce the bias error, your model has to consider more possibilities for fitting the data, but that extra flexibility in turn increases the variance error. It works the other way around too: if your model fits too closely (it starts to fit noise, which you can see as non-systematic variation in your individual samples), then you need to force your parameters not to vary too wildly, for example through regularization, and that introduces bias.
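For squared error, the decomposition can be written out explicitly. This is a sketch under the usual textbook assumptions, and none of the symbols come from the question: the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat f_D$ is the hypothesis learned from training set $D$, and the noise at the test point is independent of $D$.

$$
\mathbb{E}_{D,\varepsilon}\!\left[\big(y - \hat f_D(x)\big)^2\right]
= \underbrace{\big(\mathbb{E}_D[\hat f_D(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\!\left[\big(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2\right]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{noise}}
$$

The first term does not shrink by averaging over more training sets, the second measures how much the hypothesis moves from one training set to another, and the third is out of the learner's control.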
In more intuitive terms: bias error is being systematically wrong, and variance error is about learning all tiny, accidental variations of the samples.
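To see both kinds of error in numbers, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the sine target, the noise level, the fixed training inputs, the polynomial degrees, and the helper name `bias_variance`. It fits a rigid model (a line) and a flexible model (a degree-9 polynomial) to many noisy training sets and estimates the squared bias and the variance of each.

```python
import numpy as np

def bias_variance(degree, n_sets=200, n_points=20, noise_sd=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit of the given degree.

    Hypothetical setup: true function sin(2*pi*x) on [0, 1], fixed training
    inputs, and fresh Gaussian noise for every simulated training set.
    """
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_points)
    x_test = np.linspace(0.0, 1.0, 50)
    f_train = np.sin(2 * np.pi * x_train)
    f_test = np.sin(2 * np.pi * x_test)

    # One row of predictions per simulated training set.
    preds = np.empty((n_sets, x_test.size))
    for i in range(n_sets):
        y_train = f_train + rng.normal(0.0, noise_sd, n_points)
        coeffs = np.polyfit(x_train, y_train, degree)
        preds[i] = np.polyval(coeffs, x_test)

    mean_pred = preds.mean(axis=0)
    # Squared bias: how far the average hypothesis is from the truth.
    bias_sq = np.mean((mean_pred - f_test) ** 2)
    # Variance: how much individual hypotheses scatter around their average.
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance

for degree in (1, 9):
    b, v = bias_variance(degree)
    print(f"degree {degree}: bias^2 = {b:.4f}, variance = {v:.4f}")
```

In this setup the line should come out systematically wrong (large squared bias, small variance), while the degree-9 polynomial should track the truth on average but change noticeably from one training set to the next (small squared bias, larger variance).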
Take a look at this nice article for details.