# Use of nested cross-validation

Scikit Learn’s page on Model Selection mentions the use of nested cross-validation:

```python
>>> clf = GridSearchCV(estimator=svc, param_grid=dict(gamma=gammas),
...                    n_jobs=-1)
>>> cross_validation.cross_val_score(clf, X_digits, y_digits)
```


> Two cross-validation loops are performed in parallel: one by the `GridSearchCV` estimator to set `gamma` and the other one by `cross_val_score` to measure the prediction performance of the estimator. The resulting scores are unbiased estimates of the prediction score on new data.

From what I understand, `clf.fit` already uses cross-validation internally to determine the best `gamma`. If so, why do we need the nested CV shown above? The note says nested CV produces "unbiased estimates" of the prediction score. Isn't that also the case with `clf.fit`?

Also, I was unable to retrieve the best estimates (`clf.best_params_`) from the `cross_validation.cross_val_score(clf, X_digits, y_digits)` procedure. Could you please advise how that can be done?

Nested cross-validation is used to avoid the optimistically biased estimates of performance that result from using the same cross-validation both to set the values of the hyper-parameters of the model (e.g. the regularisation parameter, $C$, and the kernel parameters of an SVM) and to estimate performance. I wrote a paper on this topic after being rather alarmed by the magnitude of the bias introduced by a seemingly benign shortcut often used in the evaluation of kernel machines.

I originally investigated the topic to discover why my results were worse than those of other research groups using similar methods on the same datasets; the reason turned out to be that I *was* using nested cross-validation, and hence didn't benefit from the optimistic bias.
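To make this concrete, here is a minimal sketch of nested cross-validation using scikit-learn's modern `sklearn.model_selection` module (which replaced the old `cross_validation` module in the question). The dataset, grid of `gamma` values, and fold counts are illustrative assumptions; the key point is that `return_estimator=True` lets you recover each outer fold's fitted `GridSearchCV`, and hence the per-fold best hyper-parameters.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Inner loop: GridSearchCV tunes gamma on each outer training partition.
# (The gamma grid here is an arbitrary illustrative choice.)
param_grid = {"gamma": [1e-4, 1e-3, 1e-2]}
clf = GridSearchCV(SVC(), param_grid, cv=3)

# Outer loop: cross_validate scores the tuned estimator on held-out folds.
# return_estimator=True keeps each fold's fitted GridSearchCV so the
# selected hyper-parameters can be inspected afterwards.
results = cross_validate(clf, X, y, cv=5, return_estimator=True)

for fold, est in enumerate(results["estimator"]):
    print(f"fold {fold}: best params = {est.best_params_}")
print("outer-loop scores:", results["test_score"])
```

Note that the best parameters may differ from fold to fold: the outer scores estimate the performance of the whole *procedure* (tuning plus fitting), not of one fixed hyper-parameter setting.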