K-fold vs. leave-one-out cross-validation
Both k-fold and leave-one-out cross-validation (LOOCV) are performed on the training set to estimate how the trained model will perform
on unseen data.
K-fold cross-validation has a single parameter, k, which sets the number of groups a given data sample is split into (commonly k = 10).
K-fold CV splits the data randomly into k groups and, for each unique group, trains a model with that group held out and all the other groups
used as training data. The model is evaluated on the held-out group, the evaluation score is retained, and the model is discarded. The performance
of the model is then calculated from all k evaluation scores. Each data point is therefore used exactly once in the hold-out set and k-1 times
in the training set.
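The procedure above can be sketched with scikit-learn's KFold (a minimal sketch; the toy regression dataset and ridge model are illustrative choices, not part of the original text):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Toy regression data, purely for illustration.
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

kf = KFold(n_splits=10, shuffle=True, random_state=0)  # k = 10
scores = []
for train_idx, test_idx in kf.split(X):
    model = Ridge()                        # fit a fresh model per fold
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])      # evaluate on the held-out group
    scores.append(mean_squared_error(y[test_idx], pred))

# Overall performance estimate: the mean of the k fold scores.
cv_estimate = np.mean(scores)
```

Each of the 10 iterations holds out a different group, so every data point appears in exactly one test fold.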
LOOCV works the same way as k-fold CV with k = n, where n is the number of data points.
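That equivalence is visible in scikit-learn's LeaveOneOut splitter, which produces one split per data point (a small illustrative sketch):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.arange(10).reshape(-1, 1)  # n = 10 toy samples

loo = LeaveOneOut()
splits = list(loo.split(X))

# One fold per data point: each test set holds exactly one sample,
# i.e. the same folds KFold(n_splits=len(X)) would produce.
assert len(splits) == len(X)
assert all(len(test_idx) == 1 for _, test_idx in splits)
```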
Comparing k-fold and LOOCV
Bias
K-fold CV gives a pessimistically biased estimate of performance because it limits the size of the training data, and most statistical models
improve as the training set grows. K-fold CV estimates the performance of a model trained on 100*(k-1)/k% of the available data, rather than
on 100% of it. So if cross-validation is used to estimate performance, a model trained on 100% of the data for operational use will perform
slightly better than the cross-validation estimate suggests.
Leave-one-out cross-validation is approximately unbiased, because the training set used in each fold differs from the entire dataset by
only a single data point.
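To make the bias argument concrete, the fraction of data used for training in each fold is (k-1)/k, which approaches 1 as k approaches n (a quick illustration; the dataset size of 100 is arbitrary):

```python
def training_fraction(k):
    """Fraction of the data used for training in each fold of k-fold CV."""
    return (k - 1) / k

n = 100                        # illustrative dataset size
print(training_fraction(10))   # k = 10  -> 0.9  (90% of the data per fold)
print(training_fraction(n))    # LOOCV (k = n) -> 0.99 (99% of the data per fold)
```

With k = 10 each model sees 90% of the data, while LOOCV on 100 points sees 99%, which is why its bias is much smaller.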
Variance
LOOCV tends to have high variance: very different estimates are obtained if the procedure is repeated with different initial samples of data from
the same distribution, whereas k-fold CV has relatively lower variance.
However, with a small dataset the variance in fitting the model tends to be higher, as the fit is more sensitive to any noise or sampling artifacts
in the particular training sample used. This means that k-fold cross-validation is likely to have high variance when only a limited amount of data
is available, since each fold's training set is smaller than in LOOCV.
Computation
LOOCV is more computationally intensive than k-fold CV, since it requires fitting n models rather than k.
As the error of the estimator is a combination of bias and variance, whether leave-one-out cross-validation is better than k-fold cross-validation
depends on both quantities (and also on the computational budget of the project). For very small datasets, LOOCV is preferred.
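The computational gap comes down to the number of model fits: k fits for k-fold CV versus n fits for LOOCV. A sketch using cross_val_score on a toy dataset (the dataset, model, and MSE scoring are illustrative choices):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

# Toy data: n = 200 samples, purely for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)

# k-fold: fits k models (here 10).
kfold_scores = cross_val_score(
    Ridge(), X, y, cv=KFold(n_splits=10), scoring="neg_mean_squared_error"
)

# LOOCV: fits n models (here 200), one per held-out point.
loo_scores = cross_val_score(
    Ridge(), X, y, cv=LeaveOneOut(), scoring="neg_mean_squared_error"
)

print(len(kfold_scores), len(loo_scores))  # 10 vs. 200 model fits
```

Per-sample MSE is used as the scoring metric because single-sample test folds make R-squared undefined; the 20x difference in fit count is what makes LOOCV costly on larger datasets.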
Source: https://stats.stackexchange.com/