Answer: I think you're confused! Ignore the second dimension for a moment. When you have 45000 points and you use 10-fold cross-validation, what's the size of each fold? 45000/10, i.e. 4500.

It means that each of your folds will contain 4500 data points, and one of those folds will be used for testing and the remaining ones for training, i.e.

For testing: one fold => 4500 data points => size: 4500

For training: remaining folds => 45000-4500 data points => size: 45000-4500=40500

Thus, for the first iteration (assuming the splitter does not shuffle), the first 4500 data points (corresponding to indices) will be used for testing and the rest for training. (See the image below.)

Given your data is x_train: torch.Size([45000, 784]) and y_train: torch.Size([45000]), this is what your code should look like:

    for train_index, test_index in kfold.split(x_train, y_train):
        print(train_index, test_index)
        x_train_fold = x_train[train_index]
        y_train_fold = y_train[train_index]
        x_test_fold = x_train[test_i
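
If you want to check the fold arithmetic yourself, here is a minimal sketch in plain Python. It is not the library's implementation: `contiguous_folds` is a hypothetical helper written only for illustration, and it assumes an unshuffled split where the number of samples divides evenly by the number of folds (as with 45000 points and 10 folds).

```python
def contiguous_folds(n_samples, n_splits):
    """Yield (train_indices, test_indices) for an unshuffled k-fold split.

    Hypothetical helper for illustration only; assumes n_samples is
    evenly divisible by n_splits.
    """
    fold_size = n_samples // n_splits
    for k in range(n_splits):
        start, stop = k * fold_size, (k + 1) * fold_size
        # The k-th contiguous block of indices is held out for testing.
        test_idx = list(range(start, stop))
        # Everything outside [start, stop) is used for training.
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx


folds = list(contiguous_folds(45000, 10))
first_train, first_test = folds[0]

print(len(first_train), len(first_test))  # 40500 4500
print(first_test[0], first_test[-1])      # 0 4499
```

Each iteration holds out a different block of 4500 indices, so over the 10 iterations every data point is used for testing exactly once.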