Question:
After splitting the dataset with `sklearn.cross_validation.KFold` I have 6 chunks (3 train, 3 test, plus the responses for them). Is there a function that lets me train the algorithm by just feeding it all the chunks, or do I have to keep writing:
Vasya = model.fit(chank1, answer1)
a1 = Vasya.predict(chank_t_1)
?
Answer:
Cross-validation is built into sklearn. If you need to evaluate a model across the folds produced by KFold, the easiest way is cross_val_score or cross_val_predict:
- cross_val_score(model, X, y, cv=n) returns one score per fold;
- cross_val_predict(model, X, y, cv=n) returns out-of-fold predictions for every sample in X.
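A minimal sketch of both calls, using a synthetic dataset and a placeholder estimator in place of your own chunks and model (note: in modern scikit-learn these helpers live in sklearn.model_selection; the old sklearn.cross_validation module was removed):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, cross_val_score

# Synthetic data standing in for your X / y ("chunks" and "answers")
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# One score per fold: no manual fit/predict loop needed
scores = cross_val_score(model, X, y, cv=3)
print(scores.shape)  # (3,)

# Out-of-fold predictions for every sample in X
preds = cross_val_predict(model, X, y, cv=3)
print(preds.shape)  # (300,)
```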
But usually hyperparameters are selected by cross-validation; for that there is GridSearchCV, which takes a grid of parameters and picks the best combination by itself.
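A sketch of that workflow, again with synthetic data and an assumed estimator (the parameter grid here is just an illustration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Every combination in the grid is cross-validated automatically
param_grid = {"C": [0.01, 0.1, 1, 10]}
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=3,
    n_jobs=-1,  # run the folds/combinations on all CPU cores
)
search.fit(X, y)
print(search.best_params_)  # the winning combination
print(search.best_score_)   # its mean cross-validated score
```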
NB: all these functions take an n_jobs parameter, so there is no need to write the loops by hand; n_jobs=-1 will occupy the machine for the duration of the run by loading all the cores, a kind of parallelism that is not so easy to get in plain Python.