A good data scientist is not the one who knows all the fancy algorithms, but the one who knows when they are overfitting. We have all been through the moment when our super awesome, fully tuned model failed to live up to expectations on the Kaggle private leaderboard or after deployment. Knowing how to get an unbiased estimate of a model's predictive power is therefore important. Validation strategies such as holdout and cross-validation are commonly used in practice for this, but which strategy is appropriate in which scenario needs more discussion and thought.
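To make the two strategies concrete, here is a minimal sketch of how holdout and k-fold cross-validation partition a dataset. It uses only the Python standard library. The function names `holdout_split` and `kfold_splits` and their parameters are hypothetical, chosen for illustration rather than taken from any library, and the sketch assumes the samples are independent so that a plain shuffle is acceptable.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle the indices once and reserve a single fraction for evaluation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    # The first n_test shuffled indices form the test set, the rest the training set.
    return indices[n_test:], indices[:n_test]


def kfold_splits(n_samples, k=5, seed=0):
    """Yield (train, test) index lists so that each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    train, test = holdout_split(10, test_fraction=0.2)
    print("holdout:", len(train), "train /", len(test), "test")
    for fold_id, (train, test) in enumerate(kfold_splits(10, k=5)):
        print("fold", fold_id, "test indices:", sorted(test))
```

Holdout evaluates the model once on a single reserved slice, so the estimate depends on which samples happen to land in that slice. K-fold trains and evaluates k times, with every sample serving as test data exactly once, which costs more compute but averages out some of that dependence.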