Anscombe’s quartet demonstrates an important idea: we need to visualize data before analyzing it. The quartet consists of four hypothetical datasets, each containing eleven (x, y) data points.
Although all four datasets have nearly identical descriptive statistics (mean, variance, correlation, and regression line), they look very different when graphed.
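To make the claim concrete, here is a minimal sketch that computes those summary statistics for each dataset using only the Python standard library. The values are the ones commonly reproduced from Anscombe's 1973 paper; the `summarize` helper and the `QUARTET` dictionary are names made up for this illustration, not part of any library.

```python
from statistics import mean, variance

# Commonly reproduced values of Anscombe's quartet (datasets I-III share x).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, Pearson correlation, and the
    least-squares regression line (slope, intercept) of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)   # sum of squared deviations in x
    syy = sum((b - my) ** 2 for b in y)   # sum of squared deviations in y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(x),
        "var_y": variance(y),
        "corr": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }


summaries = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals, the four rows agree on almost every statistic, which is exactly why the numbers alone cannot distinguish the datasets. Only a scatter plot of each one shows how differently they are distributed.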