[Figure: training error vs. test error (new data) as model complexity, measured in degrees of freedom, increases]
Error Metrics
Training Error (45%): how well the model fits the data it has already seen.
Generalization Error (50%): how well the model predicts the NEXT, unseen data.
Underfitting: the model is too simple to capture the underlying trend.
Builder Note
Overfitting is the #1 killer of real-world ML. Adding parameters isn't always better. Always hold out test data that the model never sees during training, so you can measure true performance.
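The holdout idea can be sketched in a few lines of plain Python. This is an illustrative helper (the function name and 80/20 split are assumptions, not from the text): shuffle the data, then reserve a fraction the model never trains on.

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle, then hold out a fraction the model never sees during training."""
    rng = random.Random(seed)           # fixed seed for a reproducible split
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

data = list(range(10))
train, test = train_test_split(data)
print(len(train), len(test))  # 8 2
```

The key property is that the two sets are disjoint: every point is used either for fitting or for evaluation, never both.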
The Overfitting Trap
When a model is too complex, it starts memorizing the noise (random fluctuations) in the training data rather than the actual signal. This makes it look like a genius on the training set but fail on new data.
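This trap can be demonstrated with a minimal pure-Python sketch. The data generator and both models here are illustrative assumptions: the true signal is y = 2x plus noise, the "complex" model memorizes every training point (a nearest-neighbor lookup), and the "simple" model fits a single slope.

```python
import random

random.seed(0)

# Toy data: the true signal is y = 2x, plus Gaussian noise.
def make_data(n):
    xs = [random.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + random.gauss(0, 1)) for x in xs]

train = make_data(20)
test = make_data(20)

# "Complex" model: memorizes every training point (nearest-neighbor lookup).
def memorizer(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

# "Simple" model: a single slope fit by least squares through the origin.
slope = sum(x * y for x, y in train) / sum(x * x for x, y in train)
def simple(x):
    return slope * x

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

print(mse(memorizer, train))  # exactly 0.0: it "knows" the training data perfectly
print(mse(memorizer, test))   # positive: the memorized noise does not transfer
print(mse(simple, test))      # the simpler model typically generalizes better
```

The memorizer scores a perfect zero on its own training set, which is exactly the misleading "genius" behavior described above; only the held-out test set exposes it.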
Builder Strategy
To fight overfitting, use Regularization (penalizing complexity) or Early Stopping. The most important rule: never judge your model's final performance on its training data.
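Regularization can be shown in its simplest form: an L2 (ridge) penalty on a one-parameter linear model. This is a minimal sketch under assumed squared loss, minimizing sum((y - w*x)^2) + lam * w^2, which has a closed-form solution; the function name and toy data are illustrative.

```python
# L2-regularized (ridge) slope for a one-parameter model y = w * x.
def ridge_slope(data, lam):
    sxx = sum(x * x for x, y in data)
    sxy = sum(x * y for x, y in data)
    # Setting the derivative of the penalized loss to zero gives:
    # w * (sum(x^2) + lam) = sum(x*y). Larger lam shrinks w toward 0.
    return sxy / (sxx + lam)

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
print(ridge_slope(data, 0.0))   # ordinary least-squares slope
print(ridge_slope(data, 10.0))  # penalized slope, pulled toward zero
```

The penalty trades a little training error for a less extreme model; early stopping achieves a similar effect by halting training once held-out validation error stops improving.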