In machine learning, we say a model has converged when further training does not lead to significant improvement in performance. Convergence usually refers to the point where the optimization algorithm (like gradient descent) has reached a stable solution.
Key indicators of convergence:
Loss stabilization:
- During training, the value of the loss function (e.g., MSE, cross-entropy) keeps decreasing.
- Convergence occurs when the loss stops decreasing significantly over successive iterations/epochs. Graphically, the loss curve flattens out.
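A flatness check like this can be sketched as follows; the function name, tolerance, and patience window here are illustrative choices, not a standard API:

```python
def has_converged(losses, tol=1e-4, patience=3):
    """Return True if the last `patience` successive changes in the loss
    are all smaller than `tol` (i.e., the loss curve has flattened)."""
    if len(losses) < patience + 1:
        return False
    # Epoch-over-epoch decreases for the last `patience` steps.
    deltas = [losses[i - 1] - losses[i]
              for i in range(len(losses) - patience, len(losses))]
    return all(abs(d) < tol for d in deltas)

# A loss curve that flattens out:
print(has_converged([1.0, 0.5, 0.3, 0.25, 0.24999, 0.24998, 0.249975]))  # True
# A loss curve still decreasing steeply:
print(has_converged([1.0, 0.8, 0.6, 0.4, 0.2, 0.1, 0.05]))  # False
```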
Accuracy or performance plateau:
- On a validation set, metrics like accuracy, F1-score, or RMSE no longer improve meaningfully.
- Improvement below a threshold (e.g., <0.01% per epoch) can signal convergence.
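The per-epoch threshold test can be expressed directly; a relative threshold of `1e-4` corresponds to the 0.01% figure above, and the helper name is illustrative:

```python
def plateaued(metric_history, rel_threshold=1e-4):
    """True if the latest epoch-over-epoch relative improvement of a
    higher-is-better metric (e.g. validation accuracy) falls below
    `rel_threshold` (1e-4 = 0.01% improvement per epoch)."""
    if len(metric_history) < 2:
        return False
    prev, curr = metric_history[-2], metric_history[-1]
    return (curr - prev) / max(abs(prev), 1e-12) < rel_threshold

print(plateaued([0.80, 0.88, 0.91, 0.91005]))  # True: ~0.005% improvement
print(plateaued([0.70, 0.80]))                 # False: 14% improvement
```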
Gradients approach zero (for gradient-based methods):
- The weight updates become very small because the gradient of the loss w.r.t. parameters is near zero.
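This is easy to see on a toy objective. The sketch below runs plain gradient descent on f(w) = (w − 3)² and stops once the gradient magnitude is near zero; the learning rate and tolerance are arbitrary choices for illustration:

```python
def minimize(lr=0.1, eps=1e-6, max_steps=10_000):
    """Gradient descent on f(w) = (w - 3)**2, stopping when the
    gradient magnitude drops below `eps` (a convergence criterion)."""
    w = 0.0
    for step in range(max_steps):
        grad = 2 * (w - 3)       # f'(w)
        if abs(grad) < eps:      # gradient ~ 0: weight updates are negligible
            return w, step
        w -= lr * grad           # the update shrinks as grad -> 0
    return w, max_steps

w, steps = minimize()
print(w)  # ~3.0, the minimizer of f
```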
Early stopping criteria:
- We explicitly define convergence by stopping training when the validation loss doesn’t improve for n consecutive epochs (the “patience”).
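A minimal early-stopping loop looks like this; the function operates on a recorded list of validation losses for simplicity, and the name and defaults are illustrative:

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch at which training stops: the first epoch after
    `patience` consecutive epochs with no improvement in validation loss.
    Returns the last epoch if the patience budget is never exhausted."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss       # new best: reset the patience counter
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch  # stop: no improvement for `patience` epochs
    return len(val_losses) - 1

# Validation loss improves for 3 epochs, then stalls for 3 in a row:
print(early_stop_epoch([1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.6]))  # 5
```

Note that the later improvement at epoch 6 is never seen: early stopping trades a little potential gain for a large saving in compute and a guard against overfitting.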
Practical convergence:
- Sometimes we accept “good enough” convergence if the model reaches satisfactory performance even if it hasn’t perfectly minimized the loss.
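In code, this is simply a target-based stopping rule layered on top of the criteria above; the 0.95 target here is a hypothetical "good enough" bar, not a general recommendation:

```python
TARGET_ACC = 0.95  # hypothetical acceptance threshold for this task

def good_enough(val_accuracy, target=TARGET_ACC):
    """Practical convergence: stop as soon as validation accuracy meets
    the target, even if the loss could still be driven lower."""
    return val_accuracy >= target

print(good_enough(0.96))  # True: stop training here
print(good_enough(0.90))  # False: keep going
```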