Starting with Machine Learning on devices like MCU - Part 7 (when do we say a model has converged?)

In machine learning, we say a model has converged when further training no longer leads to significant improvement in performance. Convergence usually refers to the point where the optimization algorithm (such as gradient descent) has settled into a stable solution, i.e., the parameters stop changing meaningfully between updates.

Key indicators of convergence:

Loss stabilization:

  • During training, the value of the loss function (e.g., MSE, cross-entropy) keeps decreasing.
  • Convergence occurs when the loss stops decreasing significantly over successive iterations/epochs. Graphically, the loss curve flattens out, as in the sketch after this list.
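
To make this concrete, here is a minimal, framework-agnostic Python sketch of such a plateau check. The function name loss_has_plateaued and the window/tol values are illustrative assumptions, not part of any particular library:

    # Sketch: detect when the loss curve has flattened.
    # `window` and `tol` are illustrative values, not fixed rules.
    def loss_has_plateaued(loss_history, window=5, tol=1e-3):
        """True if the average per-epoch improvement over the last
        `window` epochs falls below `tol`."""
        if len(loss_history) < window + 1:
            return False  # not enough history to judge yet
        recent = loss_history[-(window + 1):]
        avg_improvement = (recent[0] - recent[-1]) / window
        return avg_improvement < tol

    # Example: a training-loss series with an almost-flat tail
    losses = [0.90, 0.55, 0.41, 0.3750, 0.3745, 0.3742, 0.3741, 0.3740, 0.3740]
    print(loss_has_plateaued(losses))  # True: the curve has flattened

Calling this once per epoch on the accumulated loss values gives a simple, explicit convergence signal.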

Accuracy or performance plateau:

  • On a validation set, metrics like accuracy, F1-score, or RMSE no longer improve meaningfully.
  • An improvement below a small threshold (e.g., <0.01% per epoch) can signal convergence; see the sketch below.
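
One way to encode that threshold rule is a relative-improvement check, sketched below in plain Python. The rel_tol default mirrors the 0.01% example above; the function name is a hypothetical helper:

    # Sketch: flag convergence when the relative gain of the latest
    # epoch drops below a threshold (0.01% = 1e-4, as in the example).
    def improvement_below_threshold(metric_history, rel_tol=1e-4):
        if len(metric_history) < 2:
            return False
        prev, curr = metric_history[-2], metric_history[-1]
        return (curr - prev) / max(abs(prev), 1e-12) < rel_tol

    val_accuracy = [0.710, 0.840, 0.902, 0.9110, 0.91105]
    print(improvement_below_threshold(val_accuracy))  # True: gain < 0.01%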

Gradients approach zero (for gradient-based methods):

  • The weight updates become very small because the gradient of the loss with respect to the parameters is near zero; see the sketch below.
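
One way to monitor this is to track the global gradient norm each batch. Below is a sketch using TensorFlow; model, loss_fn, x_batch, and y_batch are assumed to exist elsewhere in your training code:

    # Sketch: compute the global L2 norm of the gradients for one batch.
    # `model`, `loss_fn`, `x_batch`, `y_batch` are assumed to be defined.
    import tensorflow as tf

    def global_gradient_norm(model, loss_fn, x_batch, y_batch):
        with tf.GradientTape() as tape:
            predictions = model(x_batch, training=True)
            loss = loss_fn(y_batch, predictions)
        grads = tape.gradient(loss, model.trainable_variables)
        # Single scalar norm over all parameter gradients
        return tf.linalg.global_norm(grads)

    # If this norm stays tiny (e.g., below 1e-3) across many batches,
    # the weight updates have become negligible.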

Early stopping criteria:

  • We can define convergence explicitly by stopping training when the validation loss doesn’t improve for n consecutive epochs; see the example below.
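
With Keras this is a built-in callback. A minimal setup might look like the following (model, x_train, and y_train are assumed to be defined; the patience, min_delta, and epochs values are illustrative):

    # Stop when val_loss hasn't improved for `patience` epochs and
    # roll back to the best weights seen so far.
    import tensorflow as tf

    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",
        patience=5,            # n consecutive epochs without improvement
        min_delta=1e-4,        # smaller changes don't count as improvement
        restore_best_weights=True,
    )

    history = model.fit(
        x_train, y_train,
        validation_split=0.2,
        epochs=200,            # upper bound; training usually stops earlier
        callbacks=[early_stop],
    )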

Practical convergence:

  • Sometimes we accept “good enough” convergence: the model reaches satisfactory performance even though the loss has not been perfectly minimized. See the sketch below.
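
One hedged way to express this idea is a custom Keras callback that stops training once a target metric is reached. The class name StopAtTarget and the 0.95 target are illustrative assumptions:

    # Sketch: stop training as soon as validation accuracy reaches a
    # target, even if the loss could still be driven lower.
    import tensorflow as tf

    class StopAtTarget(tf.keras.callbacks.Callback):
        def __init__(self, target=0.95):  # illustrative target value
            super().__init__()
            self.target = target

        def on_epoch_end(self, epoch, logs=None):
            if logs and logs.get("val_accuracy", 0.0) >= self.target:
                print(f"Target reached at epoch {epoch + 1}; stopping.")
                self.model.stop_training = True

    # Usage: model.fit(..., callbacks=[StopAtTarget(0.95)])

On constrained targets like MCUs, this pragmatic criterion is often the one that matters: once the model meets the accuracy budget, further training buys little.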
