Exploding gradients: gradients (error signals) grow extremely large during backpropagation, producing huge, unstable weight updates that cause loss spikes, divergence, and ultimately training failure.
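A minimal NumPy sketch of the effect, under assumed illustrative values: repeatedly applying the chain rule through layers whose weights amplify the signal (here a hypothetical 4-unit chain with weight scale 1.5 over 50 layers) makes the gradient norm blow up exponentially, and global-norm clipping (the common mitigation) rescales it back under a threshold:

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their combined L2 norm <= max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # Scale down uniformly only when the norm exceeds the threshold.
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [g * scale for g in grads], total_norm

# Simulate backprop through a deep chain: each layer multiplies the
# incoming error signal by a weight matrix whose scale exceeds 1,
# so the gradient magnitude grows exponentially with depth.
# (Layer count and scale are illustrative assumptions, not from the source.)
rng = np.random.default_rng(0)
grad = rng.normal(size=(4,))
for _ in range(50):
    W = 1.5 * np.eye(4)      # amplifying layer weights
    grad = W.T @ grad        # chain rule step: gradient explodes

clipped, norm_before = clip_by_global_norm([grad], max_norm=1.0)
norm_after = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
print(norm_before > 1e6)     # the unclipped gradient norm has exploded
print(norm_after <= 1.0 + 1e-9)  # after clipping it is bounded by max_norm
```

The same idea underlies `torch.nn.utils.clip_grad_norm_` in PyTorch; clipping does not fix the underlying instability (poor initialization, too-high learning rate), but it prevents a single oversized update from destroying the weights.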