A problem in training deep neural networks in which the gradients used to update the weights become extremely small as they are propagated backward through the layers, effectively preventing the earlier layers from learning. It occurs most often in networks with many layers because backpropagation multiplies many small derivatives together (for example, those of saturating activations such as the sigmoid), so the gradient shrinks roughly exponentially with depth. The result is negligible weight updates in the early layers, which stalls the learning process.
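The effect can be seen directly by measuring the gradient norm layer by layer. The following is a minimal NumPy sketch, not taken from any particular framework: the depth, layer width, weight scale, and sigmoid activation are illustrative assumptions chosen to make the shrinkage visible.

```python
# Illustrative sketch: gradient norms shrink as we backpropagate through
# a deep stack of sigmoid layers. All sizes and scales are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers = 20   # deliberately deep stack (assumed)
width = 64      # units per layer (assumed)

# Random weights with a modest scale so the forward pass stays well behaved.
weights = [rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
           for _ in range(n_layers)]

# Forward pass: keep the pre-activations so we can compute local derivatives.
activations = [rng.normal(size=width)]
pre_acts = []
for W in weights:
    z = W @ activations[-1]
    pre_acts.append(z)
    activations.append(sigmoid(z))

# Backward pass: start from a unit gradient at the output and push it back.
grad = np.ones(width)
for layer, (W, z) in enumerate(zip(reversed(weights), reversed(pre_acts))):
    s = sigmoid(z)
    # sigmoid'(z) = s * (1 - s) <= 0.25, so each layer damps the signal.
    grad = W.T @ (grad * s * (1.0 - s))
    print(f"layers back: {layer + 1:2d}  gradient norm: {np.linalg.norm(grad):.3e}")
```

Running this prints a gradient norm that falls by several orders of magnitude over the 20 layers, which is exactly the situation the definition describes: the layers farthest from the output receive updates too small to learn from.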