Stochastic Gradient Descent (SGD) is an iterative method for optimizing an objective function, typically used to train machine learning models. At each step it updates the model's parameters by moving in the direction of steepest descent, that is, along the negative of the gradient. Unlike traditional (batch) gradient descent, which computes the gradient over the entire dataset, SGD estimates it from a randomly selected subset of the data (often a single example or a small mini-batch) at each step. Each update is therefore much cheaper to compute, which makes SGD faster and more scalable, especially for large datasets.
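The following is a minimal sketch of this idea for a linear model with a mean-squared-error loss, assuming mini-batch updates; the synthetic data, parameter names such as learning_rate and batch_size, and the choice of 20 epochs are illustrative, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 2 plus a little noise (purely illustrative).
X = rng.uniform(-1.0, 1.0, size=(1000, 1))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.standard_normal(1000)

w, b = 0.0, 0.0            # model parameters
learning_rate = 0.1        # step size (assumed value)
batch_size = 32            # mini-batch size (assumed value)

for epoch in range(20):
    # Randomly select a subset (mini-batch) of the data at each step,
    # instead of computing the gradient over the entire dataset.
    indices = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]
        xb, yb = X[batch, 0], y[batch]

        # Gradient of the mean-squared error on this mini-batch only.
        error = w * xb + b - yb
        grad_w = 2.0 * np.mean(error * xb)
        grad_b = 2.0 * np.mean(error)

        # Move in the direction of steepest descent (negative gradient).
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(f"learned w = {w:.2f}, b = {b:.2f}")  # expect roughly w = 3, b = 2
```

Because each update touches only batch_size examples rather than all 1000, the cost per step stays constant as the dataset grows; the gradient estimate is noisier than the full-batch gradient, but averaged over many steps it still drives the parameters toward a minimum.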