An optimization algorithm that iteratively adjusts parameters to minimize a loss function by moving in the direction of steepest descent.
Computes the gradient of the loss function with respect to the parameters
Updates parameters in the opposite direction of the gradient
Learning rate controls the step size: too large can overshoot or diverge, too small slows convergence
Converges toward a local minimum (not necessarily the global one) over iterations, provided the learning rate is suitably small
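
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names gradient_descent and example_grad, the quadratic example loss, and the default learning rate and iteration count are all assumptions chosen for the example. The update applied at each step is params = params - learning_rate * gradient.

```python
from typing import Callable, List, Sequence


def gradient_descent(
    grad: Callable[[Sequence[float]], List[float]],
    init_params: Sequence[float],
    learning_rate: float = 0.1,   # assumed default, tune per problem
    num_iterations: int = 100,    # assumed default, fixed iteration budget
) -> List[float]:
    """Run plain gradient descent and return the final parameters."""
    params = list(init_params)
    for _ in range(num_iterations):
        # Compute the gradient of the loss at the current parameters.
        g = grad(params)
        # Step in the opposite direction of the gradient,
        # scaled by the learning rate.
        params = [p - learning_rate * g_i for p, g_i in zip(params, g)]
    return params


def example_grad(params: Sequence[float]) -> List[float]:
    """Gradient of the hypothetical loss (x - 3)^2 + (y + 1)^2."""
    x, y = params
    return [2.0 * (x - 3.0), 2.0 * (y + 1.0)]


if __name__ == "__main__":
    # The example loss has its single minimum at (3, -1).
    print(gradient_descent(example_grad, [0.0, 0.0], learning_rate=0.1,
                           num_iterations=200))
```

For this particular quadratic, each step multiplies the distance to the minimum by |1 - 2 * learning_rate|, so the iterates approach (3, -1) for learning rates between 0 and 1 and diverge for learning rates above 1. Losses with several minima behave differently: the result then depends on the starting point.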