- A method for finding the minimum or maximum of a function by moving step by
step in the positive or negative direction, as determined by the sign of the
derivative (the slope of the function at the current point)
- Process / workflow:
- goal: keep descending until the loss is minimized. Stop the process
when the loss starts increasing.
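- start from a randomly chosen parameter value and calculate the slope (derivative) of the loss at that value
- if the slope is negative, move the parameter in the positive direction by adding a small number
- if the slope is positive, move the parameter in the negative direction by subtracting a small number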
- The size of the number added or subtracted is determined by a parameter called the
learning rate
- learning rate too small = hardly any improvement in each iteration, so many iterations are needed
- learning rate too large = a step can overshoot the minimum, and the loss may go up instead of down
- recalculate the loss after each move to see which iteration gives the least loss
(the minimum loss, ideally the global extremum; on a non-convex loss it may only be a local one)
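
The workflow above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the parabola `loss`, its derivative `slope`, and the helper `descend` are hypothetical names chosen for the example. One assumption goes beyond the notes: the step is the learning rate multiplied by the slope, which is the common form of the update, so the sign of the slope decides whether a number is added or subtracted.

```python
import random


def loss(w):
    # Hypothetical example loss: a parabola with its minimum at w = 3.
    return (w - 3.0) ** 2


def slope(w):
    # Derivative of the example loss with respect to w.
    return 2.0 * (w - 3.0)


def descend(start, learning_rate, max_steps=1000):
    """Move against the slope until the loss stops decreasing.

    Returns the final parameter, its loss, and the number of moves taken.
    """
    w = start
    best_loss = loss(w)
    steps = 0
    for _ in range(max_steps):
        # Negative slope -> a positive number is added to w;
        # positive slope -> a positive number is subtracted from w.
        candidate = w - learning_rate * slope(w)
        candidate_loss = loss(candidate)
        if candidate_loss >= best_loss:
            break  # the loss stopped improving, so stop the process
        w, best_loss = candidate, candidate_loss
        steps += 1
    return w, best_loss, steps


if __name__ == "__main__":
    random.seed(0)  # fixed seed so the random start is repeatable
    start = random.uniform(-10.0, 10.0)
    for lr in (0.0001, 0.1, 1.1):
        w, final_loss, steps = descend(start, lr)
        print(f"lr={lr}: w={w:.4f}, loss={final_loss:.6f}, moves={steps}")
```

Running the script with the three learning rates shows the trade-off from the notes: the very small rate is still far from the minimum when the step budget runs out, the moderate rate settles near `w = 3`, and the large rate overshoots on its first move, so the loop stops immediately.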