This repository is about gradient descent. Linear regression is implemented from scratch, which makes it a useful exercise for understanding the types of gradient descent.
These types (batch, mini-batch, and stochastic) are compared in terms of running time and cost function behavior.

In batch gradient descent, all of the data is used when calculating the cost. Therefore, the cost function decreases smoothly.
It runs slower than the other types, but it is the type most likely to converge to a local minimum.
It is also the type that needs the most memory, because the entire dataset must be held in RAM at once.
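As a minimal sketch of this idea (not the repository's actual code), a batch gradient descent loop for linear regression could look like the following; the function name, learning rate `lr`, and iteration count `n_iters` are illustrative assumptions:

```python
import numpy as np

def batch_gradient_descent(X, y, lr=0.01, n_iters=1000):
    """Fit y ~ X @ w + b, using the FULL dataset for every update."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        error = X @ w + b - y  # predictions on all samples at once
        # gradient of the mean squared error cost over the whole dataset
        w -= lr * (2 / n_samples) * (X.T @ error)
        b -= lr * (2 / n_samples) * error.sum()
    return w, b
```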
In mini-batch gradient descent, the data is divided into batches, and the batch size is generally a power of two.
The cost function is calculated using one batch at a time, so less data is needed for each cost calculation.
As a result, it is faster and needs less memory, because the whole dataset does not have to be loaded into RAM; only one batch is loaded at a time.
The cost function can be wavy, but it generally approaches a local minimum.
It is the most common type, especially for tasks that would otherwise need more memory than is available.
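A minimal sketch of a mini-batch loop, under the same assumptions as above (`X` and `y` as NumPy arrays; `lr`, `n_epochs`, and `batch_size` are illustrative placeholders):

```python
import numpy as np

def minibatch_gradient_descent(X, y, lr=0.01, n_epochs=100, batch_size=32):
    """Fit y ~ X @ w + b, using one mini-batch per update (batch_size a power of two)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_epochs):
        order = np.random.permutation(n_samples)  # shuffle each epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_b, y_b = X[idx], y[idx]  # only this batch is needed in memory
            error = X_b @ w + b - y_b
            w -= lr * (2 / len(idx)) * (X_b.T @ error)
            b -= lr * (2 / len(idx)) * error.sum()
    return w, b
```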
In stochastic gradient descent, the cost function is calculated using just one sample at a time.
For that reason, it may never settle exactly at a local minimum: the cost function keeps decreasing and increasing.
However, this type needs the least memory, so it can still be used for some tasks.
It also loses the speedup of vectorized operations such as np.dot(), which makes it slower than mini-batch gradient descent.
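A minimal sketch of this variant, under the same assumptions as the previous examples:

```python
import numpy as np

def stochastic_gradient_descent(X, y, lr=0.01, n_epochs=100):
    """Fit y ~ X @ w + b, updating the parameters one sample at a time."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_epochs):
        for i in np.random.permutation(n_samples):
            error = np.dot(X[i], w) + b - y[i]  # scalar error for a single sample
            w -= lr * 2 * error * X[i]          # per-sample update: no big matrix products
            b -= lr * 2 * error
    return w, b
```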