It has a single-sample-based stochastic gradient descent algorithm, and a mini-batch-based one. The second one can have better performance, i.e., test accuracy, with less training iterations, if tuned properly.
The algorithms recognize MNIST with test accuracy above 97%.