When Fisher Scoring diverges

Fisher scoring can diverge when fitting L1-penalized logistic regression models, even if the dataset is not perfectly separable.
This is because Fisher scoring is a Newton-type method: each step minimizes a quadratic approximation of the loss, and that approximation is accurate only locally.
As a result, the iterates can diverge when the starting point is too far from the optimum.
This is typically what happens on the synthetic example datasets showcased here.
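
To make the locality argument concrete, here is a minimal sketch on a toy problem. It is not one of the datasets from the repository, and it makes simplifying assumptions: a single feature, no intercept, and no penalty term. The function name `fisher_scoring_1d` and the `blowup` threshold are hypothetical and chosen for illustration only. The two observations share the same feature value but have opposite labels, so the data is not separable and the optimum is w = 0.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fisher_scoring_1d(xs, ys, w0, n_steps=20, blowup=30.0):
    """Run Fisher scoring on 1-D unpenalized logistic regression.

    Returns the list of iterates, starting with w0. Iteration stops early
    once |w| exceeds `blowup` (a hypothetical threshold for this sketch).
    """
    w = w0
    path = [w]
    for _ in range(n_steps):
        # Score: gradient of the log-likelihood with respect to w.
        score = sum(x * (y - sigmoid(w * x)) for x, y in zip(xs, ys))
        # Fisher information; p * (1 - p) is computed as sigmoid(z) * sigmoid(-z)
        # so that it does not round to zero for moderately large |z|.
        info = sum(x * x * sigmoid(w * x) * sigmoid(-w * x) for x in xs)
        w = w + score / info
        path.append(w)
        if abs(w) > blowup:
            break
    return path


# Same feature value, opposite labels: not separable, optimum at w = 0.
xs, ys = [1.0, 1.0], [1, 0]

near = fisher_scoring_1d(xs, ys, w0=1.0)  # starts close to the optimum
far = fisher_scoring_1d(xs, ys, w0=3.0)   # starts further away

print("start at 1.0:", near[:4])
print("start at 3.0:", far)
```

On this toy problem the update simplifies to w - sinh(w), so the step overshoots by a growing amount once sinh(w) exceeds 2w. Starting from 1.0 the iterates shrink toward 0, while starting from 3.0 they grow in magnitude at every step and cross the `blowup` threshold after two updates.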

Arnaud15/diverging_fisher_scoring