rasbt/python-machine-learning-book-2nd-edition

Chapter 3 typo

rickiepark opened this issue · 8 comments

Under the 'Alternative implementations in scikit-learn' section, "The scikit-learn library's Perceptron and LogisticRegression classes ... make use of the LIBLINEAR library,..."
As you may know, the Perceptron class uses BaseSGDClassifier, not LIBLINEAR. :)

As with the typo on p56, 'Chapter 5...' should be 'Chapter 6...' in the last paragraph of p99.

On p86, "If we increase the value for \gamma, we increase the influence or ..." should be "If we increase the value for \gamma, we decrease the influence or ...".

On page 101, the paragraph below the graph reads "25 decision trees via the n_estimators parameter and used the entropy criterion". However, the code uses the gini criterion.

rasbt commented

thanks @rickiepark

Under the 'Alternative implementations in scikit-learn' section, "The scikit-learn library's Perceptron and LogisticRegression classes ... make use of the LIBLINEAR library,..."
As you may know, the Perceptron class uses BaseSGDClassifier, not LIBLINEAR. :)

You are right, Perceptron doesn't use LIBLINEAR. Will add this to the errata.

On p86, "If we increase the value for \gamma, we increase the influence or ..." should be "If we increase the value for \gamma, we decrease the influence or ...".

Hm, I think that increasing gamma will lead to an increase in the influence of training points, so the original should be correct. (C, the inverse regularization parameter, behaves the other way around though, i.e., increasing C will lead to a smoother boundary.)

As with the typo on p56, 'Chapter 5...' should be 'Chapter 6...' in the last paragraph of p99.

Good point. In my version, it appears correct on pg 56, but on pg 99 it should indeed be Chapter 6 instead of Chapter 5! Will add that to the errata.

thanks @sameervk

On page 101, the paragraph below the graph reads "25 decision trees via the n_estimators parameter and used the entropy criterion". However, the code uses the gini criterion.

I think this is already in the errata list at https://github.com/rasbt/python-machine-learning-book-2nd-edition/tree/master/docs/errata. Thanks anyway though :) I think this error came up because I used entropy in the first edition, and when I changed the code I forgot to fix this paragraph as well.

Thanks for the reply, @rasbt.
Please check the gamma parameter again: a low gamma value gives more influence than a high gamma value (https://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html).
This is because \gamma is inversely related to \sigma, i.e., \gamma=\frac{1}{2 \sigma^2}
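
For readers following the gamma discussion, here is a minimal sketch of the relationship quoted above. It assumes the standard RBF kernel form k = exp(-\gamma d^2), where d is the Euclidean distance between two points; the helper names rbf_kernel and sigma_from_gamma are made up for this illustration and are not scikit-learn functions. It only shows how the kernel value at a fixed distance changes with \gamma, not how a fitted SVM's decision boundary behaves.

```python
import math


def rbf_kernel(distance, gamma):
    """RBF kernel value for two points separated by `distance` (assumed form)."""
    return math.exp(-gamma * distance ** 2)


def sigma_from_gamma(gamma):
    """Invert gamma = 1 / (2 * sigma^2) to recover the kernel width sigma."""
    return math.sqrt(1.0 / (2.0 * gamma))


# Compare the kernel value at a fixed distance for a few gamma values.
distance = 1.0
for gamma in (0.1, 1.0, 10.0):
    sigma = sigma_from_gamma(gamma)
    k = rbf_kernel(distance, gamma)
    print(f"gamma={gamma:>5}: sigma={sigma:.3f}, kernel at d=1 -> {k:.5f}")
```

At a fixed distance the kernel value shrinks as \gamma grows, so each point's reach gets narrower, while the similarity it does contribute is concentrated close to that point.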

rasbt commented

Thanks for the feedback. I think the passage is still correct though, because I was not referring to the support vectors but more to the training examples in general. I.e., I see that the page you linked also says:

When gamma is very small, the model is too constrained and cannot capture the complexity or “shape” of the data. The region of influence of any selected support vector would include the whole training set.

Thanks a lot. I understand it. :)

oops! should have checked that, my bad.