justmarkham/DAT4

Question about -- In[21]: lm.predict([100, 25, 25]) -- in 08_linear_regression.ipynb

mlukjanska opened this issue · 3 comments

Hi!

First of all, thanks for the great tutorial! :)

Next, I was following the notes and when trying to run the code snippet

In[21]: lm.predict([100, 25, 25]) 

for my own problem, I got the following warning:

/var/ml/python/local/lib/python2.7/site-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)

Adding an additional pair of brackets solved it:

lm.predict([[100, 25, 25]])  

I assume the implementation has changed for sklearn.linear_model?

Reference: https://github.com/justmarkham/DAT4/blob/master/notebooks/08_linear_regression.ipynb

Thanks for pointing this out!

Yes, it appears that scikit-learn has deprecated passing in a plain one-dimensional array. It wants a two-dimensional array with one row per sample and one column per feature, like a matrix or DataFrame, so a single sample has to be a single row. That's why adding the second pair of brackets solved the problem.
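To make the shape requirement concrete, here is a minimal sketch. The training data and the names X, y and sample are made up for illustration and are not taken from the notebook. It shows a few equivalent ways to turn a single sample into a 2D array of shape (1, 3) before calling predict:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: 5 samples (rows) x 3 features (columns).
X = np.array([
    [230.0, 38.0, 69.0],
    [44.0, 39.0, 45.0],
    [17.0, 46.0, 69.0],
    [151.0, 41.0, 58.0],
    [180.0, 11.0, 58.0],
])
# Hypothetical response, built as an exact linear function of the features.
y = 3.0 + 0.05 * X[:, 0] + 0.2 * X[:, 1] + 0.01 * X[:, 2]

lm = LinearRegression().fit(X, y)

sample = [100, 25, 25]  # one sample with three features (1D)

# Option 1: wrap the sample in a second pair of brackets -> shape (1, 3).
pred_brackets = lm.predict([sample])

# Option 2: reshape to a single row, as the warning message suggests.
pred_reshape = lm.predict(np.array(sample).reshape(1, -1))

# Option 3: let NumPy promote the 1D sample to 2D.
pred_atleast = lm.predict(np.atleast_2d(sample))

print(pred_brackets, pred_reshape, pred_atleast)
```

All three calls pass the same (1, 3) array, so they should return the same one-element prediction. The reshape(1, -1) form is the one to use for a single sample; reshape(-1, 1) is for many samples of a single feature.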

@mlukjanska Here is a longer explanation from my blog, if you're interested, with three options listed for how to resolve the issue: http://www.dataschool.io/linear-regression-in-python/#comment-2521926219

@justmarkham
Thanks! This explains everything :)