Issue with the Cabin feature
Opened this issue · 2 comments
btphan95 commented
Running LabelEncoder on the Cabin feature raises an error:
Pclass
Name
Sex
Age
SibSp
Parch
Ticket
Cabin
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-121-48f3aad5f78e> in <module>()
4 print(col)
5 le.fit(list(train[col]) + list(cv[col]))
----> 6 train[col] = le.transform(train[col])
7 cv[col] = le.transform(cv[col])
/opt/conda/lib/python3.6/site-packages/sklearn/preprocessing/label.py in transform(self, y)
128 y = column_or_1d(y, warn=True)
129
--> 130 classes = np.unique(y)
131 if len(np.intersect1d(classes, self.classes_)) < len(classes):
132 diff = np.setdiff1d(classes, self.classes_)
/opt/conda/lib/python3.6/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
208 ar = np.asanyarray(ar)
209 if axis is None:
--> 210 return _unique1d(ar, return_index, return_inverse, return_counts)
211 if not (-ar.ndim <= axis < ar.ndim):
212 raise ValueError('Invalid axis kwarg specified for unique')
/opt/conda/lib/python3.6/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
275 aux = ar[perm]
276 else:
--> 277 ar.sort()
278 aux = ar
279 flag = np.concatenate(([True], aux[1:] != aux[:-1]))
TypeError: '>' not supported between instances of 'float' and 'str'
It looks like the cause is missing values in the Cabin feature. How did you get around this?
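For reference, here is a small sketch that seems to reproduce the same failure outside the notebook. It assumes the missing cabins are loaded as float NaN (as pandas does by default), so the column mixes float and str and np.unique cannot sort it. The helper name `unique_raises_type_error` and the sample values are made up for illustration.

```python
import numpy as np


def unique_raises_type_error(values):
    """Return True if np.unique fails with a TypeError on these values."""
    # dtype=object keeps the float NaN and the strings side by side,
    # similar to a pandas object column that has missing entries.
    arr = np.array(values, dtype=object)
    try:
        np.unique(arr)  # sorts internally, so it must compare float with str
    except TypeError:
        return True
    return False


mixed = ["C85", np.nan, "E46"]            # hypothetical sample with a missing cabin
strings_only = ["C85", "missing", "E46"]  # same sample with a string placeholder

print(unique_raises_type_error(mixed))
print(unique_raises_type_error(strings_only))
```

If this matches what happens in the notebook, replacing the NaN values with a string before calling LabelEncoder should avoid the comparison.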
btphan95 commented
The same issue comes up for the Embarked feature.
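One possible workaround, sketched under assumptions: fill the missing values with a string placeholder and cast to str before fitting, so LabelEncoder only ever sees strings. The `train`/`cv` frames below are tiny made-up samples, and `encode_with_placeholder` and the `"missing"` label are hypothetical names, not something from the original notebook.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up sample data standing in for the real train / cv splits.
train = pd.DataFrame({"Cabin": ["C85", None, "E46"],
                      "Embarked": ["S", "C", None]})
cv = pd.DataFrame({"Cabin": [None, "C85", "G6"],
                   "Embarked": ["Q", None, "S"]})


def encode_with_placeholder(train, cv, cols, placeholder="missing"):
    """Label-encode cols after replacing missing values with a string."""
    train, cv = train.copy(), cv.copy()
    for col in cols:
        # Replace NaN/None with a string so every value is comparable.
        train[col] = train[col].fillna(placeholder).astype(str)
        cv[col] = cv[col].fillna(placeholder).astype(str)
        le = LabelEncoder()
        # Fit on both splits so transform never meets an unseen label.
        le.fit(list(train[col]) + list(cv[col]))
        train[col] = le.transform(train[col])
        cv[col] = le.transform(cv[col])
    return train, cv


train_enc, cv_enc = encode_with_placeholder(train, cv, ["Cabin", "Embarked"])
print(train_enc)
print(cv_enc)
```

This treats "missing" as its own category, which may or may not be what you want for the model; dropping or imputing those rows would be another option.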
lenguyenthedat commented
I don't think it supports Python 3 yet :)