Machine Learning NYSC example error
abhi-84 opened this issue · 1 comments
Hi
In the nysc_predict.py example, when I remove @njit(parallel=True), it throws an error. Why? How can I use SDC for machine learning examples?
Here is the error it generates:
KeyError Traceback (most recent call last)
in
41
42 #t_start = time.time()
---> 43 data2012, coc12, data2012_low2high, x, y = preprocess_data()
44
45 x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, shuffle=False)
in preprocess_data()
20
21 # Remove unused columns
---> 22 df_prices_intc = df_prices_intc.drop(columns=('symbol', 'volume'))
23
24 # The year of interest is 2012
~/anaconda3/envs/sdc_env/lib/python3.7/site-packages/pandas/core/frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
4115 level=level,
4116 inplace=inplace,
-> 4117 errors=errors,
4118 )
4119
~/anaconda3/envs/sdc_env/lib/python3.7/site-packages/pandas/core/generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
3912 for axis, labels in axes.items():
3913 if labels is not None:
-> 3914 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
3915
3916 if inplace:
~/anaconda3/envs/sdc_env/lib/python3.7/site-packages/pandas/core/generic.py in _drop_axis(self, labels, axis, level, errors)
3944 new_axis = axis.drop(labels, level=level, errors=errors)
3945 else:
-> 3946 new_axis = axis.drop(labels, errors=errors)
3947 result = self.reindex(**{axis_name: new_axis})
3948
~/anaconda3/envs/sdc_env/lib/python3.7/site-packages/pandas/core/indexes/base.py in drop(self, labels, errors)
5338 if mask.any():
5339 if errors != "ignore":
-> 5340 raise KeyError("{} not found in axis".format(labels[mask]))
5341 indexer = indexer[~mask]
5342 return self.delete(indexer)
KeyError: "[('symbol', 'volume')] not found in axis"
Hello @abhi-84 ,
This exception was raised because DataFrame.drop is used slightly differently in Pandas and SDC, due to an SDC limitation on the columns parameter. Pandas treats a tuple passed as columns as a single label (which is why the error reports [('symbol', 'volume')] as not found), so several columns have to be given as a list, whereas SDC requires a tuple (see DataFrame.drop limitations).
So when the njit decorator is removed, replace the tuple of columns
df_prices_intc = df_prices_intc.drop(columns=('symbol', 'volume'))
with a list of columns
df_prices_intc = df_prices_intc.drop(columns=['symbol', 'volume'])
and the issue will disappear.
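For reference, the plain-Pandas behaviour described here can be reproduced without SDC. The snippet below is only a minimal sketch: the small DataFrame and its values are made up for illustration and are not the data from nysc_predict.py. It only shows that, without njit, a tuple is looked up as one label while a list drops both columns.

```python
import pandas as pd

# Hypothetical stand-in for df_prices_intc; the values are illustrative only.
df = pd.DataFrame({
    "symbol": ["INTC", "INTC"],
    "volume": [100, 200],
    "close": [20.5, 21.0],
})

# Plain Pandas treats the tuple as a single label, so this raises KeyError.
try:
    df.drop(columns=("symbol", "volume"))
    tuple_error = None
except KeyError as exc:
    tuple_error = exc

# A list is read as two separate column labels and works as intended.
dropped = df.drop(columns=["symbol", "volume"])
remaining = list(dropped.columns)

print(type(tuple_error).__name__)
print(remaining)
```

If the same function has to run both with and without the decorator, the columns argument is the one line that needs to change between the two variants.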