Machine Learning Helper

Helper functions for machine learning preprocessing.

For dealing with categorical features:

  • replace_string_with_enumerated_cols
  • replace_string_with_hash_cols
  • binarize_string_cols
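
The README does not show how these helpers are implemented, so the following is only a rough sketch of the three encodings their names suggest, written with plain pandas and the standard library. It is not this repository's code: `stable_bucket` and `N_BUCKETS` are hypothetical names, and the md5-based bucketing is an assumption chosen because Python's built-in `hash()` is not stable across runs for strings.

```python
import hashlib

import pandas as pd

pets = pd.Series(['cat', 'dog', 'dog', 'fish', 'cat', 'dog', 'cat'], name='pet')

# Enumeration: map each distinct string to an integer code,
# numbered in order of first appearance.
codes, categories = pd.factorize(pets)
enumerated = pd.Series(codes, name='pet')
print(enumerated.tolist())  # [0, 1, 1, 2, 0, 1, 0]

# Hashing: map each string to a bucket index via a stable hash.
N_BUCKETS = 8  # hypothetical bucket count, chosen only for illustration


def stable_bucket(value: str) -> int:
    """Return a bucket index that is the same on every run (hypothetical helper)."""
    digest = hashlib.md5(value.encode('utf-8')).hexdigest()
    return int(digest, 16) % N_BUCKETS


hashed = pets.map(stable_bucket)
print(hashed.tolist())

# Binarization (one-hot): one 0/1 indicator column per distinct value.
binarized = pd.get_dummies(pets, prefix='pet').astype(int)
print(binarized)
```

Enumeration gives compact codes but implies an ordering that may not exist, hashing needs no stored vocabulary but can collide, and binarization avoids both at the cost of one column per distinct value.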

Examples

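import pandas as pd

# The three helper functions used below are assumed to be already
# imported from this repository.

data = pd.DataFrame({'pet':      ['cat', 'dog', 'dog', 'fish', 'cat', 'dog', 'cat'],
                     'children': [4, 0, 3, 1, 2, 3, 0],
                     'job':      ['engineer', 'developer', 'manager', 'developer', 'manager', 'manager', 'developer'],
                     'salary':   [90, 24, 44, 27, 32, 59, 36]})
print(data)

# Binarize the string columns 'pet' and 'job'
b = binarize_string_cols(data, ['pet', 'job'])
print(b)

# Replace the string columns with hashed values
h = replace_string_with_hash_cols(data, ['pet', 'job'])
print(h)

# Replace the string columns with enumerated values
e = replace_string_with_enumerated_cols(data, ['pet', 'job'])
print(e)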