makcedward/nlpaug

Use NLP with dataframes and labels

haris525 opened this issue · 1 comment

Hello

I have an issue similar to one someone else asked about. I have a dataframe with a text column and a class column, with about 5 classes. Some classes are underrepresented, so I would like to augment the text column per class to balance them a bit more. How would I go about doing this?

I tried following the approach in #209. Here is my code:


aug_data = []
for group, d in mydataframe.groupby(['class']):
  a_data = aug_wordnet.augment(d)
  a_data = pd.DataFrame(aug_data, columns=['text'])
  a_data['class'] = class
  aug_data.append(a_data)

aug_data = pd.concat(aug_data)

But it gives me this error: AttributeError: 'DataFrame' object has no attribute 'strip'. My class column has dtype int64 and my text column has dtype object.

Thanks

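augment() expects a string or a list of strings rather than a DataFrame, so pass the text column as a list. Consider using the following sample code: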

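import pandas as pd

# aug_wordnet is assumed to be an nlpaug augmenter you have already created
aug_data = []
# Group by a single column name (not a list) so each key is a plain label
for group, d in mydataframe.groupby('class'):
  # Pass a list of strings to augment(), not the DataFrame itself
  a_data = aug_wordnet.augment(d["your column"].tolist())
  # Build the frame from this group's augmented texts
  a_data = pd.DataFrame(a_data, columns=['text'])
  # `class` is a reserved word in Python; use the group key instead
  a_data['class'] = group
  aug_data.append(a_data)

aug_data = pd.concat(aug_data, ignore_index=True)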
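
The loop above augments every class by the same factor, so it does not by itself balance the classes. A minimal sketch of one way to generate only as many extra rows as each underrepresented class needs is shown below. The helper name balance_by_augmentation and the stand-in fake_augment are hypothetical and not part of nlpaug. The sketch assumes the augmenter is a callable that takes a list of strings and returns a list of strings.

```python
import random
from collections import Counter, defaultdict


def balance_by_augmentation(texts, labels, augment_fn, seed=0):
    """Return (new_texts, new_labels) with only the augmented rows needed
    to bring every class up to the size of the largest class."""
    rng = random.Random(seed)

    # Collect the original texts of each class.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    if not by_class:
        return [], []

    target = max(len(items) for items in by_class.values())
    new_texts, new_labels = [], []
    for label, items in by_class.items():
        deficit = target - len(items)
        if deficit == 0:
            continue
        # Sample source texts with replacement, then augment them in one call.
        sources = [rng.choice(items) for _ in range(deficit)]
        augmented = augment_fn(sources)
        new_texts.extend(augmented)
        new_labels.extend([label] * len(augmented))
    return new_texts, new_labels


# Hypothetical stand-in augmenter so the sketch runs without nlpaug.
def fake_augment(batch):
    return [t + " (augmented)" for t in batch]


texts = ["good", "great", "fine", "bad", "awful", "meh"]
labels = [0, 0, 0, 1, 1, 2]
extra_texts, extra_labels = balance_by_augmentation(texts, labels, fake_augment)
print(Counter(labels + extra_labels))
```

With a dataframe, the same helper could be called with mydataframe["your column"].tolist(), mydataframe["class"].tolist() and aug_wordnet.augment, and the returned lists turned into a new DataFrame and concatenated with the original. Sampling with replacement means very small classes will reuse the same source sentences several times, so the variety of the result depends on how much the augmenter changes each one.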