Use NLP with dataframes and labels
haris525 opened this issue · 1 comments
haris525 commented
Hello
I have an issue similar to one that someone else asked about. I have a dataframe with a text column and a class column. Some classes are underrepresented, so I would like to augment the text column per class to balance them a bit more. I have about 5 classes. How would I go about doing this?
I tried following the approach here
Here is my code:
    aug_data = []
    for group, d in mydataframe.groupby(['class']):
        a_data = aug_wordnet.augment(d)
        a_data = pd.DataFrame(aug_data, columns=['text'])
        a_data['class'] = class
        aug_data.append(a_data)
    aug_data = pd.concat(aug_data)
It fails with the error message AttributeError: 'DataFrame' object has no attribute 'strip'. My class column has dtype int64 and my text column has dtype object.
Thanks
makcedward commented
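augment() expects a string or a list of strings, so pass the text column as a list instead of the whole group dataframe. Consider using the following sample code: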
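    import pandas as pd

    aug_data = []
    # Group by the label column; `group` is the class value of each chunk
    for group, d in mydataframe.groupby('class'):
        # Pass a list of strings, not a DataFrame
        a_data = aug_wordnet.augment(d["your column"].tolist())
        a_data = pd.DataFrame(a_data, columns=['text'])
        a_data['class'] = group  # `class` is a reserved word, so use the group key
        aug_data.append(a_data)
    aug_data = pd.concat(aug_data, ignore_index=True)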
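Since the goal is to balance the classes rather than to augment every row, here is a minimal sketch of one possible approach: top up each smaller class with augmented copies until it matches the largest class. The helper balance_by_augmentation, the toy_augment stand-in, and the column names text and class are assumptions for illustration and are not part of nlpaug. In real use you would pass a function that wraps your own augmenter and returns a single string. Depending on the nlpaug version, augment() may return a string or a list, so the wrapper may need to unwrap the result.

```python
import pandas as pd


def balance_by_augmentation(df, augment_fn, text_col="text", label_col="class"):
    """Add augmented rows to each smaller class until it matches the largest class."""
    target = df[label_col].value_counts().max()
    parts = [df]  # keep all original rows
    for label, group in df.groupby(label_col):
        deficit = target - len(group)
        if deficit <= 0:
            continue  # already the largest class
        # Sample existing texts with replacement to use as augmentation seeds
        seeds = group[text_col].sample(n=deficit, replace=True, random_state=0).tolist()
        new_rows = pd.DataFrame(
            {text_col: [augment_fn(t) for t in seeds], label_col: label}
        )
        parts.append(new_rows)
    return pd.concat(parts, ignore_index=True)


# Hypothetical stand-in augmenter so the sketch runs without nlpaug installed
def toy_augment(text):
    return text + " (augmented)"


mydataframe = pd.DataFrame(
    {
        "text": ["good movie", "great film", "loved it", "terrible", "so-so"],
        "class": [0, 0, 0, 1, 2],
    }
)
balanced = balance_by_augmentation(mydataframe, toy_augment)
print(balanced["class"].value_counts().sort_index())
```

Sampling with replacement means a very small class will reuse the same seed texts many times, so the augmented rows may end up close to each other. Matching the largest class exactly is only one choice; a lower target count would give a milder rebalancing.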