Add proper parsers to Language objects.
koaning opened this issue · 1 comment
koaning commented
The lang[['word', 'person']] syntax is cute, but for convenience we need more.
I'd like to also have the following:
lang.parse_dataframe(df, text_col="text", properties={"propname": "colname"}, strategy="first")
lang.parse_list(textlist, strategy="first")
lang.parse_dictlist(dictlist, text="text", properties={"propname": "colname"}, strategy="first")
These methods should make it much easier to add properties when reading in a dataset. The strategy param is there to filter out duplicate texts beforehand, so that the resulting embeddingset remains a set.
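To make the intent of the strategy param concrete, here is a minimal sketch in plain Python. It is not the library's implementation: `dedupe_texts` is a hypothetical helper, this `parse_dictlist` is a standalone function rather than a method on a Language object, it returns plain dicts instead of an embeddingset, and the "last" strategy is an assumption added for contrast (the issue only mentions "first").

```python
def dedupe_texts(texts, strategy="first"):
    """Return the indices of the rows to keep so that every text occurs once.

    Hypothetical helper: "first" keeps the first occurrence of a duplicate
    text, "last" (an assumption, not named in the issue) keeps the last one.
    """
    if strategy not in ("first", "last"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    keep = {}
    for i, text in enumerate(texts):
        if strategy == "first":
            keep.setdefault(text, i)  # only record the first index seen
        else:
            keep[text] = i  # overwrite, so the last index wins
    return sorted(keep.values())


def parse_dictlist(dictlist, text="text", properties=None, strategy="first"):
    """Sketch of the proposed signature as a standalone function.

    `properties` maps a property name to the key it is read from. The real
    method would presumably look up embeddings; this one only shows the
    filtering and the property mapping.
    """
    properties = properties or {}
    rows = dedupe_texts([d[text] for d in dictlist], strategy=strategy)
    return [
        {
            "text": dictlist[i][text],
            **{prop: dictlist[i][col] for prop, col in properties.items()},
        }
        for i in rows
    ]


rows = [
    {"text": "cat", "kind": "animal"},
    {"text": "dog", "kind": "animal"},
    {"text": "cat", "kind": "pet"},  # duplicate text, dropped by "first"
]
parsed = parse_dictlist(rows, properties={"category": "kind"})
```

With strategy="first", the second "cat" row is dropped, so every text ends up with exactly one set of properties.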
koaning commented
I'm closing issues because ever since the project moved to my personal account it's been more in maintenance mode than in an "active work" mode.