Addition of useful Verbs
Closed this issue · 1 comments
koaning commented
It might make sense to think about the `EmbeddingSet` as if it is a `DataFrame`. There are some operations missing.
- `left_join`, `inner_join`: take the embeddings of two sets and join them together. If a word is missing from one of the sets, we should fill it in with zeros, I guess, or maybe throw an error?
- `concat`: we currently have a method called "merge" for this, but concatenate feels like a better term.
- `merge`: might be a nice verb for allowing a pandas dataframe to be merged. We could take a column containing names and a list of columns that can be added to the embedding as properties.
- `sort`: would be really nice. You can cluster embeddings and sort them before using `plot_distances`.
- `reverse`: because why not.
- `drop`: allows you to drop some names from the `EmbeddingSet`. We should allow for a `copy` setting to allow the method to be used inplace. My preference is to have `inplace=False` as a default because it is easier to reason about and it allows for chaining.
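To make the join and drop semantics above a bit more concrete, here is a rough sketch. `ToyEmbeddingSet` is a hypothetical stand-in written for this example, not the real `EmbeddingSet` API. It assumes "joining" means concatenating the vectors of two sets per name, and that `inplace=True` returns `None` the way pandas does.

```python
from typing import Dict, Iterable, List, Optional


class ToyEmbeddingSet:
    """Hypothetical stand-in for EmbeddingSet: maps a name to a vector."""

    def __init__(self, vectors: Dict[str, List[float]]):
        self.vectors = dict(vectors)

    def _dim(self) -> int:
        # Assumes every vector in the set has the same length.
        return len(next(iter(self.vectors.values()), []))

    def left_join(self, other: "ToyEmbeddingSet", fill: bool = True) -> "ToyEmbeddingSet":
        """Keep every name in self and append the matching vector from other.

        Names missing from other are zero-filled when fill=True,
        otherwise a KeyError is raised (the two options from the list above).
        """
        zeros = [0.0] * other._dim()
        joined = {}
        for name, vec in self.vectors.items():
            if name not in other.vectors and not fill:
                raise KeyError(name)
            joined[name] = vec + other.vectors.get(name, zeros)
        return ToyEmbeddingSet(joined)

    def inner_join(self, other: "ToyEmbeddingSet") -> "ToyEmbeddingSet":
        """Keep only names present in both sets."""
        return ToyEmbeddingSet(
            {n: v + other.vectors[n] for n, v in self.vectors.items() if n in other.vectors}
        )

    def drop(self, names: Iterable[str], inplace: bool = False) -> Optional["ToyEmbeddingSet"]:
        """Remove names; by default return a new set so calls can be chained."""
        to_drop = set(names)
        kept = {n: v for n, v in self.vectors.items() if n not in to_drop}
        if inplace:
            self.vectors = kept
            return None  # nothing to chain on, mirroring the pandas convention
        return ToyEmbeddingSet(kept)


a = ToyEmbeddingSet({"cat": [1.0, 2.0], "dog": [3.0, 4.0]})
b = ToyEmbeddingSet({"cat": [5.0]})

# With inplace=False as the default, verbs chain and `a` is left untouched.
result = a.left_join(b).drop(["dog"])
```

With `inplace=False` every verb returns a fresh set, which is what makes the chained call on the last line possible.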
koaning commented
I'm closing this issue in favour of making many small, specific issues.