Word Frequencies with TfidfVectorizer code
Opened this issue · 0 comments
imsanjoykb commented
from sklearn.feature_extraction.text import TfidfVectorizer
list of text documents
text = ["The quick brown fox jumped over the lazy dog.",
"The dog.",
"The fox"]
create the transform
vectorizer = TfidfVectorizer()
tokenize and build vocab
vectorizer.fit(text)
summarize
print(vectorizer.vocabulary_)
print(vectorizer.idf_)
encode document
vector = vectorizer.transform([text[0]])
summarize encoded vector
print(vector.shape)
print(vector.toarray())