Reducing Outliers of Loaded Model

Question

Opened this issue 2 months ago · 1 comment

Hi!

A month ago I created a topic model and saved it as follows: `topic_model.save(outpath, serialization="safetensors")`.

I then reduced the outliers in the model with `new_topics = topic_model.reduce_outliers(docs, topics)` and used the result in an empirical analysis, but I did not save the model with the updated topics.

I now want to produce visualizations of the topics used in the analysis, so I have loaded my dataframe (and defined docs again), loaded the model, and tried to reduce the outliers again. However, I get an error and I am not sure how to fix it. The code and error are below:

loaded_model = BERTopic.load("Only-English-BERT-topic-meaning-min-size-50")
topics = loaded_model.topics_
new_topics = loaded_model.reduce_outliers(docs, topics)

sklearn.exceptions.NotFittedError: Vocabulary not fitted or provided

I have also tried using `topics, probs = loaded_model.transform(docs)`, but I got the same error.

Any help with fixing this would be greatly appreciated.

Thanks in advance for your time!

Answer 1 · 2024-04-26T08:37:22.000Z

Which version of BERTopic are you using? Was it the same as when you saved the model?
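If it helps with checking the versions, here is a minimal sketch that prints what is installed in the current environment. The helper name `installed_version` and the list of packages are my own assumptions, not part of BERTopic; adjust the list to whatever is relevant on your side.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Assumed list of packages worth reporting; not an official checklist.
for name in ("bertopic", "scikit-learn", "safetensors"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this in the environment used to save the model and again in the one used to load it would show whether the versions differ.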

Also, could you provide the full error message? It's not clear to me what the error is referencing.
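One way to capture the complete traceback as text, so it can be pasted into the issue, is sketched below. `run_and_capture` and `failing_call` are hypothetical names used only for illustration, and the `RuntimeError` is a stand-in for the real exception; in practice you would pass the call that actually fails, such as the `reduce_outliers` call from your snippet.

```python
import traceback


def run_and_capture(func, *args, **kwargs):
    """Call func; return (result, None) on success or (None, traceback_text) on failure."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() returns the full traceback of the exception being handled.
        return None, traceback.format_exc()


# Hypothetical stand-in for the failing call; the real one would be
# something like: lambda: loaded_model.reduce_outliers(docs, topics)
def failing_call():
    raise RuntimeError("Vocabulary not fitted or provided")


result, tb_text = run_and_capture(failing_call)
if tb_text is not None:
    print(tb_text)
```

The printed text includes every frame between the call and the line that raised, which is the part missing from the one-line error above.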