MaartenGr/BERTopic

BERTopic 0.16.0: fit_transform fails when representation_model is used together with zeroshot_topic_list

Opened this issue · 3 comments

from bertopic import BERTopic

2024-05-02 10:26:56,345 - BERTopic - Zeroshot Step 2 - Completed ✓
2024-05-02 10:26:56,346 - BERTopic - Zeroshot Step 3 - Combining clustered topics with the zeroshot model
KeyError: '-1'
File , line 18
1 from bertopic import BERTopic
3 topic_model = BERTopic(
4
5 # Pipeline models
(...)
15 verbose=True
16 )
---> 18 topics, probs = topic_model.fit_transform(docs, embeddings)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-c320d35e-2ba0-4086-9066-6452698cd8ba/lib/python3.11/site-packages/bertopic/_bertopic.py:3150, in BERTopic.merge_models(cls, models, min_similarity, embedding_model)
3147 merged_topics["topic_labels"][str(new_topic_val)] = selected_topics["topic_labels"][str(new_topic)]
3149 if selected_topics["topic_aspects"]:
-> 3150 merged_topics["topic_aspects"][str(new_topic_val)] = selected_topics["topic_aspects"][str(new_topic)]
3152 # Add new embeddings
3153 new_tensors = tensors[new_topic - selected_topics["_outliers"]]

topic_model = BERTopic(

    # Pipeline models
    embedding_model=embedding_model,
    umap_model=umap_model,
    hdbscan_model=hdbscan_model,
    vectorizer_model=vectorizer_model,
    zeroshot_topic_list=zero_shot_topics_list,
    zeroshot_min_similarity=.8,
    representation_model=representation_model,

    # Hyperparameters
    top_n_words=10,
    verbose=True
)

topics, probs = topic_model.fit_transform(docs, embeddings)
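For readers following the traceback, here is a minimal sketch of the failure pattern using plain dictionaries. It only mirrors the assignment shown at line 3150 of the traceback. The contents of `selected_topics` and `merged_topics`, the helper `copy_aspect`, and the value `5` are hypothetical stand-ins; the real structures inside BERTopic may look different.

```python
def copy_aspect(merged, selected, new_topic, new_topic_val):
    """Mirror the failing assignment from the traceback with plain dicts."""
    merged["topic_aspects"][str(new_topic_val)] = selected["topic_aspects"][str(new_topic)]


# Hypothetical data: assume the aspect mapping has no entry keyed "-1".
selected_topics = {"topic_aspects": {"0": ["placeholder"]}}
merged_topics = {"topic_aspects": {}}

try:
    # -1 is the topic id from the error message; 5 is an arbitrary new id.
    copy_aspect(merged_topics, selected_topics, new_topic=-1, new_topic_val=5)
    error_key = None
except KeyError as exc:
    # The missing key is the string form of the topic id.
    error_key = exc.args[0]

print(repr(error_key))
```

Under that assumption, the lookup on the right-hand side raises before anything is written to `merged_topics`, which is consistent with the `KeyError: '-1'` in the log above.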

This was indeed an issue in 0.16.0. It might be fixed in 0.16.1, but I'm not sure whether it will work. There is currently an open PR for 0.16.1 that fixes another issue.

I'm using 0.16.0 because zero-shot is failing on 0.16.1. I saw that there are already open issues for that.

Have you tried 0.16.1 with the PR I mentioned above? I think that should solve your issue.
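Since the discussion depends on which release is installed, a small check like the one below can confirm the version before retrying. This is only a sketch: `parse_version` and `is_reported_version` are hypothetical helpers written for this example, and the assumption that only 0.16.0 shows this particular error comes from this thread alone.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.16.1' into (0, 16, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_reported_version(text):
    """True for the release this issue was reported against (0.16.0)."""
    return parse_version(text) == (0, 16, 0)


try:
    installed = version("bertopic")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("bertopic is not installed in this environment")
elif is_reported_version(installed):
    print(f"bertopic {installed}: the release this issue was reported against")
else:
    print(f"bertopic {installed}: not the release this issue was reported against")
```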