MaartenGr/BERTopic

Quiet .merge_topics()

Opened this issue · 9 comments

When running the topic_model.merge_topics() function it prints a TQDM progress bar during the merge. I instantiated BERTopic() with verbose=False.

Is there a way to quiet the merge_topics() function?

Could you show which TQDM bar is being shown? Looking at the code of .merge_topics I do not see an additional TQDM bar aside from what might be shown in related functions. Also, which version of BERTopic are you using?

Maybe this comes from re-running the representation model after it merges?

Hmm, not too sure what is happening here. Could you provide the full logging (including the tqdm bar), along with the full code and the BERTopic version?

Function used for merging:

import pandas as pd


def merge_topics(distance_threshold, hierarchy_var, topic_model, data):
    # Collect the topic groups whose merge distance is within the threshold.
    topics_to_merge = []

    for _, row in hierarchy_var.iterrows():
        distance = row.iloc[-1]  # last column of the hierarchy frame
        if distance <= distance_threshold:
            topics_to_merge.append(row.iloc[2])  # third column: topics to merge

    topic_model.merge_topics(data, topics_to_merge)
    # The hierarchy is recomputed on the merged model.
    hierarchy_tree = topic_model.hierarchical_topics(data)

    topic_df = pd.DataFrame(
        {"Document": data, "Topic": topic_model.topics_})

    return [hierarchy_tree,
            topic_model,
            topic_df,
            topic_model.visualize_hierarchy()]

Initializing BERTopic like this:

chain = load_qa_chain(Ollama(model="zephyr"), chain_type="stuff")

representation_model = {
    "LLM Summary": LangChain(chain=chain, diversity=0.7, nr_docs=10)
    }

topic_model = BERTopic(representation_model=representation_model,
                       verbose=False,
                       language="multilingual",
                       nr_topics="auto")

Version:
bertopic 0.16.0 pypi_0 pypi

Hmmm, that might be the LangChain backend that you are using but I'm not sure. Do you also get this progress bar when you run .fit? Also, I'm not seeing you actually fitting the model, is that correct?

I fit the model using .fit_transform(). I just didn't include that in the code. The progress bar seems to be related to the .merge_topics(). I don't notice the same progress bar during .fit() or .fit_transform().

I found it. It's .hierarchical_topics(). Line 1003 in _bertopic.py.
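
Until the bar respects verbose, one possible stopgap is to discard stderr around the call, since tqdm writes to sys.stderr by default. This is only a rough sketch: call_quietly and noisy_double are hypothetical names made up for illustration and are not part of BERTopic or tqdm.

```python
import contextlib
import io
import sys


def call_quietly(func, *args, **kwargs):
    """Run func while discarding anything it writes to sys.stderr.

    Hypothetical helper, not part of BERTopic or tqdm.
    """
    with contextlib.redirect_stderr(io.StringIO()):
        return func(*args, **kwargs)


def noisy_double(x):
    # Stand-in for a function that reports progress on stderr.
    print("progress...", file=sys.stderr)
    return 2 * x


result = call_quietly(noisy_double, 21)
```

In the setup above this would be something like call_quietly(topic_model.hierarchical_topics, data). Note that it also hides warnings and any other stderr output emitted during the call, so it is a blunt tool compared to fixing the verbose handling.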

Ah, I was looking at .merge_topics since you mentioned it gave a TQDM bar there. No wonder I couldn't find it.

Yes, that should be a rather straightforward change to include that one in the verbose functionality. The only thing that needs to be added to that progress bar is disable=not self.verbose, and it should work. If you want, a PR would be appreciated!
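
For reference, here is a minimal sketch of how tqdm's disable argument ties a progress bar to a verbose flag. HierarchySketch, process, and the stream parameter are hypothetical and only illustrate the pattern; this is not the actual code in _bertopic.py.

```python
import io

from tqdm import tqdm


class HierarchySketch:
    """Hypothetical stand-in for a model with a verbose flag (not BERTopic's class)."""

    def __init__(self, verbose=False):
        self.verbose = verbose

    def process(self, items, stream=None):
        results = []
        # With disable=True, tqdm simply iterates and prints nothing.
        # stream=None lets tqdm fall back to its default, sys.stderr.
        for item in tqdm(items, disable=not self.verbose, file=stream):
            results.append(item * 2)
        return results


# Capture the bar output in memory to compare both settings.
quiet_stream = io.StringIO()
quiet_result = HierarchySketch(verbose=False).process(range(5), stream=quiet_stream)

loud_stream = io.StringIO()
loud_result = HierarchySketch(verbose=True).process(range(5), stream=loud_stream)
```

With verbose=False nothing is written to the stream, while verbose=True produces the usual bar; the computed results are the same either way.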