bmabey/pyLDAvis

`tsne` won't work with `sklearn.prepare`

isinyaaa opened this issue · 1 comments

When running the same steps with the same data on Colab, I get pretty good results with `tsne`, but locally (probably because of the Python version) I'm not able to run `pyLDAvis.sklearn.prepare`: it raises `ValueError: perplexity must be less than n_samples`.

I know that Colab is running Python 3.7 and locally I have 3.10. Both use pyLDAvis 3.3.1, so it's probably broken because of a scikit-learn update.

I was able to get it to work by manually setting a perplexity value in the `TSNE` object initialization in `pyLDAvis/_prepare.py`, but that sure isn't optimal.
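For anyone hitting the same error, here is a rough sketch of how a safe perplexity could be picked instead of hard-coding one. It assumes the number of samples t-SNE sees equals the number of topics (the traceback suggests the topic distance matrix is what gets passed to `fit_transform`) and that 30.0, scikit-learn's default perplexity, is the preferred value. `clamp_perplexity` is a hypothetical helper, not part of pyLDAvis or scikit-learn.

```python
def clamp_perplexity(n_samples, preferred=30.0):
    """Return a perplexity strictly below n_samples.

    Hypothetical helper for illustration; not part of pyLDAvis or scikit-learn.
    """
    if n_samples < 2:
        raise ValueError("need at least 2 samples to pick a perplexity")
    # Keep the preferred value when it already fits,
    # otherwise fall back to the largest integer below n_samples.
    return float(min(preferred, n_samples - 1))


# Assumption: n_samples is the number of topics in the LDA model.
for n_topics in (5, 10, 50):
    print(n_topics, clamp_perplexity(n_topics))
```

The result could then be passed as the `perplexity` argument of `sklearn.manifold.TSNE`. With few topics the value ends up very small, so the layout may look different from what a larger perplexity would give.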

Error log
Traceback (most recent call last):
  File "/home/isinyaaa/projects/foss-gpgpu-stack/analyze.py", line 418, in <module>
    main(args)
  File "/home/isinyaaa/projects/foss-gpgpu-stack/analyze.py", line 347, in main
    args.workers).run(vectorizer, processed_data)
  File "/home/isinyaaa/projects/foss-gpgpu-stack/analyze.py", line 128, in run
    self.save_result_as_html(model, data, vectorizer)
  File "/home/isinyaaa/projects/foss-gpgpu-stack/analyze.py", line 148, in save_result_as_html
    super().save_result_as_html(prepare, model, data, vectorizer, mds='tsne')
  File "/home/isinyaaa/projects/foss-gpgpu-stack/analyze.py", line 111, in save_result_as_html
    LDAvis_prepared = prepare(*args, **kwargs)
  File "/home/isinyaaa/.local/lib/python3.10/site-packages/pyLDAvis/sklearn.py", line 95, in prepare
    return pyLDAvis.prepare(**opts)
  File "/home/isinyaaa/.local/lib/python3.10/site-packages/pyLDAvis/_prepare.py", line 443, in prepare
    topic_coordinates = _topic_coordinates(mds, topic_term_dists, topic_proportion, start_index)
  File "/home/isinyaaa/.local/lib/python3.10/site-packages/pyLDAvis/_prepare.py", line 192, in _topic_coordinates
    mds_res = mds(topic_term_dists)
  File "/home/isinyaaa/.local/lib/python3.10/site-packages/pyLDAvis/_prepare.py", line 167, in js_TSNE
    return model.fit_transform(dist_matrix)
  File "/home/isinyaaa/.local/lib/python3.10/site-packages/sklearn/manifold/_t_sne.py", line 1122, in fit_transform
    self._check_params_vs_input(X)
  File "/home/isinyaaa/.local/lib/python3.10/site-packages/sklearn/manifold/_t_sne.py", line 793, in _check_params_vs_input
    raise ValueError("perplexity must be less than n_samples")
ValueError: perplexity must be less than n_samples

see #239
see #235