DmitryUlyanov/Multicore-TSNE

'No, no this should not happen' Happens

madkoppa opened this issue · 15 comments

The title is pretty self-explanatory. I used your implementation successfully a few weeks ago and everything was perfect, but now that I have installed it on another machine, it starts spamming that particular error message after a few iterations. I have tried installing everything from scratch and nothing seems to work. I am using the same data as on the other machines.

Has something changed? It's pretty silly, but this is the only TSNE implementation I can find that won't take me a day per attempt.

I checked out an earlier version, the commit that merges pull request #31, and it works fine. So the error was introduced somewhere between then and now. It usually appears between iterations 200 and 1000, if that helps at all.

EDIT: The number of cores does not affect this bug either; it still happens even with 1.

I will try to handle this next week.

It happens to me sometimes as well, when the perplexity is high enough.

I also see this error message (thousands of times), but the TSNE computation actually seems to finish normally nevertheless.

@asanakoy can you please elaborate? Are you saying that you can successfully avoid the error by setting the perplexity parameter appropriately?

Hi, I fixed a bug here: f5c5be1#diff-f8b3cce0b3183b4e02d913d9eb933c6eR210

Can you please run git pull and check whether it solves the issue?

@DmitryUlyanov, thank you. I will write you back when I try the update.

@DmitryUlyanov I have reinstalled after doing a git pull, but I still get that message. If it helps, my matrix is very sparse (sparsity = 0.012031; 2292 rows, 514 columns) and the perplexity I am using is 5.

@DmitryUlyanov, I have encountered this warning message as well. And as @dietmar said, TSNE finished normally. Could you let me know whether this message actually affects the TSNE results?

Thanks!

Hello @DmitryUlyanov,
I'm trying to use the TSNE and I get the same error!
I have a 136500x1120 dataset (sparsity = 0.001129) and I am trying to run it with these settings: 48 cores, no_dims = 2, perplexity = 30.000000, and theta = 0.500000.

TSNE builds the tree with no problem, and then at a random iteration the 'No, no this should not happen' message appears and never goes away.

I tested with a randomly generated matrix of the same size and it runs to the end without errors, whereas running a subset of my data (1000x1120) causes the same problem.

Are there any updates related to this issue?

So @Fistr @DmitryUlyanov, is this a warning or an error? It manages to run to the end and complete all iterations, but I don't know if I should trust the output.

Thanks!

Adding a very small amount of noise to the data seems to solve the issue.
Maybe it's just related to the amount of zeros in the dataset.

@DmitryUlyanov, f5c5be1#diff-f8b3cce0b3183b4e02d913d9eb933c6eR210 didn't fix it.

I believe the problem is numerical instability when a point is checked against the bounds of the child: splittree.cpp#L149 -> splittree.cpp#L110.
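To illustrate the kind of instability suggested above, here is a minimal 1-D sketch. It assumes a containment test of the form |center - point| <= half_width and children built by halving the parent. These are assumptions for illustration; the names `contains`, `children`, and `accepted_by_some_child` are hypothetical and are not taken from splittree.cpp. The magnitudes are deliberately exaggerated so that the rounding is deterministic, so this is not a reproduction of the bug itself.

```python
def contains(center, half_width, point):
    # Hypothetical bounds check: the point lies within half_width of the center.
    return abs(center - point) <= half_width


def children(center, half_width):
    # Split a 1-D cell into two halves, each with half the extent.
    quarter = 0.5 * half_width
    return [(center - quarter, quarter), (center + quarter, quarter)]


def accepted_by_some_child(center, half_width, point):
    # A point inside the parent should be accepted by at least one child.
    return any(contains(c, w, point) for c, w in children(center, half_width))


# Exaggerated magnitudes: above 2**53 a double cannot represent odd integers,
# so the right child's center (2**53 + 1) rounds back to 2**53.
center, half_width = 2.0 ** 53, 2.0
point = center + 2.0  # lies exactly on the parent's upper boundary

in_parent = contains(center, half_width, point)
in_child = accepted_by_some_child(center, half_width, point)
print(in_parent, in_child)
```

In this contrived case the parent accepts the point but neither child does, which is the sort of situation a "this should not happen" branch would catch. Whether the same mechanism is at work at the coordinate scales TSNE actually produces is not established in this thread.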

Hi @DmitryUlyanov, I just encountered this issue as well. Adding noise as suggested by @AlexandreLaborde, mat_noisy = mat + np.random.normal(loc=0, scale=0.001), didn't help in my case.

Hi @jayant91089, I just want to let you know that I added the noise in exactly the same way and with the same scale.
My dataset has a lot of padding made by adding zeros, do you have a lot of zeros as well?
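One detail about the noise expression quoted above: `np.random.normal` called without a `size` argument returns a single scalar, so that expression shifts every entry of the matrix by the same constant and leaves identical entries identical. Whether this explains the different outcomes reported here is unknown. A minimal sketch of per-element noise follows; the helper name `add_jitter` and its defaults are hypothetical.

```python
import numpy as np


def add_jitter(mat, scale=1e-3, seed=0):
    """Return a copy of mat with independent Gaussian noise on every entry."""
    rng = np.random.default_rng(seed)
    mat = np.asarray(mat, dtype=np.float64)
    # size=mat.shape draws one noise value per entry instead of one scalar.
    return mat + rng.normal(loc=0.0, scale=scale, size=mat.shape)


# Zero padding is the case discussed in this thread: all entries are tied.
padded = np.zeros((4, 3))
scalar_shifted = padded + np.random.normal(loc=0, scale=0.001)  # one draw for all
jittered = add_jitter(padded)  # one draw per entry
```

With the scalar form, the padded zeros remain equal to each other; with `size=mat.shape` each entry gets its own perturbation.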

Can confirm, happens to me too. If this is actually irrelevant, an option to suppress warnings instead of printing them would be nice.
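Until such an option exists, one possible workaround is to silence the output from the Python side. The sketch below assumes the message is written to the process-level standard output (file descriptor 1) by the native code, which is not confirmed in this thread; the name `suppress_c_stdout` is hypothetical. Note that it hides everything printed to stdout inside the block, including progress output and any genuine error messages, and that text still sitting in the C library's buffer may appear after the block exits.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def suppress_c_stdout():
    """Temporarily point file descriptor 1 at os.devnull."""
    sys.stdout.flush()                 # push out pending Python-level output first
    saved_fd = os.dup(1)               # keep a handle on the real stdout
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 1)         # native writes to fd 1 now go to devnull
        yield
    finally:
        os.dup2(saved_fd, 1)           # restore the original stdout
        os.close(devnull_fd)
        os.close(saved_fd)
```

Usage would look like `with suppress_c_stdout(): Xpr = tsne.fit_transform(X)`.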

I get the same warning with a fresh clone of the latest master when using 'MulticoreTSNE' with default settings:

    import multiprocessing
    from MulticoreTSNE import MulticoreTSNE
    tsne = MulticoreTSNE(n_jobs=multiprocessing.cpu_count(), random_state=0)
    Xpr = tsne.fit_transform(X)