CannyLab/tsne-cuda

Aborted (core dumped) without any other error messages

JXuann opened this issue · 1 comment

Hi,

After installing tsnecuda via conda in a fresh conda environment, I could run tsnecuda.test() and call fit_transform() on some randomly generated data points (as another test) without errors.

But when I run it on data points from my own dataset, for example an array of shape (4368, 32768), the program ends with Aborted (core dumped) and no other error messages.

With the debugger, I could see that it stops at self._lib.pymodule_tsne(*tsne_args) at line 219 of TSNE.py, after iterating through a couple of the tsne_args. The console eventually showed: Process finished with exit code 134 (interrupted by signal 6: SIGABRT).

tsne_args at line 217 are:
TsneConfig(result=array([[0., 0.], [0., 0.], [0., 0.], ..., [0., 0.], [0., 0.], [0., 0.]], dtype=float32), points=array([[ 7.0130308e-03, -7.7809277e-03, -7.2167125e-03, ..., 9.1743544e-03, 6.3131857e-03, 6.1482363e-03], [-4.8844217e-05, -1.2296567e-02, -1.2387870e-02, ..., 2.9594781e-02, 2.9431544e-02, 2.8699704e-02], [ 1.3354383e-03, -4.5787361e-03, -4.9447054e-03, ..., -1.1602382e-03, -9.6860871e-04, -7.4672548e-04], ..., [ 2.6103533e-03, -1.7379832e-03, -2.2492162e-03, ..., 1.9836126e-02, 2.0526893e-02, 2.0580655e-02], [ 7.3855170e-03, -2.6477885e-03, -1.6626902e-03, ..., 1.5038115e-02, 8.5423589e-03, 3.8215318e-03], [-2.0392529e-08, -1.7250004e-03, -2.1863466e-03, ..., 1.9540899e-02, 1.8518852e-02, 1.9700618e-02]], dtype=float32), dims=<numpy.core._internal.c_long_Array_2 object at 0x7fc0ab560340>, perplexity=c_float(15.0), learning_rate=c_float(10.0), early_exaggeration=c_float(12.0), magnitude_factor=c_float(5.0), num_neighbors=c_int(32), iterations=c_int(1000), iterations_no_progress=c_int(1000), force_magnify_iters=c_int(250), perplexity_search_epsilon=c_float(0.0010000000474974513), pre_exaggeration_momentum=c_float(0.5), post_exaggeration_momentum=c_float(0.800000011920929), theta=c_float(0.5), epssq=c_float(0.0024999999441206455), min_gradient_norm=c_float(0.0), initialization_type=c_int(1), preinit_data=array([[0.]], dtype=float32), dump_points=c_bool(False), dump_file=array([100, 117, 109, 112, 46, 116, 120, 116, 0], dtype=uint8), dump_interval=c_int(1), use_interactive=c_bool(False), viz_server=array([116, 99, 112, 58, 47, 47, 108, 111, 99, 97, 108, 104, 111, 115, 116, 58, 53, 53, 53, 54, 0], dtype=uint8), viz_timeout=c_int(10000), verbosity=c_int(0), print_interval=c_int(10), gpu_device=c_int(0), return_style=c_int(0), num_snapshots=c_int(5), distance_metric=c_int(1))

The last successfully processed input to self._lib.pymodule_tsne() is viz_server=array([116, 99, 112, 58, 47, 47, 108, 111, 99, 97, 108, 104, 111, 115, 116, 58, 53, 53, 53, 54, 0], dtype=uint8). After this, the program ended.

I have no idea what actually went wrong and could not find any concrete solutions online. Any suggestions would be highly appreciated!

Hmm. This is a bit odd. Does it work with a randomly generated array of the same size? If so, are there any duplicated points in your training data? (I think there was a bug with this a while ago that we fixed.)
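If it helps, here is a rough sketch of how both checks could be done with NumPy alone. The helper names count_duplicate_rows and random_like are made up for this example and are not part of tsnecuda; the sketch only assumes the data is a 2-D float array like the points array in the dump above.

```python
import numpy as np


def count_duplicate_rows(points: np.ndarray) -> int:
    """Return how many rows are exact copies of some other row.

    Hypothetical helper for this thread, not a tsnecuda function.
    """
    unique_rows = np.unique(points, axis=0)
    return points.shape[0] - unique_rows.shape[0]


def random_like(points: np.ndarray, seed: int = 0) -> np.ndarray:
    """Build a random array with the same shape and dtype as `points`.

    Hypothetical helper: it gives a stand-in input of identical size,
    so a crash that depends on size can be told apart from one that
    depends on the data itself.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal(points.shape).astype(points.dtype)


# Tiny demonstration; replace `demo` with the real (4368, 32768) array.
demo = np.array([[1.0, 2.0], [3.0, 4.0], [1.0, 2.0]], dtype=np.float32)
print("duplicate rows:", count_duplicate_rows(demo))
print("random stand-in shape:", random_like(demo).shape)
```

If the random stand-in runs through fit_transform() fine and the real data reports a non-zero duplicate count, dropping the duplicates with np.unique(points, axis=0) before fitting would be a cheap thing to try.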