compute_clustering_pure_python crashed

Question

compute_clustering_pure_python crashed

Closed this issue 2 years ago · 3 comments

Command:
data.compute_clustering_pure_python(Z=1.65)

Output:

Clustering started
Number of clusters before multimodality test= 45
1.97 seconds clustering before multimodality test
0.07 seconds identifying the borders
0.15 seconds with multimodality test

IndexError Traceback (most recent call last)
in
----> 1 data.compute_clustering_pure_python(Z=1.65)

~/Code/DULY/duly/clustering.py in compute_clustering_pure_python(self, Z, halo)

    346                         kk = nnum[k]
    347                         log_den_bord_m[jj][kk] = log_den_bord[j][k]
--> 348                         log_den_bord_err[jj][kk] = log_den_bord_err[j][k]
    349         Last_cls = np.empty(N, dtype=int)
    350         for j in range(N_clusters):

IndexError: index 25 is out of bounds for axis 0 with size 2

Answer 1 · 2021-10-07T12:43:25.000Z

I found the bug. Although I have not pushed it yet, since I still have to pull the latest modifications and DADpy and all that. The bug was at line 341:

should have been
log_den_bord_err_m = np.zeros((N_clusters, N_clusters), dtype=float)

instead of
log_den_bord_err = np.zeros((N_clusters, N_clusters), dtype=float)

NOW compute_clustering_pure_python() gives the same results as Fortran, so for me it is the stable version! While compute_clustering() still crashes on some databases. I suspect there is a +1 error somewhere in the cython code coming from an error in the translation the different convention between Python counting in array indices (from 0) and Fortran (from 1). But I don't know where, since cython debugging takes long

Answer 2 · 2022-05-17T15:49:28.000Z

Issue addressed in last commits by @alexdepremia before release. Consider closing it

Answer 3 · 2022-07-19T08:16:01.000Z

Thanks @imacocco I think this was addressed in the last months, I will close this