Error when working with vame.community()
elenael97 opened this issue · 13 comments
Hello,
I have been trying to work with the vame.community() function. I can produce the hierarchical tree of mouse behaviour, but afterwards I get an error. Could anyone help me solve it?
In [5]: vame.community(config, show_umap=False, cut_tree=None)
C:\Users\User\VAME\vame\analysis\community_analysis.py:57: RuntimeWarning: invalid value encountered in true_divide
transition_matrix = adjacency_matrix/row_sum[:,np.newaxis]
C:\Users\User\VAME\vame\analysis\tree_hierarchy.py:67: RuntimeWarning: divide by zero encountered in double_scalars
cost = (motif_norm[i] + motif_norm[j]) / np.abs(transition_matrix[i,j] + transition_matrix[j,i] )
C:\Users\User\VAME\vame\analysis\tree_hierarchy.py:67: RuntimeWarning: invalid value encountered in double_scalars
cost = (motif_norm[i] + motif_norm[j]) / np.abs(transition_matrix[i,j] + transition_matrix[j,i] )
Where do you want to cut the Tree? 0/1/2/3/...2
[[8, 11, 1, 5, 7, 9, 10, 12, 6, 4, 14, 13], [3, 2]]Are all motifs in the list? (yes/no/restart)yes
IndexError Traceback (most recent call last)
in
----> 1 vame.community(config, show_umap=False, cut_tree=None)
~\VAME\vame\analysis\community_analysis.py in community(config, show_umap, cut_tree)
204 labels = get_labels(cfg, files, model_name, n_cluster)
205 transition_matrices = compute_transition_matrices(files, labels, n_cluster)
--> 206 communities_all, trees = create_community_bag(files, labels, transition_matrices, cut_tree, n_cluster)
207 community_labels_all = get_community_labels(files, labels, communities_all)
208
~\VAME\vame\analysis\community_analysis.py in create_community_bag(files, labels, transition_matrices, cut_tree, n_cluster)
84 for i, file in enumerate(files):
85 _, usage = np.unique(labels[i], return_counts=True)
---> 86 T = graph_to_tree(usage, transition_matrices[i], n_cluster, merge_sel=1)
87 trees.append(T)
88
~\VAME\vame\analysis\tree_hierarchy.py in graph_to_tree(motif_usage, transition_matrix, n_cluster, merge_sel)
103 # max_tr = np.max(trans_mat_temp) #merge function
104 # nodes = np.where(max_tr == trans_mat_temp)
--> 105 nodes = merge_func(trans_mat_temp, n_cluster, motif_norm_temp, merge_sel)
106
107 if np.size(nodes) >= 2:
~\VAME\vame\analysis\tree_hierarchy.py in merge_func(transition_matrix, n_cluster, motif_norm, merge_sel)
65 for j in range(n_cluster):
66 try:
---> 67 cost = (motif_norm[i] + motif_norm[j]) / np.abs(transition_matrix[i,j] + transition_matrix[j,i] )
68 except ZeroDivisionError:
69 print("Error: Transition probabilities between motif "+str(i)+" and motif "+str(j)+ " are zero.")
IndexError: index 14 is out of bounds for axis 0 with size 14
I receive the same error if I choose show_umap=True, which means I cannot view the UMAP.
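The RuntimeWarning lines in the output above most likely come from dividing a row of the adjacency matrix by a row sum of zero, which happens when a motif never occurs in a file. A minimal sketch of a zero-safe row normalization is below. The function name `row_normalize` and the toy count matrix are assumptions made for illustration; this is not VAME's implementation.

```python
import numpy as np

def row_normalize(adjacency_matrix):
    """Row-normalize a count matrix, leaving all-zero rows as zeros (hypothetical helper)."""
    adjacency_matrix = np.asarray(adjacency_matrix, dtype=float)
    row_sum = adjacency_matrix.sum(axis=1, keepdims=True)
    # Divide only where the row sum is non-zero; the remaining entries
    # keep the 0.0 that was pre-filled in `out`.
    return np.divide(
        adjacency_matrix,
        row_sum,
        out=np.zeros_like(adjacency_matrix),
        where=row_sum != 0,
    )

# Toy 3-motif example: motif 2 never occurs, so its row sums to zero.
counts = np.array([[2, 2, 0],
                   [1, 3, 0],
                   [0, 0, 0]])
transition_matrix = row_normalize(counts)
print(transition_matrix)
```

With this approach the empty row stays at zero instead of turning into NaN, so later code does not have to deal with NaN values in the transition matrix.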
We are running into the same error when trying the community approach.
We have 7 videos and the corresponding results (csv files) from DLC as inputs into VAME and we have 15 motifs (motif 0 to motif 14) as a result of training the LSTM and doing the pose segmentation steps.
Motif occurrences in each of the videos are as follows:
video 0: motifs [ 0 1 5 6 8 10 11]
video 1: motifs [ 0 1 6 7 8 10 11 12]
video 2: motifs [ 0 6 7 10 12]
video 3: motifs [ 1 2 3 4 5 8 9 13 14]
video 4: motifs [ 1 2 3 4 5 9 13 14]
video 5: motifs [ 1 2 3 4 5 8 9 13 14]
video 6: motifs [ 1 2 3 4 5 9 13 14]
Since some motifs are missing in each case, we ran the community function with cut_tree=None
However, we get this error:
IndexError Traceback (most recent call last)
<ipython-input-11-9e257901c003> in <module>
----> 1 vame.community(config, show_umap=False, cut_tree=0)
~/anaconda3/envs/VAME/lib/python3.7/site-packages/vame-1.0-py3.7.egg/vame/analysis/community_analysis.py in community(config, show_umap, cut_tree)
204 labels = get_labels(cfg, files, model_name, n_cluster)
205 transition_matrices = compute_transition_matrices(files, labels, n_cluster)
--> 206 communities_all, trees = create_community_bag(files, labels, transition_matrices, cut_tree, n_cluster)
207 community_labels_all = get_community_labels(files, labels, communities_all)
208
~/anaconda3/envs/VAME/lib/python3.7/site-packages/vame-1.0-py3.7.egg/vame/analysis/community_analysis.py in create_community_bag(files, labels, transition_matrices, cut_tree, n_cluster)
84 for i, file in enumerate(files):
85 _, usage = np.unique(labels[i], return_counts=True)
---> 86 T = graph_to_tree(usage, transition_matrices[i], n_cluster, merge_sel=1)
87 trees.append(T)
88
~/anaconda3/envs/VAME/lib/python3.7/site-packages/vame-1.0-py3.7.egg/vame/analysis/tree_hierarchy.py in graph_to_tree(motif_usage, transition_matrix, n_cluster, merge_sel)
103 # max_tr = np.max(trans_mat_temp) #merge function
104 # nodes = np.where(max_tr == trans_mat_temp)
--> 105 nodes = merge_func(trans_mat_temp, n_cluster, motif_norm_temp, merge_sel)
106
107 if np.size(nodes) >= 2:
~/anaconda3/envs/VAME/lib/python3.7/site-packages/vame-1.0-py3.7.egg/vame/analysis/tree_hierarchy.py in merge_func(transition_matrix, n_cluster, motif_norm, merge_sel)
65 for j in range(n_cluster):
66 try:
---> 67 cost = (motif_norm[i] + motif_norm[j]) / np.abs(transition_matrix[i,j] + transition_matrix[j,i] )
68 except ZeroDivisionError:
69 print("Error: Transition probabilities between motif "+str(i)+" and motif "+str(j)+ " are zero.")
IndexError: index 7 is out of bounds for axis 0 with size 7
In the VAME codebase, in vame/analysis/tree_hierarchy.py, the argument motif_norm passed into the function merge_func seems to have length < n_cluster, and this seems to be what throws the error. motif_norm is calculated from a usage variable, which is a count of how many occurrences of each unique motif are found in each video. Whenever certain motifs are missing from a video, the usage array is shorter than n_cluster.
For example, if video number 2 in my dataset has motifs [ 0 6 7 10 12], usage is a numpy array like [1930, 2237, 2152, 1737, 941], where each element is the count for one of the 5 motifs (out of 15) seen in the video.
motif_norm is in essence calculated as follows (I changed some variable names for debugging):
motif_usage_temp_colsum = usage.sum(axis=0)
motif_norm = usage/motif_usage_temp_colsum
motif_norm_temp = motif_norm.copy()
motif_norm
Thus motif_norm will have the same length as usage, i.e. 5 in my example.
Now, inside merge_func, the for loop evaluates motif_norm[i] + motif_norm[j] with i and j ranging from 0 to n_cluster - 1, so a 5-element array is indexed with indices 0 to 14, which seems to be what throws the error.
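The mechanism described above can be reproduced in isolation. The sketch below uses a made-up label sequence (an assumption, not data from the thread) in which only 5 of 15 motifs occur, and runs the same double loop over `range(n_cluster)` that appears in the traceback.

```python
import numpy as np

n_cluster = 15
# Hypothetical label sequence: only motifs 0, 6, 7, 10 and 12 occur.
labels = np.array([0, 6, 6, 7, 10, 12, 12, 12, 0, 7])

# np.unique only reports motifs that actually occur, so `usage` has 5 entries.
_, usage = np.unique(labels, return_counts=True)
motif_norm = usage / usage.sum()

error_message = None
try:
    # Same index ranges as the loop in merge_func in the traceback.
    for i in range(n_cluster):
        for j in range(n_cluster):
            _ = motif_norm[i] + motif_norm[j]
except IndexError as err:
    error_message = str(err)

print(len(usage))
print(error_message)
```

The loop fails as soon as an index reaches the length of `motif_norm`, which matches the pattern of the reported errors ("index 7 ... size 7", "index 14 ... size 14").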
@AthiraDK I am also running into the same error when working with the community() function. Did you find a fix for this? Also, is motif_norm a 1D array with size n_cluster?
@PatrickHonma I have not yet found a fix for this. Do you also have a case where motifs are unevenly distributed over the videos? As in, not all motifs are present in each of your videos?
@AthiraDK I am getting the same indexing error with vame.community(), and can confirm that not all motifs are found in all my videos. Tried cut_tree = None|0|1|3.
Anaconda environment in Windows 10.
@AthiraDK Not all motifs are present in each video, but they are appearing as 0's in the motif_usage.
I did notice that the cost function in line 67 of tree_hierarchy.py is expecting [i,j] dimensions for motif_norm. I added a new dimension to motif_usage_temp by editing:
motif_usage_temp = motif_usage[:,np.newaxis]
Now motif_norm has shape [n_cluster, 1], which I believe fixes the index error.
@AthiraDK The 'index 7 is out of bounds for axis 0 with size 7' error should have been fixed by my pull request #58, which ensures that all clusters are accounted for and that 'empty' clusters get a 0 instead of shortening the axis. That PR was merged a while back, so update your VAME install; @PatrickHonma mentions that there are 0s in the motif_usage array for him. Note that the missing zeros did not only cause this error, they also made the data inaccurate: every cluster above an empty one was shifted down by one because no 0 was inserted into the list for the empty cluster (so motif 8 would be stored as motif 7, 9 as 8, etc.).
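The zero-filling idea can be illustrated with `np.bincount` and its `minlength` argument. This is only a sketch of the general approach on a made-up label sequence; it is not necessarily how the pull request implements it.

```python
import numpy as np

n_cluster = 15
# Hypothetical label sequence: only motifs 0, 6, 7, 10 and 12 occur.
labels = np.array([0, 6, 6, 7, 10, 12, 12, 12, 0, 7])

# Counts of observed motifs only: position k is NOT motif k.
_, usage_unique = np.unique(labels, return_counts=True)

# Counts for every motif id from 0 to n_cluster - 1, with 0 for empty motifs.
usage_full = np.bincount(labels, minlength=n_cluster)

print(usage_unique)   # 5 entries, one per observed motif
print(usage_full)     # 15 entries, index k is the count of motif k
```

In `usage_unique`, the count for motif 6 sits at index 1, which is the shifting problem described above. In `usage_full`, index 6 holds the count for motif 6 and index 1 holds 0.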
I would clone this repository and try importing from there instead of the version installed from pip. Personally I would recommend cloning my fork of this repository but there are some other features that are different that you may or may not like.
Hello,
I recently started using VAME and did a couple of tests using just one video file, and it seemed to be working fine. Now I've followed the exact same steps, but this time using 6 files. Everything was going well until I hit the error "index 29 is out of bounds for axis 0 with size 29" when running vame.community(). I've replaced my files with the more recent ones from https://github.com/LINCellularNeuroscience/VAME/tree/master/vame/analysis but that does not fix the issue.
Here is the code with the error I got:
In [3]: vame.community(config, show_umap=True, cut_tree=None)
C:\Users\oasis\anaconda3\envs\vame\lib\site-packages\vame-1.0-py3.7.egg\vame\analysis\community_analysis.py:57: RuntimeWarning: invalid value encountered in true_divide
C:\Users\oasis\anaconda3\envs\vame\lib\site-packages\vame-1.0-py3.7.egg\vame\analysis\tree_hierarchy.py:67: RuntimeWarning: divide by zero encountered in double_scalars
IndexError Traceback (most recent call last)
in
----> 1 vame.community(config, show_umap=True, cut_tree=None)
~\anaconda3\envs\vame\lib\site-packages\vame-1.0-py3.7.egg\vame\analysis\community_analysis.py in community(config, show_umap, cut_tree)
204 labels = get_labels(cfg, files, model_name, n_cluster)
205 transition_matrices = compute_transition_matrices(files, labels, n_cluster)
--> 206 communities_all, trees = create_community_bag(files, labels, transition_matrices, cut_tree, n_cluster)
207 community_labels_all = get_community_labels(files, labels, communities_all)
208
~\anaconda3\envs\vame\lib\site-packages\vame-1.0-py3.7.egg\vame\analysis\community_analysis.py in create_community_bag(files, labels, transition_matrices, cut_tree, n_cluster)
84 for i, file in enumerate(files):
85 _, usage = np.unique(labels[i], return_counts=True)
---> 86 T = graph_to_tree(usage, transition_matrices[i], n_cluster, merge_sel=1)
87 trees.append(T)
88
~\anaconda3\envs\vame\lib\site-packages\vame-1.0-py3.7.egg\vame\analysis\tree_hierarchy.py in graph_to_tree(motif_usage, transition_matrix, n_cluster, merge_sel)
103 # max_tr = np.max(trans_mat_temp) #merge function
104 # nodes = np.where(max_tr == trans_mat_temp)
--> 105 nodes = merge_func(trans_mat_temp, n_cluster, motif_norm_temp, merge_sel)
106
107 if np.size(nodes) >= 2:
~\anaconda3\envs\vame\lib\site-packages\vame-1.0-py3.7.egg\vame\analysis\tree_hierarchy.py in merge_func(transition_matrix, n_cluster, motif_norm, merge_sel)
65 for j in range(n_cluster):
66 try:
---> 67 cost = (motif_norm[i] + motif_norm[j]) / np.abs(transition_matrix[i,j] + transition_matrix[j,i] )
68 except ZeroDivisionError:
69 print("Error: Transition probabilities between motif "+str(i)+" and motif "+str(j)+ " are zero.")
IndexError: index 29 is out of bounds for axis 0 with size 29
I would really appreciate any help!
Best wishes,
Carlos
I am getting the same error as @cfernandezpa and @AthiraDK, even though I use the latest version of the files. Has anyone found a working solution? Thanks!
Hi everyone, thank you again for bringing this up. We will try to release a newer version of the vame.community() functionality within the next months. I know this is a persistent issue, and I have been working with other groups on some solutions to it. The tree as well as the community functionality could be improved overall, and I am happy to hear any further ideas from the VAME community in this direction. For now, I will close this issue and hope the newer version will solve some of the outlined problems.
Cheers,
Kevin
@AthiraDK I had the same issue, and here is what I did. A naive fix is to add a new parameter k_labels (for example, [ 0 6 7 10 12]) to merge_func. To do this, you need to create k_labels in community in the file community_analysis.py:
k_labels, usage = np.unique(labels[idx], return_counts=True)
T = graph_to_tree(usage, k_labels, transition_matrices[idx], n_cluster, merge_sel=1)
trees.append(T)
Then, in tree_hierarchy.py, I modified graph_to_tree as def graph_to_tree(motif_usage, k_labels, transition_matrix, n_cluster, merge_sel=1) (basically just feeding in k_labels so that I can use it in merge_func).
And then inside graph_to_tree, I fed k_labels to merge_func:
nodes = merge_func(trans_mat_temp, k_labels, n_cluster, motif_norm_temp, merge_sel)
Here is my merge_func
if merge_sel == 1:
    count = 0
    cost_temp = 100
    for i in range(len(k_labels)):
        for j in range(len(k_labels)):
            if np.abs(transition_matrix[i,j] + transition_matrix[j,i] ) == 0:
                cost = 1000
                count += 1
            else:
                cost = (motif_norm[i] + motif_norm[j]) / np.abs(transition_matrix[i, j] + transition_matrix[j, i])
Now transition_matrix[i, j] would be the transition probability between the valid motifs.
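A minimal, self-contained sketch of this idea follows. It assumes that the full n_cluster x n_cluster transition matrix is first restricted to the rows and columns of the observed motifs with `np.ix_`, so that position i in `motif_norm` and row i of the matrix refer to the same motif; that restriction step is an assumption and is not shown in the snippet above. The helper name `pairwise_cost` and the toy numbers are hypothetical, and `np.inf` is used here in place of the fixed cost of 1000.

```python
import numpy as np

def pairwise_cost(transition_matrix, k_labels, motif_norm):
    """Merge cost between every pair of observed motifs (hypothetical helper)."""
    k_labels = np.asarray(k_labels)
    # Assumption: keep only rows/columns of motifs that occur in this file.
    sub = transition_matrix[np.ix_(k_labels, k_labels)]
    # Symmetric transition strength |T[i, j] + T[j, i]|.
    flow = np.abs(sub + sub.T)
    # motif_norm[i] + motif_norm[j] for every pair.
    usage_sum = motif_norm[:, None] + motif_norm[None, :]
    # Pairs with no transitions in either direction keep an infinite cost.
    cost = np.full(flow.shape, np.inf)
    np.divide(usage_sum, flow, out=cost, where=flow != 0)
    return cost

# Toy example: 4 motifs in total, motif 1 never occurs in this file.
transition_matrix = np.array([[0.50, 0.0, 0.50, 0.0],
                              [0.00, 0.0, 0.00, 0.0],
                              [0.25, 0.0, 0.25, 0.5],
                              [0.00, 0.0, 1.00, 0.0]])
k_labels = [0, 2, 3]
motif_norm = np.array([0.5, 0.25, 0.25])  # aligned with k_labels

cost = pairwise_cost(transition_matrix, k_labels, motif_norm)
print(cost)
```

The explicit zero check matters because NumPy scalar division by zero emits a RuntimeWarning and returns inf or nan rather than raising ZeroDivisionError, so the try/except in the original merge_func would not catch it.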
@ZhanqiZhang66 I am trying to recreate your fix, but I cannot follow it completely. Could you upload your community_analysis.py and tree_hierarchy.py files somewhere? Thanks in advance!