morris-lab/CellOracle

IndexError from get_links()

Closed this issue · 2 comments

Thank you for developing such insightful tools for bioinformatics.

However, I got an IndexError while running:

links = oracle.get_links(cluster_name_for_GRN_unit="louvain", alpha=10,
verbose_level=10)

Error message below
###############################################
links = oracle.get_links(cluster_name_for_GRN_unit="louvain", alpha=10,
verbose_level=10)
0%| | 0/25 [00:00<?, ?it/s]
Inferring GRN for 0...
0%| | 0/2394 [00:00<?, ?it/s]
'''Traceback (most recent call last):

File "/tmp/ipykernel_58099/741005556.py", line 1, in <cell line: 1>
links = oracle.get_links(cluster_name_for_GRN_unit="louvain", alpha=10,

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/trajectory/oracle_core.py", line 1467, in get_links
links = get_links(oracle_object=self,

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/network_analysis/network_construction.py", line 74, in get_links
linkLists = _fit_GRN_for_network_analysis(oracle_object, cluster_name_for_GRN_unit=cluster_name_for_GRN_unit,

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/network_analysis/network_construction.py", line 138, in _fit_GRN_for_network_analysis
tn_.fit_All_genes(bagging_number=bagging_number,

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/network/net_core.py", line 312, in fit_All_genes
self.fit_genes(target_genes=self.all_genes,

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/network/net_core.py", line 422, in fit_genes
coefs = _get_bagging_ridge_coefs(target_gene=target_gene,

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/network/regression_models.py", line 130, in _get_bagging_ridge_coefs
coefs = _get_coef_matrix(model, reg_all)

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/network/regression_models.py", line 145, in _get_coef_matrix
[pd.Series(ensemble_model.estimators_[i].coef_,

File "/home/gt/anaconda3/envs/ATAC/lib/python3.8/site-packages/celloracle/network/regression_models.py", line 146, in <listcomp>
index=feature_names[ensemble_model.estimators_features_[i]])
IndexError: index 26 is out of bounds for axis 0 with size 26'''

I'm not sure why this error occurred.

I also asked ChatGPT to break down the structure of the problem.
I have attached its answer below.

Do you have any suggestions for solving this error?
Thanks.

#################################################
Based on the error message and the provided code snippet, here is the call chain and the final location of the error:

The error is raised while calling the get_links() method defined in celloracle/trajectory/oracle_core.py. That method calls the get_links() function in celloracle/network_analysis/network_construction.py, which in turn calls _fit_GRN_for_network_analysis() in the same file.

The _fit_GRN_for_network_analysis() function calls the fit_All_genes() method of the tn_ object; that method is defined in celloracle/network/net_core.py.

The fit_All_genes() method calls the fit_genes() method of the same object.

The fit_genes() method calls the _get_bagging_ridge_coefs() function from the celloracle/network/regression_models.py file.

The _get_bagging_ridge_coefs() function calls the _get_coef_matrix() function from the same file.

The _get_coef_matrix() function builds a list of pandas Series objects using a list comprehension. In this list comprehension, it indexes the feature_names array with the indices stored in ensemble_model.estimators_features_[i].

The error occurs in the list comprehension because the index 26 is out of bounds for the feature_names array, which has a size of 26. This suggests that the ensemble_model.estimators_features_ array contains an index that is larger than or equal to 26, causing the IndexError.

To understand the exact cause of the error, you would need to investigate the values of the ensemble_model.estimators_features_ array and how it is being used in the code. It's possible that there is an issue with how the model is trained or how the indices are generated.
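One way such a mismatch can arise is sketched below. This is an illustration under stated assumptions, not CellOracle's actual code: the table `gem`, the gene names, and the variable names are all hypothetical. If an expression table has a duplicated column name, selecting columns by name returns more columns than the number of names requested, so feature indices derived from the data can exceed the length of the name array.

```python
import numpy as np
import pandas as pd

# Hypothetical expression table with a duplicated gene column ("Gata1").
gem = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=["Gata1", "Spi1", "Gata1", "Cebpa"],
)

# Two regulator names are requested (assumed names, for illustration only).
feature_names = np.array(["Gata1", "Spi1"])

# Selecting by label returns every matching column, so 3 columns come back.
X = gem[list(feature_names)]
n_features_seen_by_model = X.shape[1]

# A model fit on X would refer to feature indices 0..2,
# but only 2 names exist, so indexing the name array fails.
feature_idx = np.arange(n_features_seen_by_model)
try:
    feature_names[feature_idx]
    error_message = None
except IndexError as err:
    error_message = str(err)

print(X.shape, error_message)
```

The point of the sketch is only that the number of columns in the selected data and the number of feature names can silently diverge when names are not unique.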

  • I used the Paul TF as in the tutorial, but used a new GRN as input.

I found that the IndexError was caused by duplicated var_names.
It was my own mistake, made while editing the highly variable genes.
Sorry for the bother. I will close this issue, thanks.
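For anyone hitting the same error, a quick check for duplicated gene names might look like the sketch below. It uses a plain pandas Index as a stand-in for `adata.var_names` (which is a pandas Index in AnnData); the gene names are made up for illustration.

```python
import pandas as pd

# Stand-in for adata.var_names; the names here are hypothetical.
var_names = pd.Index(["Gata1", "Spi1", "Gata1", "Cebpa"])

# True if any gene name appears more than once.
has_duplicates = not var_names.is_unique

# Names that occur more than once, each listed a single time.
duplicated_names = var_names[var_names.duplicated()].unique().tolist()

# Boolean mask that keeps only the first occurrence of each name.
keep_mask = ~var_names.duplicated(keep="first")
deduplicated = var_names[keep_mask]

print(has_duplicates, duplicated_names, list(deduplicated))
```

With a real AnnData object, the same mask could be used to subset the genes (for example `adata[:, keep_mask]`), or `adata.var_names_make_unique()` could be called to rename the duplicates instead of dropping them. Which of the two is appropriate depends on why the duplicates appeared.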