error while calculating latent_time
SinghalDivya opened this issue · 17 comments
Hi,
Thank you for creating an interesting tool. I am running into a following error when I ran latent time. Could you please guide me through this error.
mv.latent_time(adata_result)
computing velocity graph (using 1/12 cores)
100%
4060/4060 [00:07<00:00, 520.87cells/s]
ValueError Traceback (most recent call last)
Cell In[113], line 1
----> 1 mv.latent_time(adata_result)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/multivelo/dynamical_chrom_func.py:3097, in latent_time(adata, vkey, **kwargs)
3095 raise ValueError('Normalized velocity matrix is not found. Please multivelo.run velocity_graph function first.')
3096 if vkey+'_norm_graph' not in adata.uns.keys():
-> 3097 velocity_graph(adata, vkey=vkey, **kwargs)
3098 scv.tl.latent_time(adata, vkey=vkey+'_norm', **kwargs)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/multivelo/dynamical_chrom_func.py:3039, in velocity_graph(adata, vkey, xkey, **kwargs)
3037 if vkey+'_norm_genes' not in adata.var.columns:
3038 adata.var[vkey+'_norm_genes'] = adata.var[vkey+'_genes']
-> 3039 scv.tl.velocity_graph(adata, vkey=vkey+'_norm', xkey=xkey, **kwargs)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/velocity_graph.py:363, in velocity_graph(data, vkey, xkey, tkey, basis, n_neighbors, n_recurse_neighbors, random_neighbors_at_max, sqrt_transform, variance_stabilization, gene_subset, compute_uncertainties, approx, mode_neighbors, copy, n_jobs, backend)
359 n_jobs = get_n_jobs(n_jobs=n_jobs)
360 logg.info(
361 f"computing velocity graph (using {n_jobs}/{os.cpu_count()} cores)", r=True
362 )
--> 363 vgraph.compute_cosines(n_jobs=n_jobs, backend=backend)
365 adata.uns[f"{vkey}_graph"] = vgraph.graph
366 adata.uns[f"{vkey}_graph_neg"] = vgraph.graph_neg
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/velocity_graph.py:175, in VelocityGraph.compute_cosines(self, n_jobs, backend)
172 n_obs = self.X.shape[0]
174 # TODO: Use batches and vectorize calculation of dX in self._calculate_cosines
--> 175 res = parallelize(
176 self._compute_cosines,
177 range(self.X.shape[0]),
178 n_jobs=n_jobs,
179 unit="cells",
180 backend=backend,
181 )()
182 uncertainties, vals, rows, cols = map(_flatten, zip(*res))
184 vals = np.hstack(vals)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/core/_parallelize.py:138, in parallelize..wrapper(*args, **kwargs)
126 pbar, queue, thread = None, None, None
128 res = Parallel(n_jobs=n_jobs, backend=backend)(
129 delayed(callback)(
130 *((i, cs) if use_ixs else (cs,)),
(...)
135 for i, cs in enumerate(collections)
136 )
--> 138 res = np.array(res) if as_array else res
139 if thread is not None:
140 thread.join()
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (1, 4) + inhomogeneous part.
####### my adata_result looks like the following :
AnnData object with n_obs × n_vars = 4060 × 718
obs: 'n_counts', 'module'
var: 'Accession', 'Chromosome', 'End', 'Start', 'Strand', 'fit_alpha_c', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_sw1', 'fit_t_sw2', 'fit_t_sw3', 'fit_scale_cc', 'fit_rescale_c', 'fit_rescale_u', 'fit_alignment_scaling', 'fit_model', 'fit_direction', 'fit_loss', 'fit_likelihood', 'fit_likelihood_c', 'fit_ssd_c', 'fit_var_c', 'fit_c0', 'fit_u0', 'fit_s0', 'fit_anchor_min_idx', 'fit_anchor_max_idx', 'fit_anchor_velo_min_idx', 'fit_anchor_velo_max_idx', 'velo_s_genes', 'velo_u_genes', 'velo_chrom_genes', 'velo_s_norm_genes'
uns: 'module_colors', 'neighbors', 'pca', 'umap', 'velo_chrom_params', 'velo_s_params', 'velo_u_params', 'velo_s_norm_params'
obsm: 'X_pca', 'X_umap'
varm: 'PCs', 'fit_anchor_c', 'fit_anchor_c_sw', 'fit_anchor_c_velo', 'fit_anchor_s', 'fit_anchor_s_sw', 'fit_anchor_s_velo', 'fit_anchor_u', 'fit_anchor_u_sw', 'fit_anchor_u_velo'
layers: 'ATAC', 'Ms', 'Mu', 'ambiguous', 'fit_state', 'fit_t', 'matrix', 'spliced', 'unspliced', 'velo_chrom', 'velo_s', 'velo_u', 'velo_s_norm'
obsp: '_ATAC_conn', '_RNA_conn', 'connectivities', 'distances'
It's not immediately clear to me what the cause of the issue is since it comes from scVelo (a package Multivelo depends on). Can you try scv.tl.velocity_graph(adata_result, vkey='velo_s')
directly?
I don't see the connection between the two yet. What's the output of adata_result.obs['module'].cat.categories
and adata_result.obs['module'].cat.categories.astype(str)
? Is it something reasonable?
Also, which version of pandas did you use?
I am using pandas version:
pd.__version__
'2.0.1'
also the output of both suggested commands are same.
adata_result.obs['module'].cat.categories
Index(['module_1', 'module_2', 'rest'], dtype='object')
adata_result.obs['module'].cat.categories.astype(str)
Index(['module_1', 'module_2', 'rest'], dtype='object')
Can you try installing the last pandas 1.x?
I installed different pandas version, and reran the whole pipeline, still gets the same error.
pd.__version__
'1.5.3'
however, I got a new warning saying:
/work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
@numba.jit()
The warning is probably unrelated.
If you run scvelo's pipeline using the same input RNA data of multivelo
scv.tl.recover_dynamics(adata)
scv.tl.velocity(adata, mode='dynamical')
scv.tl.velocity_graph(adata)
Do you get the same error?
tried
scv.tl.recover_dynamics(adata)
recovering dynamics (using 1/12 cores)
100%
180/180 [02:47<00:00, 1.34gene/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[53], line 1
----> 1 scv.tl.recover_dynamics(adata_result)
3 #scv.tl.velocity(adata, mode='dynamical')
4
5 #scv.tl.velocity_graph(adata)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/dynamical_model.py:588, in recover_dynamics(data, var_names, n_top_genes, max_iter, assignment_mode, t_max, fit_time, fit_scaling, fit_steady_states, fit_connected_states, fit_basal_transcription, use_raw, load_pars, return_model, plot_results, steady_state_prior, add_key, copy, n_jobs, backend, **kwargs)
585 adata.varm["loss"] = loss
587 if t_max is not False:
--> 588 dm = align_dynamics(adata, t_max=t_max, dm=dm, idx=idx)
590 logg.info(" finished", time=True, end=" " if settings.verbosity > 2 else "\n")
591 logg.hint(
592 "added \n"
593 f" '{add_key}_pars', "
594 f"fitted parameters for splicing dynamics (adata.var)"
595 )
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/dynamical_model.py:721, in align_dynamics(data, t_max, dm, idx, mode, remove_outliers, copy)
718 if dm is not None and dm.recoverable:
719 dm.m = m[idx]
720 dm.alpha, dm.beta, dm.gamma, dm.pars[:3] = (
--> 721 np.array([dm.alpha, dm.beta, dm.gamma, dm.pars[:3]]) / dm.m[-1]
722 )
723 dm.t, dm.tau, dm.t_, dm.pars[4] = (
724 np.array([dm.t, dm.tau, dm.t_, dm.pars[4]]) * dm.m[-1]
725 )
727 return adata if copy else dm
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.
scv.tl.velocity(adata_result, mode='dynamical')
computing velocities
finished (0:00:04) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
scv.tl.velocity_graph(adata_result)
computing velocity graph (using 1/12 cores)
100%
4060/4060 [00:06<00:00, 615.03cells/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[55], line 1
----> 1 scv.tl.velocity_graph(adata_result)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/velocity_graph.py:363, in velocity_graph(data, vkey, xkey, tkey, basis, n_neighbors, n_recurse_neighbors, random_neighbors_at_max, sqrt_transform, variance_stabilization, gene_subset, compute_uncertainties, approx, mode_neighbors, copy, n_jobs, backend)
359 n_jobs = get_n_jobs(n_jobs=n_jobs)
360 logg.info(
361 f"computing velocity graph (using {n_jobs}/{os.cpu_count()} cores)", r=True
362 )
--> 363 vgraph.compute_cosines(n_jobs=n_jobs, backend=backend)
365 adata.uns[f"{vkey}_graph"] = vgraph.graph
366 adata.uns[f"{vkey}_graph_neg"] = vgraph.graph_neg
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/velocity_graph.py:175, in VelocityGraph.compute_cosines(self, n_jobs, backend)
172 n_obs = self.X.shape[0]
174 # TODO: Use batches and vectorize calculation of dX in self._calculate_cosines
--> 175 res = parallelize(
176 self._compute_cosines,
177 range(self.X.shape[0]),
178 n_jobs=n_jobs,
179 unit="cells",
180 backend=backend,
181 )()
182 uncertainties, vals, rows, cols = map(_flatten, zip(*res))
184 vals = np.hstack(vals)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/core/_parallelize.py:138, in parallelize.<locals>.wrapper(*args, **kwargs)
126 pbar, queue, thread = None, None, None
128 res = Parallel(n_jobs=n_jobs, backend=backend)(
129 delayed(callback)(
130 *((i, cs) if use_ixs else (cs,)),
(...)
135 for i, cs in enumerate(collections)
136 )
--> 138 res = np.array(res) if as_array else res
139 if thread is not None:
140 thread.join()
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (1, 4) + inhomogeneous part.
Can you run recover_dynamics, velocity, and velocity_graph function all on the original adata object? I noticed that you ran velocity and velocity_graph on multivelo output instead. This is just to check if the error indeed comes from multivelo rather than scvelo. Thanks.
If I understood correctly, I ran the pipeline scvelo using my dataset, and that gave me the output.
scv.tl.recover_dynamics(adata)
recovering dynamics (using 1/12 cores)
finished (0:15:10) --> added
'fit_pars', fitted parameters for splicing dynamics (adata.var)
scv.tl.velocity(adata, mode='dynamical')
scv.tl.velocity_graph(adata)
computing velocities
finished (0:00:32) --> added
'velocity', velocity vectors for each individual cell (adata.layers)
computing velocity graph (using 1/12 cores)
/work/gallo_lab/divya/softwares/miniconda3/envs/scvelo/lib/python3.8/site-packages/scvelo/core/_parallelize.py:138: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
res = np.array(res) if as_array else res
finished (0:00:13) --> added
'velocity_graph', sparse matrix with cosine correlations (adata.uns)
but, when I ran just adata_rna using multivelo
it gave me the same error:
scv.tl.recover_dynamics(adata_rna)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[56], line 1
----> 1 scv.tl.recover_dynamics(adata_rna)
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/dynamical_model.py:588, in recover_dynamics(data, var_names, n_top_genes, max_iter, assignment_mode, t_max, fit_time, fit_scaling, fit_steady_states, fit_connected_states, fit_basal_transcription, use_raw, load_pars, return_model, plot_results, steady_state_prior, add_key, copy, n_jobs, backend, **kwargs)
585 adata.varm["loss"] = loss
587 if t_max is not False:
--> 588 dm = align_dynamics(adata, t_max=t_max, dm=dm, idx=idx)
590 logg.info(" finished", time=True, end=" " if settings.verbosity > 2 else "\n")
591 logg.hint(
592 "added \n"
593 f" '{add_key}_pars', "
594 f"fitted parameters for splicing dynamics (adata.var)"
595 )
File /work/gallo_lab/divya/softwares/miniconda3/envs/multi_velo/lib/python3.8/site-packages/scvelo/tools/dynamical_model.py:721, in align_dynamics(data, t_max, dm, idx, mode, remove_outliers, copy)
718 if dm is not None and dm.recoverable:
719 dm.m = m[idx]
720 dm.alpha, dm.beta, dm.gamma, dm.pars[:3] = (
--> 721 np.array([dm.alpha, dm.beta, dm.gamma, dm.pars[:3]]) / dm.m[-1]
722 )
723 dm.t, dm.tau, dm.t_, dm.pars[4] = (
724 np.array([dm.t, dm.tau, dm.t_, dm.pars[4]]) * dm.m[-1]
725 )
727 return adata if copy else dm
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.
You mean ran just adata_rna using scVelo? scv.tl.recover_dynamics(adata_rna)
is a scVelo function.
What's the difference between adata_rna and adata? That may be what is causing the issue.
This seems to be scVelo issue, see theislab/scvelo#1058. The scVelo team is working on a fix.
I still need to look into it more closely to see if this is a common issue also affecting MultiVelo.
Per scVelo's issue thread, one possible quick fix is to downgrade numpy to 1.23.5. You can try that.
No problem. Issue closed.