Conversion to R fails when adata.var is not empty
climouse opened this issue · 2 comments
Thanks for the great work, it's awesome to be able to go back and forth between scanpy and seurat!
I am having some issues when the AnnData object contains a non-empty .var slot.
For example:
data #my anndata object
#AnnData object with n_obs × n_vars = 3818 × 5127
# obs: 'gene_symbols', 'type', 'chr', 'mybatch', 'celltype', 'n_genes', 'n_counts', 'n_targets'
# var: 'gene_symbols', 'type', 'chr', 'n_cells', 'n_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
# uns: 'log1p'
Converting this AnnData object throws an error:
%%R -i data
print(data)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-101-8763aa13d87e> in <module>()
----> 1 get_ipython().run_cell_magic('R', '-i data', 'print(data)')
/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2101 magic_arg_s = self.var_expand(line, stack_depth)
2102 with self.builtin_trap:
-> 2103 result = fn(magic_arg_s, cell)
2104 return result
2105
<decorator-gen-129> in R(self, line, cell, local_ns)
/share/software/user/open/py-jupyter/1.0.0_py36/lib/python3.6/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/ipython/rmagic.py in R(self, line, cell, local_ns)
721 raise NameError("name '%s' is not defined" % input)
722 with localconverter(converter) as cv:
--> 723 ro.r.assign(input, val)
724
725 tmpd = self.setup_graphics(args)
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/functions.py in __call__(self, *args, **kwargs)
190 kwargs[r_k] = v
191 return (super(SignatureTranslatedFunction, self)
--> 192 .__call__(*args, **kwargs))
193
194
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/functions.py in __call__(self, *args, **kwargs)
111
112 def __call__(self, *args, **kwargs):
--> 113 new_args = [conversion.py2rpy(a) for a in args]
114 new_kwargs = {}
115 for k, v in kwargs.items():
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/functions.py in <listcomp>(.0)
111
112 def __call__(self, *args, **kwargs):
--> 113 new_args = [conversion.py2rpy(a) for a in args]
114 new_kwargs = {}
115 for k, v in kwargs.items():
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/functools.py in wrapper(*args, **kw)
805 '1 positional argument')
806
--> 807 return dispatch(args[0].__class__)(*args, **kw)
808
809 funcname = getattr(func, '__name__', 'singledispatch function')
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/site-packages/anndata2ri/py2r.py in py2rpy_anndata(obj)
57 assays = ListVector({**x, **layers})
58
---> 59 row_args = {k: pandas2ri.py2rpy(v) for k, v in obj.var.items()}
60 if check_no_dupes(obj.var_names, "var_names"):
61 row_args["row.names"] = pandas2ri.py2rpy(obj.var_names)
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/site-packages/anndata2ri/py2r.py in <dictcomp>(.0)
57 assays = ListVector({**x, **layers})
58
---> 59 row_args = {k: pandas2ri.py2rpy(v) for k, v in obj.var.items()}
60 if check_no_dupes(obj.var_names, "var_names"):
61 row_args["row.names"] = pandas2ri.py2rpy(obj.var_names)
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/functools.py in wrapper(*args, **kw)
805 '1 positional argument')
806
--> 807 return dispatch(args[0].__class__)(*args, **kw)
808
809 funcname = getattr(func, '__name__', 'singledispatch function')
/share/PI/astraigh/miniconda3/envs/scanpy/lib/python3.6/site-packages/rpy2/robjects/pandas2ri.py in py2rpy_pandasseries(obj)
124 continue
125 if type(x) is not homogeneous_type:
--> 126 raise ValueError('Series can only be of one type, or None.')
127 # TODO: Could this be merged with obj.type.name == 'O' case above ?
128 res = {
ValueError: Series can only be of one type, or None.
Now, if I remove the .var slot from that same AnnData object, the conversion works:
data_novars=annd.AnnData(X=data.X, obs=data.obs, var=None, uns=data.uns, raw=data.raw)
data_novars
#AnnData object with n_obs × n_vars = 3818 × 5127
# obs: 'gene_symbols', 'type', 'chr', 'mybatch', 'celltype', 'n_genes', 'n_counts', 'n_targets'
# uns: 'log1p'
%%R -i data_novars
print(data_novars)
This returns the following, as expected:
class: SingleCellExperiment
dim: 5127 3818
metadata(1): log1p
assays(1): X
rownames(5127): 0 1 ... 5125 5126
rowData names(0):
colnames(3818): ENSG00000251562 ENSG00000202198 ... ENSG00000240710
ENSG00000230928
colData names(8): gene_symbols type ... n_counts n_targets
reducedDimNames(0):
spikeNames(0):
I am using
scanpy==1.4.5.1 anndata==0.7.1 umap==0.3.10 numpy==1.18.1 scipy==1.4.1 pandas==1.0.1 scikit-learn==0.22.1 statsmodels==0.11.0 python-igraph==0.7.1 louvain==0.6.1
And on the R side:
Seurat = 3.0.2
Hi! Happy you enjoy it!
I assume it’s a categorical column causing the error and you don’t have anndata2ri v1.0.2. If I’m right, this is a duplicate of #39 and you can fix it by updating to 1.0.2.
If not, please continue:
“Non-empty” isn’t the issue: as you see, .obs converts fine, and it uses the exact same conversion function. It’s one or more of the .var columns, and you have to find out which and why.
I use rpy2’s pandas2ri to convert the columns of obs and var. Therefore I assume that, unless it has to do with the rpy2 converters I activate, it’s an rpy2 issue or some column type that’s weird enough to not be supported.
Which .var column(s) cause(s) the error? What dtype does it / do they have? Is it / are they categorical?
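One way to narrow this down is to list, per column, the Python types of the values it holds. The sketch below is only an illustration, not part of anndata2ri or rpy2: the helper name mixed_type_columns and the small DataFrame standing in for data.var are made up. It loosely mirrors the check visible in the traceback (values are compared by type, None is skipped), so any column it reports is a likely candidate.

```python
import pandas as pd


def mixed_type_columns(df):
    """Return {column name: set of type names} for columns holding more than one type.

    None is skipped, as in the rpy2 check shown in the traceback; NaN is not,
    so a string column containing NaN is reported as {'str', 'float'}.
    """
    suspects = {}
    for name, col in df.items():
        kinds = {type(v).__name__ for v in col if v is not None}
        if len(kinds) > 1:
            suspects[name] = kinds
    return suspects


# Hypothetical stand-in for data.var; replace with the real data.var.
var = pd.DataFrame(
    {
        "gene_symbols": ["MALAT1", "RN7SK", "ACTB"],
        "chr": ["1", 2, None],  # str mixed with int: one type too many
        "n_cells": [10, 20, 30],
    }
)

print(var.dtypes)               # dtype of every column
print(mixed_type_columns(var))  # columns whose values are not all one type
```

Running mixed_type_columns(data.var) together with data.var.dtypes should answer all three questions at once.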
So I’m going to assume you had an old version and this is resolved. If not, please comment here and tell me what I asked for in the previous comment.
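To confirm which version is actually installed in the environment the notebook runs in, something like the following can help. It is a minimal sketch: the helper installed_version is hypothetical, and importlib.metadata needs Python 3.8 or newer, whereas the traceback above shows Python 3.6. On older Pythons, running pip show anndata2ri in the same environment gives the same information.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when the package is missing.
print("anndata2ri:", installed_version("anndata2ri"))
```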