Running `celltypist.annotate` with `min_prop` can't create "Heterogeneous" category
DanScarc opened this issue · 2 comments
DanScarc commented
Description
Please find a minimal example reproducing the error below.
Example
# Library imports
import scanpy as sc
import celltypist # v. 1.6.1
from celltypist import models
# Data loading
adata = sc.datasets.pbmc3k()
# Adapt adata for compatibility with celltypist
adata_celltypist = adata.copy()
sc.pp.normalize_per_cell(
adata_celltypist, counts_per_cell_after=10**4
)
sc.pp.log1p(adata_celltypist)
adata_celltypist.X = adata_celltypist.X.toarray()
# Download celltypist models
models.download_models(
force_update=True, model=["Immune_All_Low.pkl"]
)
model_low = models.Model.load(model="Immune_All_Low.pkl")
# Predict cell types
predictions_low = celltypist.annotate(
adata_celltypist, model=model_low, majority_voting=True, mode="best match", min_prop=0.7
)
Returns
File ~/miniforge3/envs/preprocessing/lib/python3.9/site-packages/celltypist/classifier.py:473, in Classifier.majority_vote(predictions, over_clustering, min_prop)
471 majority = votes.idxmax(axis=0)
472 freqs = (votes / votes.sum(axis=0).values).max(axis=0)
--> 473 majority[freqs < min_prop] = 'Heterogeneous'
474 majority = majority[over_clustering].reset_index()
475 majority.index = predictions.predicted_labels.index
.
.
.
TypeError: Cannot setitem on a Categorical with a new category (Heterogeneous), set the categories first
Environment
My current environment is:
name: preprocessing
channels:
- bioconda
- conda-forge
dependencies:
- conda-forge::jupyterlab=3.5.0
- conda-forge::leidenalg=0.9.1
- conda-forge::numba=0.56.4
- conda-forge::joypy
- conda-forge::python=3.9.15
- conda-forge::r-base=4.1.3
- conda-forge::r-soupx=1.6.1
- conda-forge::r-sctransform=0.3.3
- conda-forge::r-glmpca=0.2.0
- conda-forge::rpy2=3.5.11
- conda-forge::scanpy=1.9.3
- conda-forge::session-info=1.0.0
- bioconda::celltypist
- bioconda::anndata2ri=1.1
- bioconda::bioconductor-scdblfinder=1.8.0
- bioconda::bioconductor-scry=1.6.0
- bioconda::bioconductor-scran=1.22.1
- bioconda::bioconductor-glmgampoi=1.6.0
Thank you in advance!
ChuanXu1 commented
@DanScarc, this is most likely caused by a behavior change in recent versions of pandas, where the output of `idxmax` is categorical, so a label that is not already one of its categories (here "Heterogeneous") cannot be assigned to it. You can either downgrade pandas or upgrade to the newest version of celltypist (1.6.2), which should fix this issue.
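For anyone who wants to see the mechanism in isolation, here is a minimal pandas-only sketch. The `majority` and `freqs` values are made-up stand-ins for the variables in the traceback, not real celltypist output, and the two workarounds shown are general pandas patterns, not necessarily what celltypist 1.6.2 does internally.

```python
import pandas as pd

# Hypothetical stand-ins for the variables in Classifier.majority_vote:
# one majority label and one top-label frequency per over-clustering group.
majority = pd.Series(
    pd.Categorical(["B cells", "T cells", "NK cells"]),
    index=["0", "1", "2"],
)
freqs = pd.Series([0.9, 0.5, 0.8], index=["0", "1", "2"])
min_prop = 0.7

# Assigning "Heterogeneous" directly to `majority` is what fails in the
# traceback, because that label is not among the existing categories.

# Option 1: register the new label as a category before assigning it.
fixed_cat = majority.cat.add_categories("Heterogeneous")
fixed_cat[freqs < min_prop] = "Heterogeneous"

# Option 2: drop the categorical dtype so any string can be assigned.
fixed_obj = majority.astype(object)
fixed_obj[freqs < min_prop] = "Heterogeneous"

print(fixed_cat.tolist())
print(fixed_obj.tolist())
```

Both options leave the cluster below `min_prop` labelled "Heterogeneous" and the other labels untouched; option 1 keeps the categorical dtype, option 2 converts to plain strings.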
ChuanXu1 commented
This should have been fixed. Please reopen the issue if you have further questions.