ventolab/CellphoneDB

KeyError: 'VEGFA_FLT1_complex'

PattF opened this issue · 9 comments

PattF commented

I'm fairly new to working with scRNA-seq data. I'm trying to use CellphoneDB on a treated vs. untreated drug dataset that was processed using Parse. I generated my metadata and expression matrix files, but when attempting to run Method 2 I got the error attached below.
I'd appreciate any help figuring out where I went wrong. I can send whichever files are needed to debug the error. Thanks in advance!

Reading user files...
The following user files were loaded successfully:
C:/data/expression_matrix_2.csv
C:/data/metadata_1.csv
[ ][CORE][14/11/23-16:35:09][INFO] [Cluster Statistical Analysis] Threshold:0.1 Iterations:1000 Debug-seed:42 Threads:5 Precision:3
[ ][CORE][14/11/23-16:35:09][WARNING] Debug random seed enabled. Set to 42
[ ][CORE][14/11/23-16:35:10][INFO] Running Real Analysis
[ ][CORE][14/11/23-16:35:10][INFO] Running Statistical Analysis
100%|██████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:35<00:00, 28.22it/s]
[ ][CORE][14/11/23-16:35:46][INFO] Building Pvalues result
[ ][CORE][14/11/23-16:35:46][INFO] Building results

[ ][CORE][14/11/23-16:35:47][INFO] Scoring interactions: Filtering genes per cell type..
100%|█████████████████████████████████████████████████████████████████████████████████| 32/32 [00:00<00:00, 289.31it/s]
[ ][CORE][14/11/23-16:35:47][INFO] Scoring interactions: Calculating mean expression of each gene per group/cell type..

100%|█████████████████████████████████████████████████████████████████████████████████| 32/32 [00:00<00:00, 668.78it/s]
C:\Users\anaconda3\Lib\site-packages\cellphonedb\utils\scoring_utils.py:103: RuntimeWarning:
invalid value encountered in power

[ ][CORE][14/11/23-16:35:47][INFO] Scoring interactions: Calculating scores for all interactions and cell types..
100%|██████████████████████████████████████████████████████████████████████████████| 1024/1024 [00:21<00:00, 48.13it/s]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[38], line 11
      8 from cellphonedb.src.core.methods import cpdb_statistical_analysis_method
     10 # Call the method with adjusted file paths
---> 11 cpdb_results = cpdb_statistical_analysis_method.call(
     12     cpdb_file_path=cpdb_file_path,
     13     meta_file_path=meta_file_path,
     14     counts_file_path=counts_file_path,
     15     counts_data='hgnc_symbol',
     16     # Omitting the optional files active_tfs_file_path and microenvs_file_path
     17     score_interactions=True,
     18     iterations=1000,
     19     threshold=0.1,
     20     threads=5,
     21     debug_seed=42,
     22     result_precision=3,
     23     pvalue=0.05,
     24     subsampling=False,
     25     subsampling_log=False,
     26     subsampling_num_pc=100,
     27     subsampling_num_cells=1000,
     28     separator='|',
     29     debug=False,
     30     output_path=out_path,
     31     output_suffix=None
     32 )

File ~\anaconda3\Lib\site-packages\cellphonedb\src\core\methods\cpdb_statistical_analysis_method.py:157, in call(cpdb_file_path, meta_file_path, counts_file_path, counts_data, output_path, microenvs_file_path, active_tfs_file_path, iterations, threshold, threads, debug_seed, result_precision, pvalue, subsampling, subsampling_log, subsampling_num_pc, subsampling_num_cells, separator, debug, output_suffix, score_interactions)
    154 if score_interactions:
    155     # Make sure all cell types are strings
    156     meta['cell_type'] = meta['cell_type'].apply(str)
--> 157     interaction_scores = scoring_utils.score_interactions_based_on_participant_expressions_product(
    158         cpdb_file_path, counts4scoring, means_result.copy(), separator, meta, threshold, "cell_type", threads)
    159     analysis_result['interaction_scores'] = interaction_scores
    161 file_utils.save_dfs_as_tsv(output_path, output_suffix, "statistical_analysis", analysis_result)

File ~\anaconda3\Lib\site-packages\cellphonedb\utils\scoring_utils.py:344, in score_interactions_based_on_participant_expressions_product(cpdb_file_path, counts, means, separator, metadata, threshold, cell_type_col_name, threads)
    340 cpdb_fms = scale_expression(cpdb_fmsh,
    341                             upper_range=10)
    343 # Step 5: calculate the ligand-receptor score.
--> 344 interaction_scores = score_product(matrix=cpdb_fms,
    345                                    means=means,
    346                                    separator=separator,
    347                                    interactions=interactions,
    348                                    id2name=id2name,
    349                                    threads=threads)
    350 return interaction_scores

File ~\anaconda3\Lib\site-packages\cellphonedb\utils\scoring_utils.py:290, in score_product(matrix, interactions, means, separator, id2name, threads)
    288 for ct_pair, lr_scores_filtered in results:
    289     interacting_pair2score = dict(zip(lr_scores_filtered['interacting_pair'], lr_scores_filtered['score']))
--> 290     interaction_scores[ct_pair] = [interacting_pair2score[id] for id in interaction_scores['interacting_pair']]
    292 return interaction_scores

File ~\anaconda3\Lib\site-packages\cellphonedb\utils\scoring_utils.py:290, in <listcomp>(.0)
    288 for ct_pair, lr_scores_filtered in results:
    289     interacting_pair2score = dict(zip(lr_scores_filtered['interacting_pair'], lr_scores_filtered['score']))
--> 290     interaction_scores[ct_pair] = [interacting_pair2score[id] for id in interaction_scores['interacting_pair']]
    292 return interaction_scores

KeyError: 'VEGFA_FLT1_complex'

Hi PattF,

Thank you for using CellphoneDB and for your inquiry. Would you mind sending a link to the files you used in the analysis to contact@cellphonedb.org? I will then take a closer look and get back to you. Many thanks.

Best,

Robert.

PattF commented

Thanks Robert! Just sent an email with the requested files.
best,
Patrick

Hi Patrick,

Thanks for sharing your input files with us. I noticed negative values in your counts file, from which I infer that you may have scaled the counts before analysing them with CellphoneDB. The negative count values are what is causing the above error. The counts should be normalised but not scaled before being submitted to CellphoneDB. Hope this helps.
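A quick way to confirm this before re-running is to scan the counts file for negative entries. The following is only a minimal sketch using the Python standard library: the helper name `find_negative_entries` is hypothetical, and it assumes a genes-by-cells CSV whose first column holds gene names and whose header row holds cell barcodes. The toy matrix is made up for illustration and is not taken from the files in this issue.

```python
import csv
import io


def find_negative_entries(csv_file, max_hits=5):
    """Return up to max_hits (gene, cell, value) tuples whose value is < 0.

    Assumes a genes-by-cells layout: the first column is the gene name
    and the header row holds the cell barcodes.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    cells = header[1:]
    hits = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, values = row[0], row[1:]
        for cell, raw in zip(cells, values):
            value = float(raw)
            if value < 0:
                hits.append((gene, cell, value))
                if len(hits) >= max_hits:
                    return hits
    return hits


# Toy stand-in for a scaled matrix (illustrative values only).
# For a real file, pass open(path, newline="") instead of the StringIO.
example = io.StringIO(
    "Gene,cell_1,cell_2\n"
    "VEGFA,0.0,1.73\n"
    "FLT1,-0.58,0.0\n"
)
negatives = find_negative_entries(example)
print(negatives)
```

If the returned list is non-empty, the matrix has most likely been scaled or centred and should be regenerated from the normalised (unscaled) data.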

Best,

Robert.

PattF commented

Thanks for checking, Robert! Right, so I've processed my expression matrix input the wrong way. So it can't be scaled, it should be normalized. What if it's been log-transformed as well?
Can the file include any other form of preprocessing/filtering (besides normalization)?

Thanks!
Patrick

Hi Patrick,

The error is thrown by the scoring functionality (cf. score_interactions=True in your cpdb_statistical_analysis_method.call() above). https://cellphonedb.readthedocs.io/en/latest/RESULTS-DOCUMENTATION.html#method-2-statistical-inference-of-interaction-specificity advises the following: 'To score interactions, CellphoneDB requires log-normalized expression data, any normalisation procedure (i.e. z-scaling) that transforms zeros to any other value must be avoided.'

Essentially, your counts data cannot be negative. For example, Seurat's LogNormalize function (see: https://satijalab.org/seurat/reference/normalizedata) outputs non-negative values.
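To illustrate why log-normalised data stays non-negative and keeps zeros as zeros, here is a minimal sketch of the arithmetic in plain Python. It follows the usual recipe of dividing each count by the cell's total, multiplying by a scale factor, and applying log(1 + x); the function name `log_normalize`, the default scale factor of 10,000, and the toy counts are assumptions for illustration, not CellphoneDB or Seurat code. For real data you would use your toolkit's own routine, such as Seurat's NormalizeData or Scanpy's `sc.pp.normalize_total` followed by `sc.pp.log1p`.

```python
import math


def log_normalize(counts_per_cell, scale_factor=10_000):
    """Library-size normalise each cell, then apply log(1 + x).

    counts_per_cell maps a cell ID to a {gene: raw_count} dict.
    Zero counts stay exactly zero and no value can become negative.
    """
    normalized = {}
    for cell, counts in counts_per_cell.items():
        total = sum(counts.values())
        if total == 0:
            # A cell with no counts: keep all zeros rather than divide by zero.
            normalized[cell] = {gene: 0.0 for gene in counts}
            continue
        normalized[cell] = {
            gene: math.log1p(count / total * scale_factor)
            for gene, count in counts.items()
        }
    return normalized


# Toy raw counts (illustrative values only).
raw = {
    "cell_1": {"VEGFA": 0, "FLT1": 3, "KDR": 7},
    "cell_2": {"VEGFA": 5, "FLT1": 0, "KDR": 5},
}
log_norm = log_normalize(raw)
```

By contrast, z-scaling subtracts each gene's mean, which turns zeros into negative numbers; that is the kind of transformation the documentation quoted above says to avoid.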

Best,

Robert.

Hi Robert,
Apologies for the late reply; I thought I had posted a response.
Can I send an email with the output I generated and also ask some questions about how to set up the initial run?
Happy holidays!

best,
Patrick

Hi Patrick,

I'm afraid I may not be able to comment on any steps prior to CellphoneDB analysis but feel free to ask me about any issues that occur during the analysis using the package. Hope that helps.

Best,

Robert

Thanks Robert!
I sent you a quick email about it all with a data link.
best,
Patrick

Help was provided to the user on CellphoneDB use via email.