Exact p-value output file?
ccl6 opened this issue · 1 comments
Hi!
I'm using CellphoneDB v5, Method 2. According to the description of the output files:
"Pvalues fields:
cell_a|cell_b: 1 if interaction is detected as significant, 0 if not."
However, I only got binary 0 or 1 values in the statistical_analysis_pvalues.txt output file, and I couldn't find any file with exact p-values. Where can I find the exact p-values in my output?
Thanks!
I'm using the method2 according to the vignette:
from cellphonedb.src.core.methods import cpdb_statistical_analysis_method
cpdb_results = cpdb_statistical_analysis_method.call(
    cpdb_file_path = cpdb_file_path, # mandatory: CellphoneDB database zip file.
    meta_file_path = meta_file_path, # mandatory: tsv file defining barcodes to cell label.
    counts_file_path = counts_file_path, # mandatory: normalized count matrix.
    counts_data = 'hgnc_symbol', # defines the gene annotation in counts matrix.
    active_tfs_file_path = active_tf_path, # optional: defines cell types and their active TFs.
    microenvs_file_path = microenvs_file_path, # optional (default: None): defines cells per microenvironment.
    score_interactions = True, # optional: whether to score interactions or not.
    iterations = 1000, # denotes the number of shufflings performed in the analysis.
    threshold = 0.1, # defines the min % of cells expressing a gene for this to be employed in the analysis.
    threads = 5, # number of threads to use in the analysis.
    debug_seed = 42, # debug random seed. A value >= 0 sets the seed; use a negative value to disable.
    result_precision = 3, # Sets the rounding for the mean values in significant_means.
    pvalue = 0.05, # P-value threshold to employ for significance.
    subsampling = False, # To enable subsampling the data (geometric sketching).
    subsampling_log = False, # (mandatory) enable subsampling log1p for non log-transformed data inputs.
    subsampling_num_pc = 100, # Number of components to subsample via geometric sketching (default: 100).
    subsampling_num_cells = 1000, # Number of cells to subsample (integer) (default: 1/3 of the dataset).
    separator = '|', # Sets the string to employ to separate cells in the results dataframes "cellA|CellB".
    debug = False, # Saves all intermediate tables employed during the analysis in pkl format.
    output_path = out_path, # Path to save results.
    output_suffix = None # Replaces the timestamp in the output file names with a user-defined string (default: None).
)
Hi ccl6,
The statistical_analysis_pvalues_*.txt file contains the exact p-values. Many thanks for pointing this out: the comment in https://github.com/ventolab/CellphoneDB/blob/master/notebooks/T1_Method2.ipynb was wrong and I have just corrected it.
Good luck with your analysis with CellphoneDB and thank you again for your feedback.
Best,
Robert.
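One way to check whether a p-values table holds exact values rather than 0/1 flags is to load it and inspect the cell-pair columns. The snippet below is only a minimal sketch and is not part of CellphoneDB. The helper name summarize_pvalues is hypothetical, and it assumes the file is tab-separated and that cell-pair columns are the ones whose names contain the separator passed to the analysis ('|' above). The table used here is synthetic, with made-up values, purely to show the check.

```python
import io

import numpy as np
import pandas as pd


def summarize_pvalues(pvalues: pd.DataFrame, separator: str = "|", alpha: float = 0.05) -> dict:
    """Summarize the cell-pair columns of a p-values table (hypothetical helper)."""
    # Assumption: cell-pair columns are those whose name contains the separator.
    pair_cols = [c for c in pvalues.columns if separator in c]
    values = pvalues[pair_cols].apply(pd.to_numeric, errors="coerce")
    flat = values.to_numpy(dtype=float).ravel()
    flat = flat[~np.isnan(flat)]  # ignore missing entries
    return {
        "n_pair_columns": len(pair_cols),
        # True only if every value is exactly 0 or 1.
        "only_binary": bool(np.isin(flat, [0.0, 1.0]).all()),
        # Number of entries below the significance threshold.
        "n_significant": int((flat < alpha).sum()),
    }


# Synthetic stand-in for a p-values file (values are invented for illustration).
example = pd.read_csv(
    io.StringIO(
        "interacting_pair\tT|B\tB|T\n"
        "CXCL12_CXCR4\t0.003\t1.0\n"
        "CD40LG_CD40\t0.0\t0.412\n"
    ),
    sep="\t",
)
summary = summarize_pvalues(example)
print(summary)
```

For a real run, the same helper could be applied to pd.read_csv(path, sep="\t"), where path points at the statistical_analysis_pvalues_*.txt file in the output folder. If only_binary comes back False, the table holds exact p-values.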