AssertionError in conga.plotting.make_graph_vs_graph_logos()
cr2106 opened this issue · 1 comments
cr2106 commented
Hi!
I'm trying to run CoNGA on some B cell data. I get an AssertionError when I run conga.plotting.make_graph_vs_graph_logos:
~/conga/conga/plotting.py in make_logo_plots(adata, nbrs_gex, nbrs_tcr, min_cluster_size, logo_pngfile, logo_genes, gene_logo_width, clusters_gex, clusters_tcr, clusters_gex_names, clusters_tcr_names, ignore_tcr_cluster_colors, show_real_clusters_gex, good_bicluster_tcr_scores, rank_genes_uns_tag, include_alphadist_in_tcr_feature_logos, max_expn_for_gene_logo, show_pmhc_info_in_logos, nocleanup, conga_scores, conga_scores_name, good_score_mask, make_batch_bars, batch_keys, make_cluster_gex_logos, draw_edges_between_conga_hits, add_conga_scores_colorbar, add_gex_logos_colorbar, pretty, gex_header_genes, make_gex_header, make_gex_header_raw, make_gex_header_nbrZ, gex_header_tcr_score_names, include_full_tcr_cluster_names_in_logo_lines, lit_matches)
1336 # old_make_tcr_logo( [ tcrs[x] for x in nodes ], ab, organism, pngfile )
1337 # else: # new way
-> 1338 make_tcr_logo_for_tcrs( [ tcrs[x] for x in nodes ], ab, organism, pngfile,
1339 tcrdist_calculator=tcrdist_calculator )
1340 image = mpimg.imread(pngfile)
~/conga/conga/tcrdist/make_tcr_logo.py in make_tcr_logo_for_tcrs(tcrs, chain, organism, pngfile, tcrdist_calculator)
504 for ivj, vj in enumerate('vj'):
505 gene = tcr[iab][ivj]
--> 506 assert gene in all_genes.all_genes[organism]
507 for tag in 'gene genes rep reps'.split():
508 info[f'{vj}{ab}_{tag}'] = gene
AssertionError:
The logo plot is only produced up to the TCR logo step.
adata.uns['conga_stats']
OrderedDict([('num_cells_w_gex', 191798),
('num_features_start', 21587),
('num_cells_w_tcr', 8849),
('min_genes_per_cell', 200),
('max_genes_per_cell', 2500),
('max_percent_mito', 0.1714),
('num_filt_max_genes_per_cell', 388),
('num_filt_max_percent_mito', 490),
('num_antibody_features', 0),
('num_TR_genes', 85),
('num_TR_genes_in_hvg_set', 33),
('num_highly_variable_genes', 976),
('num_cells_after_filtering', 7968),
('num_clonotypes', 7846),
('max_clonotype_size', 8),
('num_singleton_clonotypes', 7744)])
Thanks a lot for your help and great package!
phbradley commented
Hi there,
Thanks for trying conga! It looks like there's an "unrecognized" TCR gene name. That could mean that there's a mismatch between the "TCR" (or BCR) type and the organism. Can you double-check that the organism being passed into the plotting function is "human_ig"? Are you running this from the notebook or the command line?
You could investigate with a snippet of code like this:
import conga

organism = 'human_ig'
tcrs = conga.preprocess.retrieve_tcrs_from_adata(adata)
for tcr in tcrs:
    print(tcr)
    # the first two fields of each chain are the V and J gene names
    va, ja = tcr[0][:2]
    vb, jb = tcr[1][:2]
    # the first assert that fails points at the unrecognized gene name
    assert va in conga.tcrdist.all_genes.all_genes[organism]
    assert ja in conga.tcrdist.all_genes.all_genes[organism]
    assert vb in conga.tcrdist.all_genes.all_genes[organism]
    assert jb in conga.tcrdist.all_genes.all_genes[organism]
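The asserts above stop at the first unrecognized gene. If it would help to see every offending name at once, a variant along these lines might work. This is only a sketch: `find_unrecognized_genes` is a hypothetical helper, not part of conga, and it assumes the same tuple layout as the snippet above (two chains per entry, with the V and J gene names as the first two fields of each chain). The gene names in the toy data are made-up placeholders.

```python
from collections import Counter


def find_unrecognized_genes(tcrs, known_genes):
    """Count V/J gene names in `tcrs` that are not in `known_genes`.

    Hypothetical helper (not a conga function). Assumes each entry looks
    like ((va, ja, ...), (vb, jb, ...)), as in the snippet above.
    """
    missing = Counter()
    for tcr in tcrs:
        for chain in tcr[:2]:        # the two chains of the receptor
            for gene in chain[:2]:   # V gene, then J gene
                if gene not in known_genes:
                    missing[gene] += 1
    return missing


# Toy data with placeholder names, just to show the shape of the output.
toy_known = {"V_ok_a", "J_ok_a", "V_ok_b", "J_ok_b"}
toy_tcrs = [
    (("V_ok_a", "J_ok_a", "cdr3a"), ("V_ok_b", "J_ok_b", "cdr3b")),
    (("V_ok_a", "J_ok_a", "cdr3a"), ("V_bad", "J_ok_b", "cdr3b")),
]
print(find_unrecognized_genes(toy_tcrs, toy_known))

# With real data, the inputs would come from the calls used above:
#   tcrs = conga.preprocess.retrieve_tcrs_from_adata(adata)
#   known = conga.tcrdist.all_genes.all_genes['human_ig']
#   print(find_unrecognized_genes(tcrs, known))
```

If most of the names come back as unrecognized, that would point to an organism mismatch; if it is only one or two, it is more likely an individual gene name missing from the database.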