huangyh09/brie

Brie-count error with paired end smart-seq2 data

Stephen1202-Wang opened this issue · 1 comments

Hi Yuanhua,
Thanks for developing this amazing tool! I'm interested in detecting AS events in my paired-end smart-seq2 dataset. I ran the brie-count function as follows:

brie-count -S cell_table_smartseq.tsv -a mouse_SE.lenient_50events.gff3 -o outs_smartseq -p 10 #--verbose

An error occurred:
[BRIE2] loading gene annotations ... Done.
[BRIE2] counting reads for 50 genes in 1 sam files with 10 cores...
[BRIE2] [====================] 100.0% cells done in 0.1 sec.
[BRIE2] 50 genes have been processed.
[BRIE2] saving matrix into h5ad ... Traceback (most recent call last):
  File "/home/guanao/.local/bin/brie-count", line 33, in <module>
    sys.exit(load_entry_point('brie==2.2.2', 'console_scripts', 'brie-count')())
  File "/home/guanao/.local/lib/python3.6/site-packages/brie/bin/count.py", line 304, in main
    options.nproc, options.event_type, options.verbose)
  File "/home/guanao/.local/lib/python3.6/site-packages/brie/bin/count.py", line 123, in smartseq_count
    gene_note=np.array(gene_table, dtype='str'))
  File "/home/guanao/.local/lib/python3.6/site-packages/brie/utils/io_utils.py", line 23, in convert_to_annData
    _shape = Rmat[_input_keys[0]].shape
IndexError: list index out of range

I noticed that the example dataset (msEAE) uses single-end smart-seq data. When I preprocessed my data in the same way but converted the paired-end smart-seq2 data into single-end data, brie-count ran successfully.

It seems that brie-count works well for single-end smart-seq2 data but not paired-end smart-seq2 data. Any suggestions would be appreciated. Thank you so much!

Thanks for raising the issue. The error seems to occur when converting the matrix into AnnData, and the likely reason is that the output is empty. You can check the .mtx file to see whether it is indeed empty. If it is, the bigger question is why it is empty, which is likely not due to the paired-end reads.
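
One quick way to run that check is to read the size line of the .mtx file. This is only a minimal sketch: it assumes the file is in Matrix Market coordinate (sparse) format, where the first non-comment line holds "rows columns nonzeros", and the path is whatever .mtx file you find in your output folder (passed on the command line here, not a name taken from BRIE). The helper names mtx_summary and is_empty are made up for illustration.

```python
import sys


def mtx_summary(path):
    """Return (n_rows, n_cols, n_nonzero) from a Matrix Market coordinate file.

    Returns None if the file has no size line at all (e.g. a 0-byte file).
    Assumption: coordinate (sparse) format, whose size line has three integers.
    """
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines, the '%%MatrixMarket' banner and '%' comments.
            if not line or line.startswith("%"):
                continue
            # The first remaining line is the size line.
            n_rows, n_cols, n_nonzero = (int(x) for x in line.split()[:3])
            return n_rows, n_cols, n_nonzero
    return None


def is_empty(path):
    """True if the matrix has no rows, no columns, or no stored entries."""
    summary = mtx_summary(path)
    if summary is None:
        return True
    n_rows, n_cols, n_nonzero = summary
    return n_rows == 0 or n_cols == 0 or n_nonzero == 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_mtx.py <path to the .mtx file in your output folder>
    print(sys.argv[1], mtx_summary(sys.argv[1]), "empty:", is_empty(sys.argv[1]))
```

If this reports zero rows, columns, or nonzeros, no reads were counted for any event, and the next step would be to look at the BAM file and the annotation rather than at the h5ad conversion.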

Another comment is that you can put all your cells into the cell_table_smartseq.tsv (one cell per line), so there is a better chance of avoiding empty outputs.
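
If the cells are stored as one BAM file each, a small script can build that table. This is a rough sketch under stated assumptions: the two-column, tab-separated layout without a header (BAM path, then cell ID) and the use of the file name as the cell ID are assumptions made here, so please check them against the brie-count documentation. The function name write_cell_table and the directory name in the usage comment are hypothetical.

```python
import csv
from pathlib import Path


def write_cell_table(bam_dir, out_tsv, pattern="*.bam"):
    """Write one line per BAM file found in bam_dir; return the number of cells.

    Assumed layout (not confirmed by the thread): <BAM path><TAB><cell ID>,
    no header line.
    """
    bam_paths = sorted(Path(bam_dir).glob(pattern))
    with open(out_tsv, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for bam in bam_paths:
            # Assumption: the cell ID is the file name without the .bam suffix.
            writer.writerow([str(bam), bam.stem])
    return len(bam_paths)


# Example usage (hypothetical directory name):
# n_cells = write_cell_table("bam_files", "cell_table_smartseq.tsv")
# print(n_cells, "cells written")
```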

Yuanhua