HCGB-IGTP/XICRA

miRNA module not finding joined reads

Hello, I am trying to use the miRNA module after running prep, trim, and join (starting with SE reads), as follows:

XICRA miRNA \
    --input Prep/ \
    --output_folder Prep/ \
    --species eca \
    --database miRBase \
    --software optimir \
    --single_end

Everything up to this point worked as expected (note: fastq-join had to be installed separately because the conda-based installation did not include it). Upon executing the above command, I get the following error for each of the two included test samples:

** ERROR: Only 1 fastq file is allowed please joined reads before...

When I run with --debug, I can see that the table being built is indeed pulling both R1 and R2 instead of the trimmed and joined FASTQs, which exist as expected at Prep/data/SAMPLE/join/SAMPLE_trim_joined.fastq.
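As a quick sanity check before running the miRNA module, the sketch below counts the joined FASTQ files found for each sample. It assumes the layout described above (PROJECT/data/SAMPLE/join/SAMPLE_trim_joined.fastq). The function name count_joined is hypothetical and not part of XICRA.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of XICRA): report how many joined FASTQ
# files each sample has under <project>/data/<SAMPLE>/join/.
count_joined() {
    local project="$1"
    local sample_dir sample n
    for sample_dir in "$project"/data/*/; do
        # Skip the literal pattern if the glob matched nothing.
        [ -d "$sample_dir" ] || continue
        sample=$(basename "$sample_dir")
        # A missing join/ folder simply yields a count of 0.
        n=$(find "${sample_dir}join" -maxdepth 1 -name '*_trim_joined.fastq' 2>/dev/null | wc -l | tr -d ' ')
        echo "$sample: $n"
    done
}
```

Calling count_joined Prep/ should print one line per sample. A count of 1 means a single joined file is in place for that sample.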

I'm sure I'm just missing something but please let me know if there are any other logs you would like to see.

Hi Jonah,
Thanks for pointing out that fastq-join is missing from the conda environment. I also found that featureCounts was missing in a completely fresh installation, so be aware of that too. I have recently updated both the conda environment file and the Python code, so you might find it useful to re-install (at least the Python code, by typing pip install XICRA; make sure the version is newer than v1.4.5).
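A minimal sketch for checking the installed version against v1.4.5 is shown below. It assumes the package is installed with pip under the name XICRA and that sort supports -V (GNU coreutils). The helper version_gt is hypothetical and not part of XICRA.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 is strictly newer than $2.
# Relies on version-aware sorting (sort -V), which is an assumption here.
version_gt() {
    [ "$1" != "$2" ] && \
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Read the installed version from pip metadata (empty if not installed).
installed=$(pip show XICRA 2>/dev/null | awk '/^Version:/ {print $2}' || true)

if [ -z "$installed" ]; then
    echo "XICRA does not appear to be installed via pip"
elif version_gt "$installed" "1.4.5"; then
    echo "XICRA $installed is newer than 1.4.5"
else
    echo "XICRA $installed is not newer than 1.4.5; consider re-installing"
fi
```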

Regarding the error you encountered with your data, I suggest removing two options from your command: --output_folder and --single_end.

I guess you have already generated a project folder containing the reads, QC results, and trimmed reads, right? If so, you only need the --input flag, because the output results are written to that same folder. As for the --single_end flag, once the reads have been joined you should no longer specify it.

Your command would be

XICRA miRNA \
    --input Prep/ \
    --species eca \
    --database miRBase \
    --software optimir

I have to say I haven't tested it with any species other than human, but it should work fine. Please let me know if it does not work properly.

Also, I prefer miraligner over optimir, although both are implemented. We found that miraligner performed better on simulated reads in the XICRA paper (https://doi.org/10.1186/s12859-021-04128-1).

I have recently updated the code and added an example. Please run XICRA test to get a set of reads for testing the pipeline. Read the commands in the generated test_subset.sh and find the appropriate steps to reproduce the results.

Please do not hesitate to contact me for further details.

Best regards,
Jose

Removing --single_end worked; I must have misunderstood the help prompt. Thank you again!