Which file to use for deseq to analyse tRF differential expression?

Question

Which file to use for deseq to analyse tRF differential expression?

Closed this issue 3 years ago · 5 comments

Hello, sorry for another question, but I was wondering which file I should use to analyze tRF expression.
I have the tRF.Counts file, but it looks like this:

Entry name N1
Asn_Comb_9 1
Glu_Comb_27 1
Glu_Comb_35 1

And the rest are all 0. The tRF.RP100K file looks the same as above, but instead of 1 it shows 66.36, and the rest are all 0 as well.
Also, in the tRF.Counts.csv file for my other sample, the tRF counts are all 0. So I was wondering which file to use to analyse differential expression.

There is also another folder called tRFs.samples.tmp, and in that folder the file N1_aligned_tRFs.summary shows the following:

amino acid      Counts  RP100K     Unique reads
pre:Tyr tRF-1   1175    77969.476  10
pre:Met tRF-1   53      3516.921   7
pre:Ser tRF-1   10      663.570    6
pre:Cys tRF-1   23      1526.211   7
Glu 5'          0       0.000      0
Glu 3'          1       66.357     1
Glu other       3       199.071    3
Asn 5'          1       66.357     1
Asn 3'          0       0.000      0
Asn other       4       265.428    4
pre:Lys tRF-1   3       199.071    3
pre:His tRF-1   0       0.000      0
Ala 5'          0       0.000      0
Ala 3'          0       0.000      0
Ala other       3       199.071    2
pre:Val tRF-1   5       331.785    4
pre:Thr tRF-1   2       132.714    2
pre:Pro tRF-1   5       331.785    4
pre:Phe tRF-1   1       66.357     1
pre:Leu tRF-1   3       199.071    3
pre:Ile tRF-1   5       331.785    5
pre:Gly tRF-1   5       331.785    5
pre:Arg tRF-1   3       199.071    3
pre:Glu tRF-1   4       265.428    2
pre:iMet tRF-1  4       265.428    3
pre:Gln tRF-1   3       199.071    3
pre:Asp tRF-1   1       66.357     1
pre:Asn tRF-1   1       66.357     1
pre:Ala tRF-1   4       265.428    4
Tyr 5'          1       66.357     1
Tyr 3'          0       0.000      0
Tyr other       0       0.000      0
pre:SeC tRF-1   0       0.000      0

Thank you again for the help

Appreciate it

Answer 1 · 2021-12-12T16:22:38.000Z

If you are going to use it for DESeq analysis, you'd want to use the raw count file (the tRF.Counts file) and not the RP100K file, which is normalized to total tRFs. That said, you seem to have very few tRF reads (1,507 by my count, from what you report), so I would be very cautious interpreting this data.
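
To make the difference between the two files concrete: the RP100K values quoted in the question are consistent with "reads per 100,000 total tRF reads", using the total of 1,507 mentioned above. The sketch below is only an illustration of that relationship. The function name `rp100k` is hypothetical and not part of miRge3.0, and the formula is inferred from the numbers in this thread rather than taken from the miRge3.0 source.

```python
def rp100k(count: int, total_trf_reads: int) -> float:
    """Reads per 100,000 total tRF reads (formula inferred from the thread's numbers)."""
    if total_trf_reads <= 0:
        raise ValueError("total_trf_reads must be positive")
    return count / total_trf_reads * 100_000


# Total tRF reads as quoted in the answer above (assumed to be the normalizer).
TOTAL_TRF_READS = 1507

# Raw counts copied from the N1_aligned_tRFs.summary rows quoted in the question.
rows = [("pre:Tyr tRF-1", 1175), ("pre:Met tRF-1", 53), ("Glu 3'", 1)]

for label, count in rows:
    # Print label, raw count, and the recomputed normalized value.
    print(f"{label}\t{count}\t{rp100k(count, TOTAL_TRF_READS):.3f}")
```

Because the normalized values depend on each sample's own total, they are not interchangeable with the raw integer counts that DESeq-style tools expect as input.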

Answer 2 · 2021-12-13T00:45:16.000Z

Hello, thank you so much for the reply, really appreciate it. Just a bit confused regarding the files.

The tRF counts file shows only 3 counts for tRFs, so I was a bit confused about why there were 1507 counts when I looked at tRF.aligned.report.tsv.
Also, in Glu_Comb, does "Comb" mean combined?

Thank you so much for the help

Answer 3 · 2021-12-13T01:35:48.000Z

@seulalee1008,

It looks like there is probably an issue with the counts here. Can you share the files so I can have a quick look? A sample of the FASTQ file, if possible, and the adapter sequence used would help. (If you would prefer to send them to my email address, that is fine as well: arun26feb at gmail dot com)

Thank you,
Arun

Answer 4 · 2021-12-13T04:24:41.000Z

@seulalee1008,

Hi, I checked the files you shared; here are my observations:

1. They are paired-end reads, which miRge3.0's small RNA sequencing analysis is not designed for.
2. In rare cases I have come across paired-end data being used to analyze small RNA; however, this data contains a lot of poly-A, and the adapter sequence you shared looks like a 3' adapter. I also checked the counts of the adapter sequence and of the reads in each file:

$ zgrep -c "CTGTCTCTTATACACATCT" N1_R1.fastq.gz
72981
$ zgrep -c "CTGTCTCTTATACACATCT" N1_R2.fastq.gz
61661
$ zgrep -c "@NS500799" N1_R1.fastq.gz
71280346
$ zgrep -c "@NS500799" N1_R2.fastq.gz
71280346

From the above we can see that the vast majority of the reads don't contain the adapter sequence (it appears in only about 0.1% of the reads in each file).
I will get back if I find more details on this data.

Thank you,
Arun.
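
The adapter check above can be turned into a percentage with a few lines of Python. This is a minimal sketch using only the four numbers reported by the `zgrep -c` commands. It assumes each matching line corresponds to one read (one sequence line or one header line per read), and the helper name `adapter_fraction` is hypothetical.

```python
def adapter_fraction(adapter_hits: int, total_reads: int) -> float:
    """Fraction of reads whose sequence line contains the adapter."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return adapter_hits / total_reads


# (adapter-containing lines, total reads) as reported by zgrep -c in this thread.
counts = {
    "N1_R1.fastq.gz": (72_981, 71_280_346),
    "N1_R2.fastq.gz": (61_661, 71_280_346),
}

for name, (hits, total) in counts.items():
    # Format the fraction as a percentage with four decimal places.
    print(f"{name}: {adapter_fraction(hits, total):.4%} of reads contain the adapter")
```

Both fractions come out at roughly a tenth of a percent, which fits the later confirmation in this thread that the files are paired-end total RNA-seq rather than a small RNA library.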

Answer 5 · 2021-12-13T04:47:48.000Z

Dear Arun,

Thank you so much for the help. Yes, you are right, these are paired-end total RNA-seq files. We had some downregulation of tRNA, so we wanted to see whether there are tRNA fragments. I tried using different software such as tRNAscan, but that didn't work out so well.

Thank you again for the help, really appreciate it.

Kind regards,
Christine (Seul A) Lee, PhD candidate (UNSW)
School of Biotechnology and Biomolecular Sciences
University of New South Wales, Sydney, NSW 2052, Australia
