mirna_mite: A Jupyter Notebook repository from juancresc

This repo is divided into steps:

1_data_ss
- 1_mrcv_ss.sh Will run shortstack in "Mal de Rio Cuarto" smallRNA data https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5423983/
- 2_sun_get_data.sh Will download smallRNA data from Sun et al. https://bmcplantbiol.biomedcentral.com/articles/10.1186/1471-2229-14-142
- 3_sun_filter.sh Will filter sun adapters
- 4_sun_ss.sh and 5_sun_ss_separated.sh will run shortstack with Sun et al. data. Only results from 4 will be used in further analysis.
- 6_transcript_name_fixer.py will remove descriptions from wheat cDNA because of a Cleaveland requirement
- 7_sun_deg_filter.sh Will filter Sun et al. degradome data
- 8_sun_cleave.sh Run cleaveland with Sun et al. data
1_mites
- create_unique_consensous.ipynb Takes a fasta file with MITEs sequences and creates one consensous file using vsearch
2_analysis
- 1_analysis.ipynb Will do the main analysis and counting in miRNA production sites using Shortstack outputs, MITEs blast and genomic annotation
- 2_cleavage.ipynb Uses Cleaveland and psRNATargetFinder outputs to analyse cleavage of targets data
Other scripts used in this pipeline:

BLAST MITEs against wheat genome

blastn -task blastn -query <mites_files.fasta> \
-subject Triticum_aestivum.IWGSC.dna.toplevel.fa \
-outfmt  "6 qseqid sseqid qstart qend \ 
sstart send mismatch gaps pident evalue length \ 
qlen slen qcovs" -evalue 1e-10 > <mites_results.csv>

Trim smallRNA libraries (sun et al.)

trim_galore <lib.fastq.gz> -o <lib.trimmed.fastq.gz> \
--adapter TCGTATGCCGTCTTCTGCTTG --max_length 30 --length 18

Shortstack smallRNA library / wheat genome

ShortStack --readfile <lib.trimmed.fastq.gz> ... \
--genomefile Triticum_aestivum.IWGSC.dna.toplevel.fa \
--sort_mem 9G --foldsize 1000 --sort_mem 9G

DEGs

Get DEGs C vs T Get miRNA DAT C vs T

juancresc/mirna_mite

DEGs