This repo holds the materials for the Russell Lab Bioinfo Basics Activity. The purpose of this activity is to provide an introduction to genomics on a computing cluster, along with related skills and concepts.
We have sequenced some Drosophila simulans flies that we believe are infected with the wRi strain of Wolbachia. The goal of the activity is to determine how many reads, if any, in each sample map to the Drosophila simulans reference genome, and how many map to the wRi genome.
Reference genomes are available from NCBI RefSeq.
Drosophila simulans: GCF_016746395.2
wRi: GCF_000022285.1
Sequencing reads can be found on the Hummingbird and Phoenix clusters. These files should be readable by everyone, so there is no need to copy them to your working directory.
There are 9 samples:
- Dsimulans_wRi-Riv84
- WT-DsimwRi-line5A
- WT-DsimwRi-line5B
- WT-DsimwRi-line6A
- WT-DsimwRi-line6B
- WT-DsimwRi-line7A
- WT-DsimwRi-line7B
- WT-DsimwRi-line8A
- WT-DsimwRi-line8B
Hummingbird path: /hb/home/cmirchan/fly-data/
Phoenix path: /private/groups/russelllab/cade/bioinfo-basics/fly-data
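A quick way to confirm you can see the data is to loop over the sample names and look for matching files. The sketch below assumes the read files start with the sample name, which this README does not state, so check with `ls` first. `DATA_DIR` is a hypothetical variable that defaults to the Phoenix path; set it to the Hummingbird path when working there.

```shell
#!/bin/sh
# Default to the Phoenix path from this README; override DATA_DIR on Hummingbird.
DATA_DIR="${DATA_DIR:-/private/groups/russelllab/cade/bioinfo-basics/fly-data}"

# The nine sample names listed above, one per line.
SAMPLES="Dsimulans_wRi-Riv84
WT-DsimwRi-line5A
WT-DsimwRi-line5B
WT-DsimwRi-line6A
WT-DsimwRi-line6B
WT-DsimwRi-line7A
WT-DsimwRi-line7B
WT-DsimwRi-line8A
WT-DsimwRi-line8B"

for sample in $SAMPLES; do
    # Assumption: file names begin with the sample name.
    if ls "$DATA_DIR"/"$sample"* >/dev/null 2>&1; then
        echo "found:   $sample"
    else
        echo "missing: $sample"
    fi
done
```

If every sample is reported as missing, the most likely causes are a wrong `DATA_DIR` for the cluster you are on or a different file naming scheme than assumed here.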
What SLURM partition should I use?
Hummingbird: 128x24
Phoenix: short or medium
Add this to your SLURM script so the job has access to your conda environment:
source ~/.bashrc
conda activate <your conda env name>
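To show where those two lines sit in a complete job script, here is a minimal sketch that writes one to a file and checks its syntax. The file name `count_reads.sbatch`, the job name, the conda environment name `bioinfo-basics`, and all resource values are placeholders chosen for illustration, not settings prescribed by this activity. Adjust them for your cluster and job.

```shell
# Write a minimal job script; the file name is a hypothetical example.
cat > count_reads.sbatch <<'EOF'
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=count-reads
# Partition for Phoenix; use 128x24 on Hummingbird.
#SBATCH --partition=short
# Placeholder resource requests and time limit.
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00
# Log file named after the job name (%x) and job ID (%j).
#SBATCH --output=%x-%j.log

# Make conda available inside the job, then activate your environment.
# "bioinfo-basics" is a placeholder environment name.
source ~/.bashrc
conda activate bioinfo-basics

# Your mapping and read-counting commands go here.
EOF

# Check the script for shell syntax errors without running it.
bash -n count_reads.sbatch

# Submit from a login node (commented out here because it needs SLURM):
# sbatch count_reads.sbatch
```

The `#SBATCH` lines must come before the first command in the script, which is why the conda lines follow them.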
What if I'm stuck or not sure where to start?
- Narrow down what you are stuck on or don't know
- Try googling or asking ChatGPT
- Ask someone in person and describe what you have tried
- Look at the solutions