All slides presented are in week_1_slides.pdf.
Neural Network Implementation
- Raw DNA to tensors: what are our options?
- One-hot encoding (2D)
- What layers make sense to include?
- Incorporate more data (mapped TFBS, DNase I, species relatedness, etc.)
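As a starting point for the one-hot option above, here is a minimal sketch in plain Python. The helper name one_hot_encode, the A/C/G/T column order, and the choice to encode N or any other symbol as an all-zero row are assumptions made for illustration, not decisions the project has made.

```python
# Hypothetical helper: one-hot encode a DNA sequence as a 2D (length x 4) matrix.
BASES = "ACGT"  # assumed column order: A, C, G, T


def one_hot_encode(sequence):
    """Return one row per base, each row holding four 0/1 entries."""
    index = {base: i for i, base in enumerate(BASES)}
    matrix = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        # Assumption: N, gaps, and other symbols stay as an all-zero row.
        matrix.append(row)
    return matrix


encoded = one_hot_encode("ACGTN")
for row in encoded:
    print(row)
```

The nested list can be handed to a tensor library later; keeping the encoder free of dependencies makes it easy to test on its own.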
Quality Control Pipeline
- Quantify and analyze the input data
- Are these sequences valid?
- Which sequences do we remove?
- Visualize
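One possible way to approach the validity and removal questions above is a small per-sequence report. This is only a sketch: the function name qc_sequence, the allowed alphabet (A, C, G, T, N), and the 10% cutoff for N are all hypothetical and would need to be agreed on by the team.

```python
# Hypothetical QC check for a single sequence.
VALID_BASES = set("ACGTN")  # assumption: N is tolerated, anything else is flagged


def qc_sequence(seq, max_n_fraction=0.1):
    """Summarize a sequence and suggest whether to keep it.

    max_n_fraction is an illustrative threshold, not a project decision.
    """
    seq = seq.upper()
    invalid = sorted(set(seq) - VALID_BASES)
    n_fraction = seq.count("N") / len(seq) if seq else 0.0
    keep = bool(seq) and not invalid and n_fraction <= max_n_fraction
    return {
        "length": len(seq),
        "invalid_chars": invalid,
        "n_fraction": n_fraction,
        "keep": keep,
    }


for example in ["ACGTACGTAC", "ACGTXACGT", "NNNNACGT"]:
    print(example, qc_sequence(example))
```

The per-sequence dictionaries can be collected into a table for the quantify and visualize steps.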
TFBS mapping
- Map all TFBS onto sequence
- Visualize TFBS and conservation
- Make tools for community
- Email your team preference
- Give me your github username
- In the folder fasta_exercise, make a program that reads in pdm2_neurogenic.fa and performs one or all of the tasks listed below. You can use R or Python. When you have a working script, upload it to the fasta_exercise directory: fork the repo and make a pull request to add it.
Program Tasks:
- Make a sequence alignment
- Turn sequences into a basic Python / R data structure
- Measure GC content per sequence
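To make the second and third program tasks concrete, here is a minimal Python sketch, not a reference solution. The function names parse_fasta and gc_content are hypothetical, and computing GC content over A/C/G/T only (ignoring N and gap characters) is an assumption you may want to change. The example runs on a few in-memory lines so it works without the data file.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []  # a repeated header overwrites the earlier record
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def gc_content(seq):
    """Fraction of G and C among the A/C/G/T bases (assumption: N and gaps ignored)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


# Toy input standing in for the real file.
example_lines = [">seq1", "ACGT", "GGCC", ">seq2", "AATT"]
records = parse_fasta(example_lines)
for name, seq in records.items():
    print(name, len(seq), gc_content(seq))
```

To run it on the exercise file, pass an open file handle instead of the list, for example with open("pdm2_neurogenic.fa") as handle: records = parse_fasta(handle).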