This repo is for a capstone project on the genetic basis of exploratory behaviors in mice.
- Explore genetic basis of exploratory behaviors in two sister species of mice.
- Raw data: videos of mice exploring an elevated plus maze (EPM)
- Feature Engineering: create new features to split up trials and isolate behaviors of mice
- Process video data and extract results for desired features to create CSV of results for all trials
- Visualize differences in features between pure species with violin plots
- Perform significance testing to extract significant features across different dimensions
- Run quantitative trait loci (QTL) analyses for genetic hybrid mice (F2 crosses) to explore genetic basis of behaviors
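
As an illustration of the significance-testing step above, here is a minimal sketch of a two-sided permutation test comparing one feature between the two pure species. This is only one possible approach and is not necessarily the test used in this repo's notebook. The function name `permutation_p_value`, the example feature, and the numbers are all made up for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Hypothetical helper for illustration; not taken from this repo.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign species labels and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up values for one hypothetical feature
# (for example, fraction of trial time spent in the open arms).
species_a = [0.12, 0.15, 0.10, 0.18, 0.14, 0.11]
species_b = [0.31, 0.28, 0.35, 0.30, 0.27, 0.33]
p = permutation_p_value(species_a, species_b)
print(f"permutation p-value: {p:.4f}")
```

When many features are tested this way, a multiple-comparison correction would normally be applied before calling any feature significant.
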
Below is an explanation of the directories:
- all_submissions: deliverables for the course including written reports, the poster, and the ethics audit. The project proposal is also provided.
- eda: Scripts to run exploratory data analysis (violin plots to see distributions of features). Resulting plots are provided (`PosterPlots` has final plots used for the poster, etc. as individuals and grids).
- features: Scripts to read in the raw data and create CSV of all mice with feature results for each mouse. `process_mice.py` is executed and outputs the CSV to the `results/` folder as `all_the_data.csv`. `feature_engineering.py` does all of the calculations/work and is imported by the `process_mice.py` script. Other package dependencies are in the `resources/` folder. Additionally, `Stat_significance.ipynb` is provided, which used the finished CSV to output the significant features. Notes: the processing script expects the data in a folder called `EPM_data` (not included) in the same working directory. Directories used from the dataset are provided in `results_files_used_for_PO_BW_BWPOF1_BWPOF2_analyses_20181129.txt` (located in `resources/`). To only run select inner mouse directories, uncomment lines 39-51 of `process_mice.py`.
- genetics: Scripts to create quantitative trait loci (QTL) plots. Vanilla scripts (QTL.ipynb and copies; done to reduce hassle with refreshing cache of RData) and cross-feature scripts (`QTL -- Species*Sex.ipynb`) are provided, as well as starter code (`EPM qtl analysis new chromosome names 2017.ipynb`). Outputs in `results/` folder.
- meeting-summaries: Markdown notes of selected biweekly meetings.
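
To make the flow described under "features" more concrete, here is a rough sketch of how a processing script could walk a data folder and write one CSV row per mouse directory. It is not the actual contents of `process_mice.py`. The function `build_results_csv`, the `compute_features` callable, the `mouse` column name, and the `only` argument are hypothetical names chosen for this example.

```python
import csv
from pathlib import Path


def build_results_csv(data_root, compute_features, out_csv, only=None):
    """Write one CSV row per trial directory found under data_root.

    compute_features is a hypothetical callable (trial_dir -> dict of features).
    only is an optional set of directory names to restrict the run to.
    Returns the number of rows written.
    """
    rows = []
    for trial_dir in sorted(Path(data_root).iterdir()):
        if not trial_dir.is_dir():
            continue  # skip stray files in the data folder
        if only is not None and trial_dir.name not in only:
            continue
        row = {"mouse": trial_dir.name}
        row.update(compute_features(trial_dir))
        rows.append(row)

    # Union of all column names, in first-seen order, so no feature is dropped.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    out_csv = Path(out_csv)
    out_csv.parent.mkdir(parents=True, exist_ok=True)
    with out_csv.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Under these assumptions, a call such as `build_results_csv("EPM_data", my_feature_fn, "results/all_the_data.csv")` would produce the combined CSV, where `my_feature_fn` stands in for whatever the feature-engineering module computes. Passing `only={...}` is one way to process a subset of directories without editing the script.
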