bioinformatics_and_data_science_part_II

Bioinformatics and Data Science Part II Spring 2021
BIOL 792-1036
Prof: Julie Allen; SFB 206; julieallen34@gmail.com
Class: Tuesdays and Thursdays 3:00 - 4:15, Zoom link
Office Hours: By appointment

Course Description

Online data repositories and individual data sets are growing at unprecedented rates, and the need for bioinformatics and data science skills is growing just as quickly. The main goal of the second part of this two-part series is to build on the Linux and Python skills students learned in the first semester and to add an understanding of data science and of tools for managing large datasets. The course will focus on Python programming and working in the shell, along with an introduction to data standards and version control, tools for cleaning dirty data, data visualization, and working with computing clusters.

With an understanding of how to integrate different data sources, we will increase not only the creativity of our science but also our ability to do broad-scale research. The prerequisites for this course are enrollment as an M.S. or PhD student and completion of Data_Science_For_Biology_I. If you have not taken that course, email me to determine eligibility. The course will be capped at 15 students.

Student Learning Outcomes

The goal of the course is to learn many data science tools, tricks, and hacks from a bioinformatics angle. By the end of the course you should feel comfortable with the tools data scientists use in biology and be able to solve and/or troubleshoot both small and large-scale data challenges in biology.

Material Distribution

All readings, lab instructions, datasets, etc. will be available here.

Attendance and Participation

Because this is a graduate class, I expect full attendance and participation, including all in-class exercises, homework, and projects.

Grade

Homework assignments (40%) Assignments will involve working in Unix, writing simple Python scripts, and completing other small assignments given during each module. These will use data sets provided over the course of the semester. Assignments will be evaluated based on completion. You can work in teams of 2 or 3, but each of you will turn in your own notes and scripts for each assignment. More guidelines on these files and on each specific assignment will be available on GitHub.

Participation (20%) Participation entails showing up for class prepared and doing your best to work through assigned tasks and programming example problems. Because all classes build on previous classes, contact me if you need to miss a class. Some of the material we cover might be easy and quick to figure out. Other material and tasks will present roadblocks that are more difficult. We are building a positive community in this class, so your attitude and helpfulness will be evaluated.

Independent project (40%) Everyone will be responsible for an independent project (this can be done either individually or as a group of no more than 3 people). The goal of your semester project is to incorporate the tools learned in this class into a project of your own design. Ideally this will be something related to your research that helps you move your PhD forward, but you could decide to work on a new project. A requirement of the project is to incorporate at least 2 tools learned in the class to resolve a biological question or computational problem. By week 6 you will turn in a one to two page write-up of the project and how you will solve it. On the last day of class you will turn in a one to three page write-up of the project, put the documented code on GitHub (or submit it to me), and present your project in a 10-15 min presentation.

White paper

  • 1-2 page White Paper: The 1-2 page write-up should follow the format of a white paper. There should be an introduction to the biological or other type of problem you are trying to solve (with references). The next section will be the methods. Here, describe what you are going to do. For example, "I will write a Python script to convert the data from PHYLIP format to FASTA format". There should be two techniques from the class used (e.g. Python, shell scripts, GitHub, relational databases, data cleaning).
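To make the example methods sentence above concrete, here is a minimal sketch of what such a PHYLIP-to-FASTA script might look like. It is an illustration only, not a required solution: the function name `phylip_to_fasta` and the sample taxa are hypothetical, and the sketch assumes a relaxed sequential PHYLIP file in which each taxon sits on a single line with its name separated from the sequence by whitespace. Interleaved or strict 10-character-name files would need different parsing.

```python
def phylip_to_fasta(phylip_text: str, width: int = 60) -> str:
    """Convert relaxed sequential PHYLIP text to FASTA text.

    Assumes one taxon per line: a name, whitespace, then the sequence.
    """
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in phylip_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty PHYLIP input")

    # The first line holds the number of taxa and the alignment length.
    header = lines[0].split()
    if len(header) != 2:
        raise ValueError("PHYLIP header must be '<n_taxa> <n_sites>'")
    n_taxa, n_sites = int(header[0]), int(header[1])

    records = []
    for line in lines[1:]:
        parts = line.split()
        if len(parts) < 2:
            raise ValueError(f"cannot split name and sequence: {line!r}")
        name = parts[0]
        # Sequences are sometimes written in space-separated blocks.
        seq = "".join(parts[1:])
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(seq)}")
        records.append((name, seq))

    if len(records) != n_taxa:
        raise ValueError(f"expected {n_taxa} taxa, found {len(records)}")

    # Write each record as a '>' header followed by wrapped sequence lines.
    out = []
    for name, seq in records:
        out.append(f">{name}")
        for start in range(0, len(seq), width):
            out.append(seq[start:start + width])
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical three-taxon alignment used only for demonstration.
    example = "3 8\ntaxon_A ACGTACGT\ntaxon_B ACGTTCGT\ntaxon_C ACG-ACGT\n"
    print(phylip_to_fasta(example), end="")
```

For real project data, an established parser such as Biopython's `Bio.SeqIO.convert` is likely a safer choice than a hand-written converter; the sketch is only meant to show the level of detail a methods section could describe.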

Project Summary + Presentation

  • 1-2 page Project Paper: The 1-2 page final paper should follow the same format as the white paper, with added results and discussion sections. Again, explain in detail which two tools from the class you used and how that turned out. In the discussion, talk about how this helped your project and what you would do next (or what you learned).

  • 10 - 15 min presentation: On the last day of class each of you will present your project to the class. No more than 15 min each. Feel free to show GitHub repos and/or run code in class.

SCHEDULE

*This is the tentative outline of the schedule; the events may change according to the speed and needs of the students in the course. The course is set up in 5 parts:

Part I - Unix - Version Control, Git, GitHub

Part II - Python - Pandas, Notebooks

Part III - Data Visualization

Part IV - Data Cleaning, Relational Databases

Part V - Working with Clusters

Week | Month | Date | Class | Due
Week 1 | Jan | 26 | Course intro |
Week 1 | Jan | 28 | Unix Refresh (shell) |
Week 2 | Feb | 2 | Part I. Version control with Git, Creating a Repository, Tracking Changes |
Week 2 | Feb | 4 | Tracking Changes, work on Homework 1 |
Week 3 | Feb | 9 | Exploring History, Gitignore | Homework_1.Linux_Refresh
Week 3 | Feb | 11 | Remotes in GitHub, Practice |
Week 4 | Feb | 16 | Collaborating, Conflicts |
Week 4 | Feb | Th 18 | Git Conflicts, Git Wrap-up - Homework 2 | Homework_2.Github
Week 5 | Feb | Tu 23 | Part II. Intro to Programming - Jupyter Notebooks, HW 3 |
Week 5 | Feb | Th | READING DAY |
Week 6 | Mar | Tu 2 | Homework 3 |
Week 6 | Mar | Th 4 | Pandas -- Faske (python + pandas) | Homework_3.Python_Refresh
Week 7 | Mar | Tu 9 | READING DAY |
Week 7 | Mar | Th 11 | Pandas -- Faske (python + pandas) |
Week 8 | Mar | 16 | Project Write Up/Homework Work day | Homework_4.Pandas; *1-2 Page Project Writeup
Week 8 | Mar | 18 | Part III. Data Visualization ggplot2 -- Faske |
Week 9 | Mar | 23 | Data Visualization ggplot2 -- Faske (R) |
Week 9 | Mar | 25 | Data Visualization -- Jahner | Homework_5.DV ggplot
Week 10 | Mar | 30 | Data Visualization -- Jahner |
Week 10 | Apr | 1 | Part IV. Data Science + Open Refine | Homework_6.DV
Week 11 | Apr | 6 | Data Science + Open Refine (open refine) |
Week 11 | Apr | 8 | Relational Databases | Homework_7.OR
Week 12 | Apr | 13 | Sqlite [Pronghorn Open] |
Week 12 | Apr | 15 | Part V. Clusters - Sebastian Smith |
Week 13 | Apr | 20 | Clusters - Sebastian Smith |
Week 13 | Apr | 22 | Clusters - Sebastian Smith |
Week 14 | Apr | 27 | Project Prep | Homework_8.clusters
Week 14 | Apr | 29 | Project Prep |
Week 15 | May | 4 | Project presentations | *presentations due
Week 16 | May | 10 | | Whitepapers and any remaining homework

** all homework is due by Tuesday May 11th

Statement on Academic Dishonesty:

"Cheating, plagiarism or otherwise obtaining grades under false pretenses constitute academic dishonesty according to the code of this university. Academic dishonesty will not be tolerated and penalties can include canceling a student's enrollment without a grade, giving an F for the course or for the assignment. For more details, see the University of Nevada, Reno General Catalog."

Statement of Disability Services:

For Traditional and Seated Classrooms:

“Any student with a disability needing academic adjustments or accommodations is requested to speak with me or the Disability Resource Center (Pennington Achievement Center Suite 230) as soon as possible to arrange for appropriate accommodations.”

For Online Courses:

“If you are a student who would normally seek accommodations in a traditional classroom, please contact me as soon as possible. You may also contact the Disability Resource Center for services for online courses by emailing drc@unr.edu or calling 775-784-6000. Academic accommodations for online courses may be different than those for seated classrooms; it is important that you contact us as soon as possible to discuss services. The University of Nevada, Reno supports equal access for students with disabilities. For more information, visit the Disability Resource Center.”

This course may leverage 3rd party web/multimedia content. If you experience any issues accessing this content, please notify your instructor.

Statement on Audio and Video Recording:

"Surreptitious or covert video-taping of class or unauthorized audio recording of class is prohibited by law and by Board of Regents policy. This class may be videotaped or audio recorded only with the written permission of the instructor. In order to accommodate students with disabilities, some students may be given permission to record class lectures and discussions. Therefore, students should understand that their comments during class may be recorded."

Statement on Maintaining a Safe Learning and Work Environment

The University of Nevada, Reno is committed to providing a safe learning and work environment for all. If you believe you have experienced discrimination, sexual harassment, sexual assault, domestic/dating violence, or stalking, whether on or off campus, or need information related to immigration concerns, please contact the University's Equal Opportunity & Title IX office at 775-784-1547. Resources and interim measures are available to assist you. For more information, please visit the Equal Opportunity and Title IX page.

Statement on Academic Success Services

Your student fees cover usage of the Math Center (775) 784-4433, Tutoring Center (775) 784-6801, and University Writing Center (775) 784-6030. These centers support your classroom learning; it is your responsibility to take advantage of their services. Keep in mind that seeking help outside of class is the sign of a responsible and successful student.

UNIVERSITY POLICIES:

Statement on COVID-19 policies

Training

Students must complete and follow all guidelines as stated in the Student COVID-19 Training modules, or any other trainings or directives provided by the University.

Face Coverings

In response to COVID-19, and in alignment with State of Nevada Governor Executive Orders, Roadmap to Recovery for Nevada plans, Nevada System of Higher Education directives, the University of Nevada President directives, and local, state, and national health official guidelines, face coverings are required at all times while on campus, except when alone in a private office. This includes the classroom, laboratory, studio, creative space, or any type of in-person instructional activity, and public spaces. A “face covering” is defined as a “covering that fully covers a person’s nose and mouth, including without limitation, cloth face mask, surgical mask, towels, scarves, and bandanas” (State of Nevada Emergency Directive 024). Students that cannot wear a face covering due to a medical condition or disability, or who are unable to remove a mask without assistance, may seek an accommodation through the Disability Resource Center.

Social Distancing

Face coverings are not a substitute for social distancing. Students shall observe current social distancing guidelines where possible in accordance with the Phase we are in while in the classroom, laboratory, studio, creative space (hereafter referred to as instructional space) setting and in public spaces. Students should avoid congregating around instructional space entrances before or after class sessions. If the instructional space has designated entrance and exit doors students are required to use them. Students should exit the instructional space immediately after the end of instruction to help ensure social distancing and allow for the persons attending the next scheduled class session to enter.

Disinfecting Your Learning Space

Disinfecting supplies are provided for you to disinfect your learning space. You may also use your own disinfecting supplies.

COVID-19, COVID-19 Like Symptoms, and Contact with Someone Testing Positive for COVID-19

Students must conduct daily health checks in accordance with CDC guidelines. Students testing positive for COVID 19, exhibiting COVID 19 symptoms or who have been in direct contact with someone testing positive for COVID 19 will not be allowed to attend in-person instructional activities and must leave the venue immediately. Students should contact the Student Health Center or their health care provider to receive care and who can provide the latest direction on quarantine and self-isolation. Contact your instructor immediately to make instructional and learning arrangements.

Laboratory, Studio, and Creative Space Settings

You will be provided specific instructions and procedures by your instructor for art studios, recording studios, digital media labs, testing centers, observation labs, podcasting studios, dance studios, clinical centers, research labs, physical science labs, etc. as necessary.

Failure to Comply with Policy (including as outlined in this Syllabus) or Directives of a University Employee

In accordance with section 6,502 of the University Administrative Manual, a student may receive academic and disciplinary sanctions for failure to comply with policy, including this syllabus, for failure to comply with the directions of a University Official, for disruptive behavior in the classroom, or any other prohibited action. “Disruptive behavior” is defined in part as behavior, including but not limited to failure to follow course, laboratory or safety rules, or endangering the health of others. A student may be dropped from class at any time for misconduct or disruptive behavior in the classroom upon recommendation of the instructor and with approval of the college dean. A student may also receive disciplinary sanctions through the Office of Student Conduct for misconduct or disruptive behavior, including endangering the health of others, in the classroom. The student shall not receive a refund for course fees or tuition.