BIFX 552, Fall 2017

Welcome to BIFX 552

Syllabus

Course Description

This class provides an introduction to the applied data science skills needed by bioinformatics professionals. The course focuses on reproducible bioinformatics research and covers the following topics and tools: beginning to intermediate use of the Unix command line, working with remote computing resources, version tracking, R and Bioconductor, tools for manipulating sequence data, and creation of pipelines.

  • Instructor: Randall Johnson, PhD
  • Office Hours: In-person office hours will be held Thursdays immediately after class, and online office hours will be held Monday evenings from 8:30 to 9:30 PM. During online office hours, the Blackboard discussion thread titled "Office Hours" will be actively monitored.
  • Prerequisites: BIFX 503
  • Textbook: Bioinformatics Data Skills: Reproducible and Robust Research with Open Source Tools, 1st Edition, by Vince Buffalo, O'Reilly Media (2015)
  • Communications: All course communications will be posted on Blackboard. In order to receive timely notifications, it is recommended that you do one or more of the following:
    • Check Blackboard often
    • Set your Blackboard email notifications to alert you when something is posted
    • Download the Blackboard phone app and enable push notifications (this may not be the best option this term, as the app was just released and seems somewhat limited)

Learning Objectives

On completion of this course, students should be comfortable with the following:

  • Use of the Unix command line to manipulate data and perform bioinformatic analysis tasks
  • Logging into and using remote computing resources
  • Working with version-controlled code repositories in a collaborative work environment
  • Use of R and Bioconductor to perform bioinformatic analysis tasks
  • Stitching a series of commands and/or programs together into a reusable pipeline

Homework

In addition to completing weekly reading assignments, students must view Blackboard modules containing instructional vignettes. Each module is followed by a short quiz to gauge class understanding prior to class. Students will be given a score for each quiz, but only participation will be tracked for the purpose of grading (i.e., if you complete both the module and the quiz, you will receive full points).

Grading

Grades will be based on completion of homework, in-class participation, and two exams.

  • Homework - 30%
  • In-class participation - 30%
  • Mid-term - 20%
  • Final exam - 20%

Weather

In the event of severe weather resulting in the closure of Hood College and the cancellation of a regularly scheduled class, the material from the missed class will be posted on Blackboard, and at least two live chat sessions will be held to work through the material and answer questions.

Tentative Schedule

Reading assignments are from Buffalo's Bioinformatics Data Skills unless otherwise specified, and they should be read prior to class. More details on reading assignments will be given on Blackboard.

| Week | Date | Topics | Reading |
|------|------|--------|---------|
| 1 | Aug 24 | Class intro, Unix command line | |
| 2 | Aug 31 | Intro to R | Ch 8 selections |
| 3 | Sep 7 | R Scripting, flow control | Ch 8 selections |
| 4 | Sep 14 | Advanced R topics | |
| 5 | Sep 21 | Project organization, Git | Ch 2, Ch 5 selections |
| 6 | Sep 28 | Markdown, Advanced Git | Ch 5 selections |
| 7 | Oct 6 | Advanced Unix tricks | Ch 7 |
| 8 | Oct 12 | Mid Term Exam | |
| 9 | Oct 19 | Bioinformatics data | Ch 6 |
| 10 | Oct 26 | Genomic Ranges | Ch 9 |
| 11 | Nov 2 | FASTA and FASTQ data | Ch 10 |
| 12 | Nov 9 | Sequence alignment | Ch 11 |
| 13 | Nov 16 | Shell scripting | Ch 12 |
| | Nov 23 | Thanksgiving! | |
| 14 | Nov 30 | Pipelining with Snakemake | |
| 15 | Dec 7 | Containers, Review | |
| 16 | Dec 14 | Final Exam | |