/CompThinkBioinf

Computational Thinking in Bioinformatics Course - Aarhus University

Primary LanguageJupyter Notebook

Computational Thinking in Bioinformatics

Lecture notes and material for the class Computational Thinking in Bioinformatics at the Bioinformatics Research Centre, Aarhus University.

Week 1: Computational thinking

Lecture:

  • Problem domain / models / goals and problems / solutions / implementations

Exercises:

  • Introduction to Python programming and Jupyter notebooks.
  • Branching and looping (JVG 2 + 3)

Besides the lectures and exercises, we could recommend a basic online course in Python programming that could be followed and maybe have some of the exercises in relation to this.

Week 2: Algorithm analysis

Lectures:

  • Correctness and complexity
  • Pre- and post-conditions and invariants -- exemplified by e.g. different sorting algorithms
  • O-notation and reasoning about complexity

Exercises:

  • Functions (JVG 4)
  • Basic data structures (JVG 5)

Week 3: Implementing algorithms

Lectures:

  • Translating ideas into code by breaking them down to the right level of abstraction
  • Running time in practise

Exercises:

  • Testing and debugging (JVG 6)
  • Plotting (JVG 11.1)
  • Complexity (JVG 9)

Week 4: Algorithmic design

Lectures:

  • Pre- and post-conditions and invariants revisited
  • Recursion
  • Divide and conquer
  • Dynamic programming

Exercises:

  • Dynamic programming examples (JVG 13)

Week 5: Data structures and trees

Lectures:

  • Representing trees as lists
  • Traversing trees, e.g. DFS for outputting a tree

Exercises:

  • Implementing a tree representation
  • Potentially a Newick parser
  • DFS traversal and outputting a tree

Week 6: Data structures and graphs

Lectures:

  • Representing graphs as incidence lists or matrices

Exercises:

  • Classical graph algorithms (JVG 12.2)

Week 7: Complexity of problems

Lectures:

  • Problem complexity versus solution (algorithm) complexity
  • NP-hardness

Exercises:

???

Week 8: Guided project; building a bioinformatics tool;

Lectures:

  • gene finding,
  • sequence data,
  • FASTA format

Exercises:

  • Parsing a FASTA file

Week 9: Guided project; methodology and algorithms

Lectures:

  • HMMs, automata, languages

Exercises:

  • Representing an HMM
  • Computing the likelihood of a HMM

Week 10: Guided project; building a bioinformatics tool; implementation

Lectures:

  • machine architecture
  • representation of numbers

Exercises:

  • Log-space and scaling likelihood computations

Week 11: Guided project; building a bioinformatics tool; training (optimization)

Lectures:

  • Training by counting (maximum likelihood estimation for an HMM)

Exercises:

Training by counting

Week 12: Guided project; experimental validation

Lectures:

  • How to statistically validate correctness
  • How to (statistically) validate running time

Exercises:

  • Validation of implementation

Week 13: Other programming languages and systems

Lectures:

  • Direct translation of simple python code to C

Exercises:

  • Re-implement likelihood computations i C
  • Compare running time

Week 14: Other programming languages and systems II

Lectures:

  • HMMs as matrix-vector operations

Exercises:

  • Implement in numpy as matrix vector operations
  • Compare running time