/ctib

Lecture notes and material for Computational Thinking in Bioinformatics

Primary LanguageJupyter Notebook

Computational Thinking in Bioinformatics

Lecture notes and material for the class Computational Thinking in Bioinformatics at the Bioinformatics Research Centre, Aarhus University.

Week 1: Computational thinking

Lecture:

  • Problem domain / models / goals and problems / solutions / implementations

Exercises:

  • Introduction to Python programming and Jupyter notebooks.
  • Branching and looping (JVG 2 + 3)

Besides the lectures and exercises, we could recommend a basic online course in Python programming that could be followed and maybe have some of the exercises in relation to this.

Week 2: Algorithm analysis

Lectures:

  • Correctness and complexity
  • Pre- and post-conditions and invariants -- exemplified by e.g. different sorting algorithms
  • O-notation and reasoning about complexity

Exercises:

  • Functions (JVG 4)
  • Basic data structures (JVG 5)

Week 3: Implementing algorithms

Lectures:

  • Translating ideas into code by breaking them down to the right level of abstraction
  • Running time in practise

Exercises:

  • Testing and debugging (JVG 6)
  • Plotting (JVG 11.1)
  • Complexity (JVG 9)

Week 4: Algorithmic design

Lectures:

  • Pre- and post-conditions and invariants revisited
  • Recursion
  • Divide and conquer
  • Dynamic programming

Exercises:

  • Dynamic programming examples (JVG 13)

Week 5: Data structures and trees

Lectures:

  • Representing trees as lists
  • Traversing trees, e.g. DFS for outputting a tree

Exercises:

  • Implementing a tree representation
  • Potentially a Newick parser
  • DFS traversal and outputting a tree

Week 6: Data structures and graphs

Lectures:

  • Representing graphs as incidence lists or matrices

Exercises:

  • Classical graph algorithms (JVG 12.2)

Week 7: Complexity of problems

Lectures:

  • Problem complexity versus solution (algorithm) complexity
  • NP-hardness

Exercises:

???

Week 8: Guided project; building a bioinformatics tool;

Lectures:

  • gene finding,
  • sequence data,
  • FASTA format

Exercises:

  • Parsing a FASTA file

Week 9: Guided project; methodology and algorithms

Lectures:

  • HMMs, automata, languages

Exercises:

  • Representing an HMM
  • Computing the likelihood of a HMM

Week 10: Guided project; building a bioinformatics tool; implementation

Lectures:

  • machine architecture
  • representation of numbers

Exercises:

  • Log-space and scaling likelihood computations

Week 11: Guided project; building a bioinformatics tool; training (optimization)

Lectures:

  • Training by counting (maximum likelihood estimation for an HMM)

Exercises:

Training by counting

Week 12: Guided project; experimental validation

Lectures:

  • How to statistically validate correctness
  • How to (statistically) validate running time

Exercises:

  • Validation of implementation

Week 13: Other programming languages and systems

Lectures:

  • Direct translation of simple python code to C

Exercises:

  • Re-implement likelihood computations i C
  • Compare running time

Week 14: Other programming languages and systems II

Lectures:

  • HMMs as matrix-vector operations

Exercises:

  • Implement in numpy as matrix vector operations
  • Compare running time