This repository contains some class data mining projects that I finished together with my girlfriend pengdandan, a graduate bioinformatics student at ETHZ. Because her undergraduate major was Plant Protection, she does not have much experience writing robust code. Neither do I, as an economics student with little background in programming. To help her finish her homework faster and to help me grasp some basic algorithms, she completed these assignments in partnership with me. Our working model: we mostly tried to finish each assignment separately, then peer-reviewed each other's work and debugged and optimized it together if things went well. If we encountered problems, we discussed them in more depth. Below is a list of her homework projects in which I actively participated:

  • Distance functions on vectors
  • Needleman-Wunsch and Smith-Waterman Algorithms
  • Compute Dynamic Time Warping distances between time-series
  • Use Floyd-Warshall’s algorithm to compute shortest path lengths
  • Use the Shortest Path kernel to compute distance between graphs
  • A classifier class using the k-NN algorithm
  • Logistic Regression using sklearn.linear_model
  • Decision trees
  • Principal Component Analysis
  • Self-organizing maps