/algorithms-and-data-structures

Video descriptions and minimalist Python implementations of algorithms and data structures.

Primary LanguagePythonOtherNOASSERTION

Introduction to data structures and algorithms

This repo aims to provide minimalist implementations of data structures and algorithms in Python to accompany a lecture course on this topic. The bare necessities. Each data structure is accompanied by a video lecture (and pdf slides).

Data structures, algorithms and concepts

Binary Search Trees

Red-Black Trees

B-trees

Hash Tables

Heapsort and Binary Heaps

Quicksort

Lower Bounds for Comparison Sorts

Counting Sort

Radix Sort

Bucket Sort

Task-Parallel Computing

NP-complete Problems

Foundation models for code

Until recently, code for implementing data structures was largely written by hand. As of 2021, there have been exploratory efforts to employ neural networks for the task of generating code from natural language descriptions (this approach underpins the GitHub Copilot tool, for example). If you'd like to learn more, the video below describes Codex, a popular foundation model for code generation.

Further background

Big O notation (and $\Theta$, $\Omega$, $o$, $\omega$ too)

Row-major order vs column-major order

Why numbering should start at zero (according to Dijkstra)

Further resources

Books

The books below represent (in my opinion) high-quality learning/reference materials.

  • 📙 Introduction to Algorithms (a.k.a "CLRS") by Thomas H. Cormen et al. - this was the primary reference used when developing the materials above. If you have a local library, it's worth checking in case they have a copy.
  • 📗 algorithms.wtf by Jeff Erickson (open-source)
  • 📗 + 🔨 Elementary Algorithms by Liu Xinyu (open-source)
  • 📙 The Art of Computer Programming by Donald E. Knuth. This is a series of books rather than a single book. The content is engaging, humorous and extraordinarily comprehensive. It is perhaps most useful as a reference for deep dives on topics, rather than an introductory learning resource (simply because the level of detail is so high).

Visualisations

Red-black trees visualisation

red-black tree growth gif

B-trees visualisation

btree visualisation

Requirements for install

To visualise the b-trees as done above, you'll need to have a copy of pygraphviz:

conda install -c anaconda graphviz
pip install pygraphviz

If you'd like to run the parallel code without the GIL, you need Sam Gross' nogil version of Python. This is not strictly necessary to run the code, but without it the parallel code with be slower than the serial code.

# first, install pyenv. Then:
pyenv install nogil-3.9.10
# activate the nogil
pyenv local nogil-3.9.10

# run your code here

# (for afterwards) deactivate the nogil (if you want to go back to the system python)
pyenv local system

Additional comments

The repo is a work in progress. It aims to provide a reference for learning, not code for production (it has not been extensively tested to handle all edge cases).

Pull requests are welcome.