ossu/data-science

RFC: Add python alternative for algorithms course

waciumawanjohi opened this issue · 11 comments

Problem:
Our curricular guidelines do not require learning multiple languages, but our curriculum asks students to learn a language just to take the algorithms classes.

Duration:
2021 Aug 15

Background:
Our curricular guidelines make only a few references to programming languages. Students are expected to know SQL, the "language" of math, and "a suitable high-level language". (emphasis mine) The introductory courses, and most courses, use Python as this high-level language.
But students are directed to Robert Sedgewick's Algorithms course, which is taught in Java.

Students have asked for a Python alternative in issues and in the Discord (example).

A possible option is the free interactive textbook Problem Solving with Algorithms and Data Structures using Python. The book links to a set of supporting lectures. It also has some exercises in the text, which are paired with YouTube video solutions. Each chapter ends with a set of exercises; there does not seem to be an official solution set, but some student solutions can be found on GitHub.

This free textbook is used by dozens of college courses. It is well rated by goodreads and by pythonbooks (which is really a measure of popularity on Amazon).

It is not clear that the book is of the same quality as the Sedgewick course. For one, the Sedgewick course provides an autograder. For another, user ratings of the Sedgewick Algorithms book are notably higher.

Proposal:
Offer Problem Solving with Algorithms and Data Structures using Python as an alternative Algorithms course for students who want to study in Python.

Alternatives:

  1. Stick with the status quo.
  2. Replace the Sedgewick course rather than offer an alternative.

There appear to be a few implementations of Sedgewick's Algorithms in Python:

https://github.com/ChangeMyUsername/algorithms-sedgewick-python

https://github.com/shellfly/algs4-py (this one is also available as a wheel on PyPI)

There are at least two more that primarily use Python 2.7, which I will not include here.

Additionally, Sedgewick co-authored an Introduction to Programming in Python book, which includes a section on Algorithms and Data Structures (coverage of which is surprisingly deep for a CS1 course): https://introcs.cs.princeton.edu/python/40algorithms/

Therefore, I think one viable option is to continue recommending the book and course currently being suggested, and add this supplemental information to allow for a Python re-tread if students are interested.

one viable option is to continue recommending the book and course currently being suggested, and add this supplemental information to allow for a Python re-tread if students are interested.

I'm not sure that this addresses the problem. The goal is to allow students to learn the algorithms content without the need to learn an additional programming language. The approach of adding these supplemental materials would better address the problem of 'students state that they have trouble translating what they have learned in one language into the work they do in another language'.

How about https://www.coursera.org/learn/computational-thinking-problem-solving ? I haven't compared it to the current course myself but it does appear to teach algorithms and how to implement them. It expects Python as a language.

I appreciate this RFC as I'm looking for an algorithms course and don't want to have to learn Java right now since I'm currently learning C# and recently learned some Python. I'll likely try the alternative course and at a glance think it's a good option to offer.

My initial perusal of options found this track to be of interest because it appears to accept solutions in a variety of languages (including Python), though I don't know if it's comprehensive enough and course 4 has weak reviews: https://www.coursera.org/specializations/data-structures-algorithms#courses

MIT 6.006 was released for Spring 2020 and teaches algorithms in Python 3. This seems like it could be an okay fix, except that one of its two prerequisite courses, 6.042J, would need to be added. Adding 6.042J seems counter to the curricular guidelines, which say "an efficient data science major should present these mathematical concepts in two courses, in the context of modeling for data-driven problems" (section 2.5). This would add another math course to the 3-4 we already have listed. I'm not sure which has more merit: learning Java to take an algorithms course or learning math to take an algorithms course.

Also, while the curricular guidelines do not require students to learn multiple languages, the appendix to the curricular guidelines (the envisioned course structure based on the guidelines) actually states "Learn a second programming language (e.g., Python, C++, Java)" as a learning outcome of the Algorithms and Software Concepts course. They list Python here as a second programming language because R is recommended for the Intro to DS courses.

The Curriculum Guidelines describe "algorithm design" and "programming concepts and data structures":

  • Algorithm design: Students must develop the skill set to understand the problem, break it into manageable pieces, assess alternative problem solving strategies, and arrive at an algorithm that efficiently solves the problem.
  • Programming concepts and data structures: Students should have the knowledge to implement their algorithms using procedural and functional programming techniques and their associated data structures, including lists, vectors, data frames, dictionaries, trees, and graphs.
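To make those two bullets concrete, here is a minimal Python sketch of what "implement their algorithms using ... dictionaries ... and graphs" could look like. It is only an illustration: the function name and the example graph are hypothetical and are not taken from the guidelines or from any course mentioned in this thread. It stores a graph as a dictionary of adjacency lists and uses breadth-first search to count the edges on the shortest path from a start node to every reachable node.

```python
from collections import deque


def shortest_path_lengths(graph, start):
    """Breadth-first search over a graph stored as a dict of adjacency lists.

    Returns a dict mapping each reachable node to the number of edges
    on a shortest path from `start`.
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            # The first visit to a node is along a shortest path.
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical example graph, used only for illustration.
example_graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}

print(shortest_path_lengths(example_graph, "a"))
# → {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

The same structure carries over to Java or C++ with little change, which is part of why the choice of language matters less for the algorithm itself than for the student's familiarity with the syntax.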

The Curriculum's Introduction to Computer Science notes: "Students who already know basic programming in any language can skip this first course". That section's first course is Python For Everybody. Since its Chapters 7-11 are about data structures, shouldn't it be discussed as a curricular option in this RFC?

P.S. I support this RFC, but the lack of official solutions for this RFC's proposed course would be problematic for learners.

I believe that the fundamental ideas behind algorithms are the same. So, either Java, C++, or possibly Python is good.

Came across this RFC by browsing.

If you are searching for a DSA course in Python, there is https://nptel.ac.in/courses/106106145 by IIT Madras. This is also the same course that IIT Madras offers in the online data science degree that I am currently attending: https://onlinedegree.iitm.ac.in/course_pages/BSCCS2002.html