
📊 Path to a free self-taught education in Data Science!


Open Source Society University



About

This is a solid path for those of you who want to complete a Data Science course on your own time, for free, with courses from the best universities in the world.

In our curriculum, we give preference to MOOC (Massive Open Online Course) style courses because these courses were created with our style of learning in mind.

Becoming an OSS student

To officially register for this course, you must create a profile in our web app.

P.S.: Currently, the web app only tracks progress on the Computer Science path, but we are working to extend this functionality to all of our courses. Thank you for your understanding.

"How can I do this?"

Just create an account on GitHub and use it to log in to our web app.

The intention of this app is to offer our students a way to track their progress, and also the ability to show that progress on a public page for friends, family, employers, etc.

In the "My Progress" tab, you can edit the status of the courses that you are taking, and also add a link to your final project for each one.

Motivation & Preparation

Here are two interesting links that can make all the difference in your journey.

The first one is a motivational video about a guy who went through the "MIT Challenge", which consists of learning the entire 4-year MIT curriculum for Computer Science in 1 year.

The second link is a MOOC that will teach you learning techniques used by experts in art, music, literature, math, science, sports, and many other disciplines. These abilities are fundamental to succeeding in this journey.

Are you ready to get started?

Curriculum


Linear Algebra

| Courses | Duration | Effort |
| --- | --- | --- |
| Linear Algebra - Foundations to Frontiers | 15 weeks | 8 hours/week |
| Applications of Linear Algebra Part 1 | 5 weeks | 4 hours/week |
| Applications of Linear Algebra Part 2 | 4 weeks | 5 hours/week |

Single Variable Calculus

| Courses | Duration | Effort |
| --- | --- | --- |
| Calculus 1A: Differentiation | 13 weeks | 6-10 hours/week |
| Calculus 1B: Integration | 13 weeks | 5-10 hours/week |
| Calculus 1C: Coordinate Systems & Infinite Series | 13 weeks | 6-10 hours/week |

Multivariable Calculus

| Courses | Duration | Effort |
| --- | --- | --- |
| MIT OCW Multivariable Calculus | 15 weeks | 8 hours/week |

Python

| Courses | Duration | Effort |
| --- | --- | --- |
| Introduction to Computer Science and Programming Using Python | 9 weeks | 15 hours/week |
| Introduction to Computational Thinking and Data Science | 10 weeks | 15 hours/week |
| Introduction to Python for Data Science | 6 weeks | 2-4 hours/week |
| Programming with Python for Data Science | 6 weeks | 3-4 hours/week |

Probability and Statistics

| Courses | Duration | Effort |
| --- | --- | --- |
| Introduction to Probability | 16 weeks | 12 hours/week |
| Statistical Reasoning | - weeks | - hours/week |
| Introduction to Statistics: Descriptive Statistics | 5 weeks | - hours/week |
| Introduction to Statistics: Probability | 5 weeks | - hours/week |
| Introduction to Statistics: Inference | 5 weeks | - hours/week |

Introduction to Data Science

| Courses | Duration | Effort |
| --- | --- | --- |
| Introduction to Data Science | 8 weeks | 10-12 hours/week |
| Data Science - CS109 from Harvard | 12 weeks | 5-6 hours/week |
| The Analytics Edge | 12 weeks | 10-15 hours/week |

Machine Learning

| Courses | Duration | Effort |
| --- | --- | --- |
| Learning From Data (Introductory Machine Learning) [Caltech] | 10 weeks | 10-20 hours/week |
| Statistical Learning | - weeks | 3 hours/week |
| Stanford's Machine Learning Course | - weeks | 8-12 hours/week |

Project

Complete Kaggle's Getting Started and Playground Competitions

Convex Optimization

| Courses | Duration | Effort |
| --- | --- | --- |
| Convex Optimization | 9 weeks | 10 hours/week |

Data Wrangling

| Courses | Duration | Effort |
| --- | --- | --- |
| Data Wrangling with MongoDB | 8 weeks | 10 hours/week |

Big Data

| Courses | Duration | Effort |
| --- | --- | --- |
| Intro to Hadoop and MapReduce | 4 weeks | 6 hours/week |
| Deploying a Hadoop Cluster | 3 weeks | 6 hours/week |

Database

| Courses | Duration | Effort |
| --- | --- | --- |
| Stanford's Database course | - weeks | 8-12 hours/week |

Natural Language Processing

| Courses | Duration | Effort |
| --- | --- | --- |
| Deep Learning for Natural Language Processing | - weeks | - hours/week |

Deep Learning

| Courses | Duration | Effort |
| --- | --- | --- |
| Deep Learning | 12 weeks | 8-12 hours/week |

Capstone Project

  • Participate in a Kaggle competition
  • List other ideas

Specializations

After finishing the courses above, start your specializations in the topics that interest you most. You can view a list of available specializations here.

keep learning

How to use this guide

Order of the classes

This guide was designed to be followed linearly. What does this mean? That you should complete one course at a time.

The courses are already in the order in which you should complete them. Just start in the Linear Algebra section and, after finishing the first course, start the next one.

If a course isn't currently open, take it anyway using the resources from a previous session.

Should I take all courses?

Yes! The intention is to complete all the courses listed here!

Duration of the project

It may take longer to complete all of the classes compared to a regular Data Science course, but I can guarantee you that your reward will be proportional to your motivation/dedication!

Focus on your habits and forget about goals. Try to invest 1-2 hours every day studying this curriculum. If you do, you will inevitably finish it.

See more about "Commit to a process, not a goal" here.
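As a rough illustration of the daily-habit advice above, the sketch below estimates how long one section would take at 1 or 2 hours per day. It uses the Linear Algebra courses from the curriculum and takes their listed durations and efforts at face value, which is an assumption, since real effort varies from student to student. The names `LINEAR_ALGEBRA`, `total_hours`, and `days_needed` are made up for this example.

```python
import math

# (course, weeks, hours per week), copied from the Linear Algebra section
# of the curriculum. Listed effort is assumed to be accurate.
LINEAR_ALGEBRA = [
    ("Linear Algebra - Foundations to Frontiers", 15, 8),
    ("Applications of Linear Algebra Part 1", 5, 4),
    ("Applications of Linear Algebra Part 2", 4, 5),
]


def total_hours(courses):
    """Sum weeks * hours/week over all courses."""
    return sum(weeks * hours for _, weeks, hours in courses)


def days_needed(hours, hours_per_day):
    """Days of study needed at a fixed daily pace, rounded up."""
    return math.ceil(hours / hours_per_day)


hours = total_hours(LINEAR_ALGEBRA)
print(hours)                  # 160
print(days_needed(hours, 2))  # 80
print(days_needed(hours, 1))  # 160
```

Under these assumptions, the Linear Algebra section alone takes roughly 80 to 160 days of steady daily study, which is why the habit matters more than the finish date.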

Project Based

Here at OSS University, you do not need to take exams, because we are focused on real projects!

To show everyone that you successfully finished a course, you should create a real project.

"What does that mean?"

After finishing a course, you should think about a real-world problem that you can solve using the knowledge acquired in the course. You don't need to create a big project, but you must create something to validate and consolidate your knowledge, and also to show the world that you are capable of creating something useful with the concepts that you learned.

The projects of all students will be listed in this file. Submit your project's information in that file once you complete it.

You can create this project alone or with other students!

Project Suggestions

And you should also...

Be creative!

This is a crucial part of your journey through all those courses.

Keep in mind that what you are able to create with the concepts that you learned will be your certificate, and this is what really matters!

In order to show that you really learned those things, you need to be creative!

Here are some tips about how you can do that:

  • Articles: create blog posts to synthesize/summarize what you learned.
  • GitHub repository: keep your course files organized in a GitHub repository so that other students can study from your notes.

Cooperative work

We love cooperative work! Use our channels to communicate with fellow students to combine ideas and create new projects!

Which programming languages should I use?

Python and R are heavily used in the Data Science community and our courses teach you both, but...

The important thing for each course is to internalize the core concepts and to be able to use them with whatever tool (programming language) you wish.

Content Policy

You must share only files that you are allowed to share! Do NOT violate the code of conduct that you signed at the beginning of some courses.

Be creative in order to show your progress! 😄

Stay tuned

Watch this repository for future improvements and general information.

Prerequisite

The only things that you need to know are how to use Git and GitHub. Here are some resources to learn about them:

Note: Just pick one of the courses below to learn the basics. You will learn a lot more once you get started!

Change Log

How to collaborate

You can open an issue and give us your suggestions as to how we can improve this guide, or what we can do to improve the learning experience.

You can also fork this project and send a pull request to fix any mistakes that you have found.

TODO: If you want to suggest a new resource, send a pull request adding that resource to the extras section.

The extras section is a place where all of us will be able to submit interesting additional articles, books, courses and specializations, keeping our curriculum as immutable and concise as possible.

Let's do it together! =)

Community

Subscribe to /r/opensourcesociety!

Join us in our group!

You can also interact through GitHub issues.

We also have a chat room! Gitter

Add Open Source Society University to your Facebook profile!

P.S.: A forum is an ideal way to interact with other students because important discussions are not lost, as they usually are in chat apps. Please use our subreddit/group for important discussions.

Next Goals

Team

References