Data Science Resources

This repository contains resources and cheat sheets that should be helpful for anyone learning or practicing data science. The vast majority of the resources are geared towards Python users, but there is also a page of R resources. Translations of this page are listed at the bottom. Please feel free to fork the repository and open a pull request to add your translation!

  • Cheat Sheets - contains a lot of useful cheat sheets for Python, data analysis, machine learning, Git and more.

Programming Languages

  • Python - contains different Python guides, tips and tricks.
  • R - R resources, tutorials and guides, ranging from R for beginners through Data Analysis and Visualization to Machine Learning and Deep Learning with R.
  • SQL - SQL tutorials, cheat sheets, exercises, videos and courses.

Data Analysis and Visualization

  • Data Analysis - Exploratory Data Analysis guides, mostly with Pandas and NumPy.
  • Data Visualization - contains various data visualization guides: Pandas plotting, Matplotlib, Seaborn, Bokeh.

Machine Learning

  • Machine Learning - contains different machine learning guides: supervised learning (regression, classification, tree-based models, etc.), unsupervised learning (clustering), feature selection, model evaluation and more.
  • Deep Learning - Deep Learning guides, including general guides and tutorials and resources organized by the type of network (CNN, RNN, etc.) and the library (TensorFlow, Keras, PyTorch, etc.).
  • Natural Language Processing - contains Natural Language Processing resources: NLTK, SKLearn NLP and more.

Statistics and Mathematics

  • Statistics - contains mostly theoretical reading to deepen your understanding of statistics.
  • Mathematics - contains resources for math topics that are relevant for data scientists.

Getting Data

  • Datasets - links to interesting datasets.

Development Environment

Translations